Skip to content

Commit

Permalink
perf: simplify boolean expression rules (#3731)
Browse files Browse the repository at this point in the history
This PR adds the following:
1. `simplify` now recursively simplifies the entire expression tree
2. Simplification rules `simplify_and` and `simplify_or` to simplify
boolean expressions
3. A simplify expressions step to the logical optimizer before scan task
materialization

I also did some refactoring work that did not change behavior:
1. Converted various functions to take in `Arc<T>` instead of `T`, which is
more ergonomic to use and avoids a potential clone when calling it with
an Arc variable using `Arc::unwrap_or_clone`
2. Split up simplify code into multiple files

Combined with #3728, this PR brings major performance improvements to
TPCH Q19 by pushing down and simplifying filter predicates.
  • Loading branch information
kevinzwang authored Jan 29, 2025
1 parent 4048816 commit 78beff4
Show file tree
Hide file tree
Showing 11 changed files with 806 additions and 563 deletions.
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

31 changes: 31 additions & 0 deletions src/common/treenode/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -736,6 +736,20 @@ pub trait TreeNodeIterator: Iterator {
self,
f: F,
) -> Result<Transformed<Vec<Self::Item>>>;

/// Apples `f` to each item in this iterator
///
/// # Returns
/// Error if `f` returns an error
///
/// Ok(Transformed) such that:
/// 1. `transformed` is true if any return from `f` had transformed true
/// 2. `data` from the last invocation of `f`
/// 3. `tnr` from the last invocation of `f` or `Continue` if the iterator is empty
fn map_and_collect<F: FnMut(Self::Item) -> Result<Transformed<Self::Item>>>(
self,
f: F,
) -> Result<Transformed<Vec<Self::Item>>>;
}

impl<I: Iterator> TreeNodeIterator for I {
Expand Down Expand Up @@ -771,6 +785,23 @@ impl<I: Iterator> TreeNodeIterator for I {
.collect::<Result<Vec<_>>>()
.map(|data| Transformed::new(data, transformed, tnr))
}

fn map_and_collect<F: FnMut(Self::Item) -> Result<Transformed<Self::Item>>>(
self,
mut f: F,
) -> Result<Transformed<Vec<Self::Item>>> {
let mut tnr = TreeNodeRecursion::Continue;
let mut transformed = false;
self.map(|item| {
f(item).map(|result| {
tnr = result.tnr;
transformed |= result.transformed;
result.data
})
})
.collect::<Result<Vec<_>>>()
.map(|data| Transformed::new(data, transformed, tnr))
}
}

/// Transformation helper to process a heterogeneous sequence of tree node containing
Expand Down
1 change: 1 addition & 0 deletions src/daft-algebra/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ common-error = {path = "../common/error", default-features = false}
common-treenode = {path = "../common/treenode", default-features = false}
daft-dsl = {path = "../daft-dsl", default-features = false}
daft-schema = {path = "../daft-schema", default-features = false}
indexmap = {workspace = true}

[dev-dependencies]
rstest = {workspace = true}
Expand Down
Loading

0 comments on commit 78beff4

Please sign in to comment.