Commit 23b88fb

Allow filters on struct fields to be pushed down into Parquet scan (#20822)

## Which issue does this PR close?

- Related to #20603

## Rationale for this change

This PR enables Parquet row-level filter pushdown for struct field access expressions. Previously these fell back to a full scan followed by a separate filtering pass, a significant performance penalty for queries that filter on struct fields in large Parquet files (like Variant types!).

Filters on struct fields like `WHERE s['foo'] > 67` were not being pushed into the Parquet decoder. This is because `PushdownChecker` sees that the underlying `Column("s")` has a `Struct` type and unconditionally rejects it, without considering that `get_field` resolves to a primitive leaf.

With this change, deeply nested access like `s['outer']['inner']` will also get pushed down, because the logical simplifier flattens it before it reaches the physical plan.

Note: this does not address the projection side and should not be blocked by it. `SELECT s['foo']` still reads the entire struct rather than just the needed leaf column. That requires separate changes to how the opener builds its projection mask.
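
To illustrate the rationale, here is a minimal toy model of the check, not DataFusion's actual code. The `DataType` and `Expr` enums and both `can_push_down_*` functions are hypothetical stand-ins invented for this sketch; the real `PushdownChecker` operates on physical expressions and the Arrow schema. The sketch assumes a nested access has already been flattened into a single field path, as the commit message says the logical simplifier does.

```rust
// Toy model only: every type and function here is a hypothetical stand-in,
// not part of DataFusion's API.

#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Utf8,
    Struct(Vec<(String, DataType)>),
}

#[derive(Debug, Clone)]
enum Expr {
    /// A top-level column reference with its declared type.
    Column(String, DataType),
    /// Field access with an already-flattened path,
    /// e.g. s['outer']['inner'] becomes GetField(s, ["outer", "inner"]).
    GetField(Box<Expr>, Vec<String>),
}

impl DataType {
    fn is_primitive(&self) -> bool {
        !matches!(self, DataType::Struct(_))
    }

    /// Walk a field path through nested structs; None if the path is invalid.
    fn resolve(&self, path: &[String]) -> Option<&DataType> {
        let mut current = self;
        for name in path {
            match current {
                DataType::Struct(fields) => {
                    current = &fields.iter().find(|(n, _)| n == name)?.1;
                }
                _ => return None,
            }
        }
        Some(current)
    }
}

/// Old behavior (as described above): only the root column's type is
/// inspected, so any access rooted at a struct column is rejected.
fn can_push_down_old(expr: &Expr) -> bool {
    match expr {
        Expr::Column(_, ty) => ty.is_primitive(),
        Expr::GetField(base, _) => can_push_down_old(base),
    }
}

/// New behavior (sketch): a field access is accepted when its path
/// resolves to a primitive leaf.
fn can_push_down_new(expr: &Expr) -> bool {
    match expr {
        Expr::Column(_, ty) => ty.is_primitive(),
        Expr::GetField(base, path) => match base.as_ref() {
            Expr::Column(_, ty) => ty.resolve(path).map_or(false, DataType::is_primitive),
            _ => false,
        },
    }
}
```

In this model, a filter on `s['foo']` where `foo` is an `Int64` leaf is rejected by `can_push_down_old` and accepted by `can_push_down_new`, while an access that still yields a struct (such as `s['outer']` alone) is rejected by both.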
1 parent af79d14 commit 23b88fb

4 files changed

Lines changed: 336 additions & 20 deletions

File tree

Cargo.lock

Lines changed: 1 addition & 0 deletions

datafusion/datasource-parquet/Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -39,6 +39,7 @@ datafusion-common-runtime = { workspace = true }
 datafusion-datasource = { workspace = true }
 datafusion-execution = { workspace = true }
 datafusion-expr = { workspace = true }
+datafusion-functions = { workspace = true }
 datafusion-functions-aggregate-common = { workspace = true }
 datafusion-physical-expr = { workspace = true }
 datafusion-physical-expr-adapter = { workspace = true }