Fix fully matched row groups with null counts (#21907)
## Which issue does this PR close?
- Related to #21637
## Rationale for this change
This is split out from review feedback on #21637. Row groups can only be
marked fully matched when all rows are guaranteed to pass the filter.
For nullable predicate columns, proving that `NOT(predicate)` is never
true is not enough: rows where the predicate evaluates to NULL make
`NOT(predicate)` NULL rather than true, yet they do not pass the filter.
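
To illustrate the gap, here is a minimal, self-contained sketch of SQL
three-valued logic. It is not DataFusion code: the predicate `x > 5` and
all function names are hypothetical, chosen only to show how a proof based
on `NOT(predicate)` can succeed while a NULL row still fails the filter.

```rust
/// SQL three-valued boolean: `None` stands for NULL.
type Tri = Option<bool>;

/// SQL NOT: NOT(NULL) is NULL.
fn sql_not(v: Tri) -> Tri {
    v.map(|b| !b)
}

/// Hypothetical predicate `x > 5` evaluated on a nullable value.
fn gt_five(x: Option<i64>) -> Tri {
    x.map(|v| v > 5)
}

/// A filter keeps a row only when the predicate is exactly TRUE.
fn passes_filter(v: Tri) -> bool {
    v == Some(true)
}

/// Naive proof: no row makes NOT(predicate) evaluate to TRUE.
fn naive_fully_matched(rows: &[Option<i64>]) -> bool {
    rows.iter().all(|&x| sql_not(gt_five(x)) != Some(true))
}

/// Ground truth: every row actually passes the filter.
fn actually_fully_matched(rows: &[Option<i64>]) -> bool {
    rows.iter().all(|&x| passes_filter(gt_five(x)))
}
```

For the rows `[Some(10), None, Some(7)]`, `naive_fully_matched` returns
`true` but `actually_fully_matched` returns `false`, which is the kind of
false positive this change guards against.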
## What changes are included in this PR?
This PR makes the fully matched row-group proof conservative for nulls
by adding `IS NULL` checks for nullable columns referenced by the
predicate before evaluating the inverted pruning predicate.
It also threads `with_missing_null_counts_as_zero` through
`RowGroupPruningStatistics` so normal row-group pruning keeps the
existing default behavior, while fully matched proofs treat missing null
counts as unknown. This reuses the existing statistics conversion path
instead of adding a separate null-count conversion pass.
## Are these changes tested?
Added a regression test covering row groups with known nulls, known zero
nulls, and missing null counts.
## Are there any user-facing changes?
No API changes. This only prevents false positives in the row-group
fully matched optimization.
---------
Co-authored-by: Dmitrii Blaginin <dmitrii@blaginin.me>