Skip to content

Commit aad61ce

Browse files
committed
fix: only reverse/cumulate when reorder succeeds (prevents ClickBench regression)
Reverse and cumulative pruning from DynamicFilter now only trigger when reorder_optimizer is Some (the sort column was found in parquet stats). For GROUP BY + ORDER BY queries, the sort column is an aggregate output not in parquet — reorder bails out, so reverse and cumulative prune should also skip. Previously, reverse ran regardless, changing I/O patterns with no benefit (Q23 2x slower in ClickBench).
1 parent ca94342 commit aad61ce

1 file changed

Lines changed: 10 additions & 10 deletions

File tree

datafusion/datasource-parquet/src/opener.rs

Lines changed: 10 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -1221,17 +1221,17 @@ impl RowGroupsPrunedParquetOpen {
12211221
None
12221222
};
12231223

1224-
// Reverse for DESC queries. Triggered by sort pushdown OR by
1225-
// DynamicFilter indicating DESC. For non-sort-pushdown TopK, reverse
1226-
// ensures cumulative pruning keeps the highest-value RGs. TopK
1227-
// handles final row-level sorting regardless.
1224+
// Reverse for DESC queries. Only when reorder is active (the sort
1225+
// column exists in parquet stats). Without reorder, reversing RGs
1226+
// randomly changes I/O patterns with no benefit.
12281227
let is_descending = prepared.reverse_row_groups
1229-
|| prepared
1230-
.predicate
1231-
.as_ref()
1232-
.and_then(find_dynamic_filter)
1233-
.and_then(|df| df.sort_options().map(|opts| opts[0].descending))
1234-
.unwrap_or(false);
1228+
|| (reorder_optimizer.is_some()
1229+
&& prepared
1230+
.predicate
1231+
.as_ref()
1232+
.and_then(find_dynamic_filter)
1233+
.and_then(|df| df.sort_options().map(|opts| opts[0].descending))
1234+
.unwrap_or(false));
12351235
let reverse_optimizer: Option<
12361236
Box<dyn crate::access_plan_optimizer::AccessPlanOptimizer>,
12371237
> = if is_descending {

0 commit comments

Comments
 (0)