
feat: replace ScalarValue with DynComparator for RANGE calculation #20014

Open
akoshchiy wants to merge 4 commits into apache:main from akoshchiy:15607-260125-window-range-calculation-optimization

Conversation

@akoshchiy
Contributor

@akoshchiy akoshchiy commented Jan 26, 2026

Which issue does this PR close?

Rationale for this change

As discussed in #15607, range calculation is one of the bottlenecks in window processing because of the many ScalarValue allocations and operations in the inner loop. This PR replaces ScalarValue with arrow's DynComparator for each frame bound; the comparators are precomputed once per batch.
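
To illustrate the idea, here is a minimal, std-only sketch, not the code from this PR. It assumes an ascending `i64` ORDER BY column and a `RANGE BETWEEN delta PRECEDING AND CURRENT ROW` frame. The local `DynComparator` alias is a stand-in modeled on arrow's boxed comparator closure, and `make_start_comparator` and `range_frame_starts` are hypothetical names introduced only for this example.

```rust
use std::cmp::Ordering;

/// Stand-in for arrow's `DynComparator`: a boxed closure comparing
/// row `i` of one array with row `j` of another (assumption for this sketch).
type DynComparator = Box<dyn Fn(usize, usize) -> Ordering + Send + Sync>;

/// Hypothetical helper: builds the comparator once per batch.
/// `values` is the ORDER BY column; `bounds[j]` is the precomputed
/// frame-start bound for row `j`, i.e. `values[j] - delta`.
fn make_start_comparator(values: Vec<i64>, delta: i64) -> DynComparator {
    let bounds: Vec<i64> = values.iter().map(|v| v.saturating_sub(delta)).collect();
    Box::new(move |i: usize, j: usize| values[i].cmp(&bounds[j]))
}

/// Frame start index for every row. The inner loop only calls the
/// comparator with two indices, so no per-row scalar value is allocated.
fn range_frame_starts(values: &[i64], delta: i64) -> Vec<usize> {
    let cmp = make_start_comparator(values.to_vec(), delta);
    let mut starts = Vec::with_capacity(values.len());
    let mut start = 0;
    for row in 0..values.len() {
        // Advance while values[start] < values[row] - delta.
        while start < row && cmp(start, row) == Ordering::Less {
            start += 1;
        }
        starts.push(start);
    }
    starts
}
```

With the column `[1, 2, 4, 7, 8, 12]` and `delta = 3`, `range_frame_starts` returns `[0, 0, 0, 2, 3, 5]`. Building the comparator has a fixed cost per batch, which is consistent with the regression on small batches described below.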

The change improves performance by about 70-80% on queries that use PARTITION BY on low-cardinality columns. On queries with high-cardinality partition columns, however, there is a regression of about 20% because of the overhead of precomputing comparators on small batches.

Benchmark results (Ryzen 9900X, Ubuntu 25.10)

window_query_sql.rs

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query                                                          ┃      main ┃ 15607-260125-window-range-calculation-optimization ┃        Change ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ window empty over, aggregate functions                         │   6.01 ms │                                           6.19 ms  │  1.03x slower │
│ window empty over, built-in functions                          │ 107.45 ms │                                         107.52 ms  │     no change │
│ window order by, aggregate functions                           │ 692.38 ms │                                         244.96 ms  │ +2.83x faster │
│ window order by, built-in functions                            │ 600.14 ms │                                         143.65 ms  │ +4.18x faster │
│ window partition by, u64_wide, aggregate functions             │ 323.13 ms │                                         328.65 ms  │     no change │
│ window partition by, u64_narrow, aggregate functions           │   9.09 ms │                                           9.36 ms  │  1.03x slower │
│ window partition by, u64_wide, built-in functions              │ 349.49 ms │                                         359.59 ms  │  1.03x slower │
│ window partition by, u64_narrow, built-in functions            │  30.76 ms │                                          31.95 ms  │  1.04x slower │
│ window partition and order by, u64_wide, aggregate functions   │ 495.89 ms │                                         585.72 ms  │  1.18x slower │
│ window partition and order by, u64_narrow, aggregate functions │ 133.34 ms │                                          63.84 ms  │ +2.09x faster │
│ window partition and order by, u64_wide, built-in functions    │ 490.42 ms │                                         601.34 ms  │  1.23x slower │
│ window partition and order by, u64_narrow, built-in functions  │ 107.62 ms │                                          37.80 ms  │ +2.85x faster │
└────────────────────────────────────────────────────────────────┴───────────┴────────────────────────────────────────────────────┴───────────────┘
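
For reading the Change column, the following sketch shows how a speedup factor relates to the share of run time saved. The function names `speedup` and `time_saved_pct` are made up for this illustration and are not part of the benchmark tooling.

```rust
/// Ratio of the main-branch time to the PR-branch time.
/// Values above 1.0 mean the PR branch is faster.
fn speedup(main_ms: f64, branch_ms: f64) -> f64 {
    main_ms / branch_ms
}

/// Percentage of the main-branch run time saved by the PR branch.
/// Negative values indicate a regression.
fn time_saved_pct(main_ms: f64, branch_ms: f64) -> f64 {
    (1.0 - branch_ms / main_ms) * 100.0
}
```

For the "window order by, built-in functions" row (600.14 ms versus 143.65 ms), this gives a speedup of about 4.18 and roughly 76% of the run time saved. For the "window partition and order by, u64_wide, aggregate functions" row (495.89 ms versus 585.72 ms), the speedup is about 0.85, which the table reports as 1.18x slower.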

h2o_medium_window_parquet

┏━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃         main ┃ 15607-260125-window-range-calculation-optimization ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │    600.78 ms │                                          634.67 ms │  1.06x slower │
│ QQuery 2  │  19674.64 ms │                                         8248.85 ms │ +2.39x faster │
│ QQuery 3  │  13663.79 ms │                                        13705.98 ms │     no change │
│ QQuery 4  │   2706.63 ms │                                         2204.70 ms │ +1.23x faster │
│ QQuery 5  │   9174.88 ms │                                         9359.88 ms │     no change │
│ QQuery 6  │  16570.43 ms │                                        16269.44 ms │     no change │
│ QQuery 7  │  14184.85 ms │                                        14667.06 ms │     no change │
│ QQuery 8  │ 106938.60 ms │                                        81663.04 ms │ +1.31x faster │
│ QQuery 9  │   2332.07 ms │                                         2395.36 ms │     no change │
│ QQuery 10 │   2402.58 ms │                                         2506.51 ms │     no change │
│ QQuery 11 │   2444.16 ms │                                         2377.45 ms │     no change │
│ QQuery 12 │   4985.60 ms │                                         2696.91 ms │ +1.85x faster │
└───────────┴──────────────┴────────────────────────────────────────────────────┴───────────────┘

What changes are included in this PR?

Are these changes tested?

Covered by existing tests.

Are there any user-facing changes?

No

@github-actions github-actions Bot added logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates core Core DataFusion crate labels Jan 26, 2026
@akoshchiy akoshchiy changed the title feat: window RANGE calculation optimization feat: window RANGE calculation optimization [WIP] Jan 26, 2026
@akoshchiy akoshchiy force-pushed the 15607-260125-window-range-calculation-optimization branch from ace5efa to b25dba0 Compare March 7, 2026 18:54
@github-actions github-actions Bot added common Related to common crate physical-plan Changes to the physical-plan crate optimizer Optimizer rules proto Related to proto crate labels Mar 7, 2026
@akoshchiy akoshchiy force-pushed the 15607-260125-window-range-calculation-optimization branch from d844df0 to ab6be02 Compare March 22, 2026 13:31
@github-actions github-actions Bot removed optimizer Optimizer rules proto Related to proto crate labels Mar 22, 2026
@akoshchiy akoshchiy force-pushed the 15607-260125-window-range-calculation-optimization branch from ab6be02 to 7bb5d3f Compare April 4, 2026 14:02
@github-actions github-actions Bot removed common Related to common crate physical-plan Changes to the physical-plan crate labels Apr 4, 2026
@akoshchiy akoshchiy changed the title feat: window RANGE calculation optimization [WIP] feat: replace ScalarValue with DynComparator for RANGE calculation Apr 4, 2026
@akoshchiy akoshchiy marked this pull request as ready for review April 4, 2026 14:30

Labels

core Core DataFusion crate logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates

1 participant