feat: replace ScalarValue with DynComparator for RANGE calculation#20014
Open
akoshchiy wants to merge 4 commits into apache:main
Which issue does this PR close?
Rationale for this change
As discussed in #15607, range calculation is one of the bottlenecks in window processing because of the many ScalarValue allocations and operations in the inner loop. This PR replaces ScalarValue with arrow's DynComparator for each frame bound; the comparators are precomputed once per batch.
The change improves performance by about 70-80% in queries with PARTITION BY on low-cardinality columns. On queries with high-cardinality partition columns, however, there is a 20% regression because of the overhead of precomputing comparators on small batches.
Benchmark results (Ryzen 9900X, Ubuntu 25.10)
window_query_sql.rs
h2o_medium_window_parquet
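To illustrate the idea from the rationale, here is a minimal std-only sketch of finding RANGE frame starts with a comparator that is built once per batch. It is not the code in this PR and does not use arrow's API: `RowComparator` is a hypothetical stand-in for a `DynComparator`-style boxed closure, and `make_bound_comparator` and `range_frame_starts` are illustrative names. It assumes an ascending `i64` ORDER BY column, a non-negative `delta`, and a frame of `RANGE BETWEEN delta PRECEDING AND CURRENT ROW`.

```rust
use std::cmp::Ordering;

// Hypothetical stand-in for a `DynComparator`-style closure: compares row `i`
// of a precomputed bound column against row `j` of the value column.
type RowComparator = Box<dyn Fn(usize, usize) -> Ordering>;

// Built once per batch, so the inner loop does no per-row value allocation.
fn make_bound_comparator(bounds: Vec<i64>, values: Vec<i64>) -> RowComparator {
    Box::new(move |i, j| bounds[i].cmp(&values[j]))
}

// For each row of an ascending `values` column, return the index of the first
// row inside `RANGE BETWEEN delta PRECEDING AND CURRENT ROW`.
// Assumes `delta >= 0`, so the bounds are ascending as well.
fn range_frame_starts(values: &[i64], delta: i64) -> Vec<usize> {
    // Precompute the lower bound of every row's frame for the whole batch.
    let bounds: Vec<i64> = values.iter().map(|v| v.saturating_sub(delta)).collect();
    let cmp = make_bound_comparator(bounds, values.to_vec());

    let mut start = 0;
    let mut starts = Vec::with_capacity(values.len());
    for i in 0..values.len() {
        // Skip rows whose value is still below this row's lower bound.
        // `start` never moves backwards because the bounds are ascending.
        while start < i && cmp(i, start) == Ordering::Greater {
            start += 1;
        }
        starts.push(start);
    }
    starts
}
```

For example, `range_frame_starts(&[1, 2, 4, 7, 8], 2)` returns `[0, 0, 1, 3, 3]`. The sketch also hints at the trade-off reported above: building the bound column and the closure is a fixed cost per batch, which is amortized well over large partitions but can dominate when partitions are very small.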
What changes are included in this PR?
Are these changes tested?
Covered by existing tests.
Are there any user-facing changes?
No