Add batch pass-through optimization to SortPreservingMergeExec #21335
Dandandan wants to merge 1 commit into apache:main
When the loser tree winner's entire remaining batch is strictly less than every other stream's current value, skip per-row loser-tree comparisons and emit the batch directly.

Two fast paths:
- Zero-copy: when the in_progress buffer is empty and the full batch qualifies, slice and return the RecordBatch without interleave
- Bulk-push: otherwise append all qualifying rows at once, avoiding O(remaining × log K) loser-tree work

The runner-up is found by walking the winner's loser-tree path (O(log K)), and the check is only performed at the start of each new batch to amortise cost.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
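The choice between the two fast paths can be illustrated with a small std-only model. This is a sketch under stated assumptions, not the code from this PR: `Emitted` and `pass_through` are hypothetical names, an `Rc<Vec<i64>>` stands in for a `RecordBatch`, and sharing the `Rc` stands in for a zero-copy slice.

```rust
use std::rc::Rc;

/// Hypothetical result of one pass-through step (not a DataFusion type).
#[derive(Debug, PartialEq)]
enum Emitted {
    /// The batch is handed over as-is: shared, not copied.
    ZeroCopy(Rc<Vec<i64>>),
    /// This many rows were appended to the in-progress buffer.
    BulkPushed(usize),
}

/// Emit the rows of `batch` starting at `offset`, assuming the caller has
/// already established that all of them precede every other stream.
fn pass_through(in_progress: &mut Vec<i64>, batch: &Rc<Vec<i64>>, offset: usize) -> Emitted {
    if in_progress.is_empty() && offset == 0 {
        // Nothing buffered and the full batch qualifies: share it directly.
        Emitted::ZeroCopy(Rc::clone(batch))
    } else {
        // Rows are already buffered: append the qualifying rows in one call
        // instead of pushing them one comparison at a time.
        let rows = &batch.as_slice()[offset..];
        in_progress.extend_from_slice(rows);
        Emitted::BulkPushed(rows.len())
    }
}
```

In this model the zero-copy path leaves the buffer untouched, while the bulk-push path costs a single slice append regardless of how many streams are being merged.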
run benchmarks
run benchmark sort_tpch
🤖 Benchmark running (GKE). Comparing batch-pass-through-spm (4feba22) to 1e93a67 (merge-base) using: clickbench_partitioned
🤖 Benchmark running (GKE). Comparing batch-pass-through-spm (4feba22) to 1e93a67 (merge-base) using: tpcds
🤖 Benchmark running (GKE). Comparing batch-pass-through-spm (4feba22) to 1e93a67 (merge-base) using: tpch
🤖 Criterion benchmark running (GKE). Comparing batch-pass-through-spm (4feba22) to 1e93a67 (merge-base)
Benchmark for this request failed.
run benchmark tpch_sort
🤖 Criterion benchmark running (GKE). Comparing batch-pass-through-spm (4feba22) to 1e93a67 (merge-base)
Benchmark for this request failed.
🤖 Benchmark completed (GKE): tpch (base at merge-base vs. branch)
🤖 Benchmark completed (GKE): clickbench_partitioned (base at merge-base vs. branch)
🤖 Benchmark completed (GKE): tpcds (base at merge-base vs. branch)
Which issue does this PR close?
N/A — performance optimization
Rationale for this change
`SortPreservingMergeExec` currently processes every row through the loser tree, even when an entire batch from the winning stream is strictly less than all other streams' current values. This is common for:
What changes are included in this PR?
Batch pass-through optimization: After the loser tree selects a winner, if the winner's cursor is at the start of a new batch, check whether the batch's last value is strictly less than the runner-up's current value (found via O(log K) loser-tree walk).
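The runner-up walk and the pass-through check can be sketched on plain integers. Everything here is an illustration rather than the code in `merge.rs`: the function names are hypothetical, the array layout (`tree[0]` is the winner, stream `i` enters at node `(K + i) / 2`, the parent of node `n` is `n / 2`) is an assumption of the sketch, and exhausted streams are not modelled. It relies on a general property of tournament trees: every node on the winner's leaf-to-root path stores a stream that lost directly to the winner, so the runner-up is the smallest of those.

```rust
/// Build a loser tree over the current value of each stream.
/// Ties are broken by stream index so the result is deterministic.
fn build_loser_tree(current: &[i64]) -> Vec<usize> {
    let k = current.len();
    let mut tree = vec![usize::MAX; k];
    for i in 0..k {
        let mut winner = i;
        let mut node = (k + i) / 2;
        // Play matches upward until reaching the root or an empty node.
        while node != 0 && tree[node] != usize::MAX {
            let challenger = tree[node];
            if (current[winner], winner) > (current[challenger], challenger) {
                tree[node] = winner; // the loser stays at this node
                winner = challenger;
            }
            node /= 2;
        }
        tree[node] = winner;
    }
    tree
}

/// Walk the winner's path (O(log K) nodes) and return the smallest loser.
fn runner_up(tree: &[usize], current: &[i64]) -> Option<usize> {
    let k = tree.len();
    if k < 2 {
        return None;
    }
    let mut node = (k + tree[0]) / 2;
    let mut best: Option<usize> = None;
    while node != 0 {
        let candidate = tree[node];
        if best.map_or(true, |b| (current[candidate], candidate) < (current[b], b)) {
            best = Some(candidate);
        }
        node /= 2;
    }
    best
}

/// True when the last remaining value of the winner's batch is strictly
/// less than the runner-up's current value (or there is no other stream).
fn can_pass_through(winner_rest: &[i64], tree: &[usize], current: &[i64]) -> bool {
    match (winner_rest.last(), runner_up(tree, current)) {
        (Some(last), Some(ru)) => *last < current[ru],
        (Some(_), None) => true,
        (None, _) => false,
    }
}
```

Because each batch is sorted, comparing only its last value against the runner-up is enough to cover every row in it. The comparison is strict, so a batch whose last value ties with the runner-up falls back to the per-row path.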
Two fast paths when the check succeeds:
- Zero-copy: When the `in_progress` buffer is empty and the full batch qualifies, `RecordBatch::slice` returns the batch directly, with no interleave and no row-level copying.
- Bulk-push: When partial rows are already buffered, append all qualifying rows at once via `push_rows`, skipping O(remaining × log K) loser-tree comparisons.
Files changed
- `cursor.rs`: Add `remaining()`, `last_cmp()`, `advance_by()`, `is_at_start()` to `Cursor`
- `builder.rs`: Add `push_rows()` (bulk append) and `take_batch_slice()` (zero-copy emission)
- `merge.rs`: Add `find_runner_up()`, `can_batch_pass_through()`, modify `poll_next_inner`
- `sort_preserving_merge.rs`: 3 new tests covering non-overlapping, partial overlap, and fetch limits
Are these changes tested?
Yes — 3 new test cases plus all 18 existing tests pass:
- `test_batch_pass_through_multi_batch`: Multiple non-overlapping batches per partition
- `test_batch_pass_through_with_fetch`: Fetch limit that cuts through a pass-through batch
- `test_batch_pass_through_partial_overlap`: Mixed overlap across 3 partitions
Are there any user-facing changes?
No API changes. Existing behavior is preserved — the optimization is strictly an internal fast path that produces identical output.
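The identical-output claim can be checked on a toy model. The sketch below is not the PR's implementation: it swaps the loser tree for a `std::collections::BinaryHeap`, represents each stream as a list of sorted `Vec<i64>` batches, and uses the hypothetical names `seek` and `merge`. It merges the same input with and without the pass-through check and counts heap pops as a rough proxy for comparison work.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// A stream is a list of sorted batches; the stream as a whole is sorted too.
type Stream = Vec<Vec<i64>>;

/// First valid (batch, row) position at or after the given one,
/// skipping exhausted or empty batches.
fn seek(stream: &Stream, mut b: usize, mut r: usize) -> Option<(usize, usize)> {
    while b < stream.len() {
        if r < stream[b].len() {
            return Some((b, r));
        }
        b += 1;
        r = 0;
    }
    None
}

/// Merge sorted streams; returns the merged rows and the number of heap pops.
fn merge(streams: &[Stream], pass_through: bool) -> (Vec<i64>, usize) {
    // Min-heap of (value, stream, batch, row) for each stream's current row.
    let mut heap = BinaryHeap::new();
    for (s, stream) in streams.iter().enumerate() {
        if let Some((b, r)) = seek(stream, 0, 0) {
            heap.push(Reverse((stream[b][r], s, b, r)));
        }
    }
    let (mut out, mut pops) = (Vec::new(), 0);
    while let Some(Reverse((value, s, b, r))) = heap.pop() {
        pops += 1;
        let batch = &streams[s][b];
        // Runner-up: smallest current value among the remaining streams.
        let runner_up = heap.peek().map(|Reverse((v, ..))| *v);
        let last = batch[batch.len() - 1];
        // Only checked at the start of a batch, and only with a strict "<".
        let whole = pass_through && r == 0 && runner_up.map_or(true, |ru| last < ru);
        let next_row = if whole {
            out.extend_from_slice(batch); // emit the full batch in one step
            batch.len()
        } else {
            out.push(value); // regular per-row path
            r + 1
        };
        if let Some((nb, nr)) = seek(&streams[s], b, next_row) {
            heap.push(Reverse((streams[s][nb][nr], s, nb, nr)));
        }
    }
    (out, pops)
}
```

With non-overlapping batches the pass-through variant needs one pop per batch instead of one per row. With overlapping batches it falls back to per-row work wherever the strict comparison fails, and the merged rows are the same either way.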
🤖 Generated with Claude Code