Commit 56e097a

perf: optimize scatter with type-specific specialization (#20498)
## Which issue does this PR close?

Related to #11570 (scatter optimization suggested in #19994 (comment))

## Rationale for this change

Profiling shows scatter consumes 50%+ of elapsed time in the "10% zeroes" divide-by-zero protection benchmark. The current implementation uses the generic MutableArrayData path for all array types.

## What changes are included in this PR?

Apply the same type-specific specialization strategy used in arrow-rs filter.rs to scatter:

- Selectivity-based iteration: set_slices() for high selectivity, set_indices() for low selectivity
- Type-specific dispatch via downcast_primitive_array! for primitives, boolean, bytes, byte views, dictionary, etc.
- Fast paths for all-true and all-false masks
- Fallback to MutableArrayData for unsupported types

## Are these changes tested?

Existing 4 scatter tests preserved. Additional tests will be added.

## Are there any user-facing changes?

No. Public API signature is unchanged.
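To illustrate the selectivity-based iteration and the all-true / all-false fast paths described above, here is a minimal sketch using plain slices rather than Arrow arrays. It is not the code from this commit: `scatter_sketch`, the local `set_slices` helper, and the `0.8` cutoff are hypothetical stand-ins chosen for illustration, and nulls are modelled with `Option<i32>` instead of a validity bitmap.

```rust
/// Illustrative cutoff (an assumption, not taken from this PR): above this
/// fraction of `true` mask values, copy contiguous runs instead of single
/// elements.
const SLICES_SELECTIVITY_THRESHOLD: f64 = 0.8;

/// Hypothetical helper: half-open `(start, end)` ranges of consecutive
/// `true` values in the mask.
fn set_slices(mask: &[bool]) -> Vec<(usize, usize)> {
    let mut runs = Vec::new();
    let mut start: Option<usize> = None;
    for (i, &m) in mask.iter().enumerate() {
        match (m, start) {
            (true, None) => start = Some(i),
            (false, Some(s)) => {
                runs.push((s, i));
                start = None;
            }
            _ => {}
        }
    }
    if let Some(s) = start {
        runs.push((s, mask.len()));
    }
    runs
}

/// Sketch of scatter for one primitive type: the output has the mask's
/// length, each `true` position takes the next value from `truthy`, and
/// each `false` position becomes null (`None`).
fn scatter_sketch(mask: &[bool], truthy: &[i32]) -> Vec<Option<i32>> {
    let true_count = mask.iter().filter(|&&m| m).count();
    assert!(truthy.len() >= true_count, "not enough truthy values");

    // Fast path: all-false mask yields an all-null output.
    if true_count == 0 {
        return vec![None; mask.len()];
    }
    // Fast path: all-true mask passes the values straight through.
    if true_count == mask.len() {
        return truthy[..true_count].iter().copied().map(Some).collect();
    }

    let mut out = vec![None; mask.len()];
    let selectivity = true_count as f64 / mask.len() as f64;

    if selectivity > SLICES_SELECTIVITY_THRESHOLD {
        // High selectivity: copy whole runs of consecutive `true` positions.
        let mut src = 0;
        for (start, end) in set_slices(mask) {
            let len = end - start;
            for (dst, v) in out[start..end].iter_mut().zip(&truthy[src..src + len]) {
                *dst = Some(*v);
            }
            src += len;
        }
    } else {
        // Low selectivity: visit only the individual `true` positions.
        let set_indices = mask
            .iter()
            .enumerate()
            .filter(|(_, &m)| m)
            .map(|(i, _)| i);
        for (src, idx) in set_indices.enumerate() {
            out[idx] = Some(truthy[src]);
        }
    }
    out
}
```

Both branches produce the same output; the choice only affects how much work is done per selected element. A real implementation would additionally dispatch on the array's data type and fall back to a generic path for types without a specialization.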
1 parent cdaecf0 commit 56e097a

1 file changed

Lines changed: 471 additions & 14 deletions

File tree

  • datafusion/physical-expr-common/src
