Commit 310dd5d
Support Dictionary Arrays in MIN/MAX Aggregates (#21315)
## Which issue does this PR close?
* Closes #21150.
---
## Rationale for this change
The existing implementation of `min`/`max` does not correctly support
dictionary-encoded arrays. Previously, dictionary arrays were handled by
directly evaluating their underlying values array, which is semantically
incorrect because:
* It may include unreferenced values that do not appear in the logical
dataset
* It ignores nulls in the key array
* It does not preserve dictionary key semantics in scalar results
This leads to incorrect aggregation results for dictionary types.
This PR introduces a logical row-based evaluation for dictionary arrays
and ensures scalar comparisons correctly unwrap and rewrap dictionary
values when needed.
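The gap between scanning the values buffer and scanning logical rows can be shown with a small, self-contained sketch. `ToyDictionary` and its methods are hypothetical stand-ins written for illustration. They are not Arrow or DataFusion types and do not reproduce the code in this commit.

```rust
/// Toy stand-in for a dictionary-encoded array (hypothetical, not the Arrow type).
/// Each row stores an optional index into `values`.
struct ToyDictionary {
    keys: Vec<Option<usize>>,
    values: Vec<Option<i32>>,
}

impl ToyDictionary {
    /// Logical view: resolve each row's key to its value.
    /// A null key or a null dictionary value yields a null row.
    fn logical_rows(&self) -> impl Iterator<Item = Option<i32>> + '_ {
        self.keys.iter().map(move |k| k.and_then(|i| self.values[i]))
    }

    /// Values-based approach: take the minimum of the values buffer and ignore the keys.
    fn min_over_values(&self) -> Option<i32> {
        self.values.iter().flatten().copied().min()
    }

    /// Row-based approach: take the minimum over logical rows, skipping nulls.
    fn min_over_rows(&self) -> Option<i32> {
        self.logical_rows().flatten().min()
    }
}

fn main() {
    // The value 1 sits in the dictionary but no key references it,
    // and the second row has a null key.
    let dict = ToyDictionary {
        keys: vec![Some(1), None, Some(2)],
        values: vec![Some(1), Some(5), Some(9)],
    };
    assert_eq!(dict.min_over_values(), Some(1)); // picks the unreferenced value
    assert_eq!(dict.min_over_rows(), Some(5)); // reflects the logical data
}
```

In this toy model the values-based minimum reports a value that never appears in any row, which is the first failure mode listed above.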
---
## What changes are included in this PR?
* Add logical row-based min/max computation (`scalar_row_extreme`) for:
  * Dictionary arrays
  * Struct, List, LargeList, and FixedSizeList types
* Replace previous dictionary handling that operated on `values()` with
correct row-wise evaluation
* Introduce `requires_logical_row_scan` to centralize fallback logic for
complex types
* Enhance scalar comparison logic:
  * Unwrap dictionary scalars before comparison
  * Rewrap results when both inputs are dictionaries with matching key
    types
  * Validate key type compatibility
* Improve error messaging for incompatible scalar comparisons
* Remove obsolete `min_max_batch_generic` implementation
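The unwrap, compare, and rewrap flow for scalars might look roughly like the sketch below. `ToyScalar`, `KeyType`, `unwrap_dictionary`, and `toy_max` are hypothetical names invented for this example, and returning an error on mismatched key types is an assumption of the sketch. The commit's actual handling of that case is not shown on this page.

```rust
/// Hypothetical key types for the toy model.
#[derive(Debug, Clone, PartialEq)]
enum KeyType {
    Int8,
    Int32,
}

/// Hypothetical scalar: either a plain integer or a dictionary-wrapped scalar.
#[derive(Debug, Clone, PartialEq)]
enum ToyScalar {
    Int(i64),
    Dictionary(KeyType, Box<ToyScalar>),
}

/// Strip one level of dictionary wrapping, remembering the key type if present.
fn unwrap_dictionary(s: &ToyScalar) -> (&ToyScalar, Option<&KeyType>) {
    match s {
        ToyScalar::Dictionary(key_type, inner) => (&**inner, Some(key_type)),
        other => (other, None),
    }
}

/// Compare the inner values, then rewrap only when both inputs were
/// dictionaries with the same key type.
fn toy_max(a: &ToyScalar, b: &ToyScalar) -> Result<ToyScalar, String> {
    let (inner_a, key_a) = unwrap_dictionary(a);
    let (inner_b, key_b) = unwrap_dictionary(b);

    let winner = match (inner_a, inner_b) {
        (ToyScalar::Int(x), ToyScalar::Int(y)) => ToyScalar::Int((*x).max(*y)),
        _ => return Err(format!("cannot compare {:?} with {:?}", inner_a, inner_b)),
    };

    match (key_a, key_b) {
        (Some(ka), Some(kb)) if ka == kb => {
            Ok(ToyScalar::Dictionary(ka.clone(), Box::new(winner)))
        }
        // Assumption of this sketch: mismatched key types are rejected.
        (Some(ka), Some(kb)) => Err(format!(
            "mismatched dictionary key types: {:?} vs {:?}",
            ka, kb
        )),
        // At most one side was a dictionary: return the plain value.
        _ => Ok(winner),
    }
}

fn main() {
    let a = ToyScalar::Dictionary(KeyType::Int32, Box::new(ToyScalar::Int(3)));
    let b = ToyScalar::Dictionary(KeyType::Int32, Box::new(ToyScalar::Int(8)));
    println!("{:?}", toy_max(&a, &b));
}
```

Keeping the rewrap step separate from the comparison means the comparison itself only ever sees plain values, so dictionary and non-dictionary inputs share one code path.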
---
## Are these changes tested?
Yes. Comprehensive tests have been added to validate correctness across
multiple scenarios:
* Basic dictionary min/max behavior
* Handling of null keys and null values
* Ignoring unreferenced dictionary values
* Multi-batch aggregation behavior
* Float dictionary handling including `NaN` and infinities
These tests ensure correctness and guard against regressions.
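As a rough illustration of the multi-batch scenario, the sketch below folds a running maximum over several toy batches, each carrying its own dictionary. `ToyBatch`, `batch_max`, and `running_max` are hypothetical names for this example only and are not taken from the tests in this commit.

```rust
/// Hypothetical toy batch: (keys, dictionary values). Not an Arrow type.
type ToyBatch = (Vec<Option<usize>>, Vec<Option<i32>>);

/// Maximum over the logical rows of one batch, skipping null keys and null values.
fn batch_max(batch: &ToyBatch) -> Option<i32> {
    let (keys, values) = batch;
    keys.iter().filter_map(|k| k.and_then(|i| values[i])).max()
}

/// Running maximum across batches. Each batch is resolved to plain values
/// first, because key indices from different dictionaries are not comparable.
fn running_max(batches: &[ToyBatch]) -> Option<i32> {
    batches.iter().filter_map(batch_max).max()
}

fn main() {
    let batches: Vec<ToyBatch> = vec![
        // 100 is in the dictionary but never referenced.
        (vec![Some(0), Some(0)], vec![Some(3), Some(100)]),
        // 50 is unreferenced; the first row has a null key.
        (vec![None, Some(1)], vec![Some(50), Some(7)]),
        // A batch whose rows are all null contributes nothing.
        (vec![None], vec![Some(999)]),
    ];
    assert_eq!(running_max(&batches), Some(7));
}
```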
---
## Are there any user-facing changes?
Yes. The behavior of `min` and `max` on dictionary-encoded arrays is
now:
* Correct and semantically aligned with logical row values
* Consistent with other data types
Previously incorrect results may now differ, which is a correctness fix
rather than a breaking API change.
---
## LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated
content has been manually reviewed and tested.
1 parent f802ed1 commit 310dd5d
2 files changed
Lines changed: 329 additions & 29 deletions