Commit 09b7d33
Disable prune_by_limit when output ordering is required

When a parquet scan has output ordering constraints (e.g. ORDER BY + LIMIT),
prune_by_limit must not discard partially-matched row groups. A
partially-matched group may contain rows that sort before any fully-matched
group, so skipping it returns incorrect results.

This matches the fix in upstream DataFusion (apache#21190) where
prune_by_limit is guarded by preserve_order.

Add preserve_order: bool to ParquetOpener. Set to true when output_ordering
is non-empty. Guard prune_by_limit with !preserve_order.

1 parent 939ab61 commit 09b7d33
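The reasoning in the commit message can be illustrated with a small, self-contained Rust sketch. It is not the DataFusion implementation: `RowGroup`, `Match`, `select_row_groups`, `top_n`, and `sample_groups` are hypothetical names invented for illustration, and `prune_by_limit` here is a deliberately simplified model of limit pruning. Only the names `prune_by_limit` and `preserve_order` come from the commit message.

```rust
/// How a row group's statistics relate to the scan predicate (hypothetical model).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Match {
    /// Statistics prove that every row satisfies the predicate.
    Full,
    /// Only some rows may satisfy the predicate.
    Partial,
}

/// Hypothetical stand-in for a parquet row group: its match status and the
/// sort-key values of the rows that actually pass the predicate.
#[derive(Clone, Debug)]
struct RowGroup {
    status: Match,
    matching_keys: Vec<i64>,
}

/// Simplified limit pruning: if fully matched groups alone supply at least
/// `limit` rows, keep only as many of them as needed and drop everything else.
/// Otherwise keep all groups.
fn prune_by_limit(groups: &[RowGroup], limit: usize) -> Vec<RowGroup> {
    let mut kept = Vec::new();
    let mut rows = 0;
    for g in groups.iter().filter(|g| g.status == Match::Full) {
        if rows >= limit {
            break;
        }
        rows += g.matching_keys.len();
        kept.push(g.clone());
    }
    if rows >= limit { kept } else { groups.to_vec() }
}

/// The guard described in the commit message: limit pruning runs only when
/// the scan does not have to preserve an output ordering.
fn select_row_groups(
    groups: &[RowGroup],
    limit: Option<usize>,
    preserve_order: bool,
) -> Vec<RowGroup> {
    match limit {
        Some(n) if !preserve_order => prune_by_limit(groups, n),
        _ => groups.to_vec(),
    }
}

/// Models `ORDER BY key ASC LIMIT n` over the rows of the surviving groups.
fn top_n(groups: &[RowGroup], n: usize) -> Vec<i64> {
    let mut keys: Vec<i64> = groups
        .iter()
        .flat_map(|g| g.matching_keys.iter().copied())
        .collect();
    keys.sort_unstable();
    keys.truncate(n);
    keys
}

/// A partially matched group holding the smallest keys, followed by a fully
/// matched group holding larger keys.
fn sample_groups() -> Vec<RowGroup> {
    vec![
        RowGroup { status: Match::Partial, matching_keys: vec![1, 2] },
        RowGroup { status: Match::Full, matching_keys: vec![10, 11, 12] },
    ]
}
```

With `sample_groups()` and a limit of 2, calling `prune_by_limit` directly keeps only the fully matched group, so `top_n` yields keys 10 and 11 even though the correct answer is 1 and 2. Calling `select_row_groups` with `preserve_order` set to true keeps both groups and `top_n` returns the correct rows. With `preserve_order` false the pruning still applies, which is fine when any 2 matching rows are acceptable.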
2 files changed
Lines changed: 14 additions & 2 deletions
| File | Hunk | Removed (original lines) | Added (new lines) |
|---|---|---|---|
| 1 | 1 | | 121-124 |
| 1 | 2 | | 269 |
| 1 | 3 | 526-527 | 531-535 |
| 1 | 4 | | 1087 |
| 1 | 5 | | 1113 |
| 1 | 6 | | 1221 |
| 2 | 1 | | 570 |