Commit b122ae2
feat: TopK cumulative RG pruning after reorder (works with WHERE)
Instead of threshold-based pruning (which fails with WHERE clauses
due to unknown qualifying row counts), use cumulative row counting:
after reorder + reverse, accumulate rows from the front until we
have enough for the TopK fetch limit (K), then prune the rest.
This works for sort pushdown with or without WHERE because it only
depends on row counts + RG ordering, not threshold values or types.
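
The cumulative counting described above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: `row_groups_to_keep` and its parameters are hypothetical names, and it assumes the row-group row counts are already in the order left by the reorder + reverse step.

```rust
/// Hypothetical helper (not a DataFusion API): given per-row-group row counts
/// in scan order, return how many leading row groups to keep so that their
/// cumulative row count reaches the TopK fetch limit `fetch` (K).
/// Row groups at or beyond the returned index would be pruned.
fn row_groups_to_keep(row_counts: &[usize], fetch: Option<usize>) -> usize {
    // Without a fetch limit there is nothing to prune against.
    let Some(k) = fetch else {
        return row_counts.len();
    };

    let mut seen: usize = 0;
    for (idx, &rows) in row_counts.iter().enumerate() {
        // Enough rows accumulated from the front: prune from here on.
        if seen >= k {
            return idx;
        }
        seen = seen.saturating_add(rows);
    }

    // The limit was never reached early, so every row group is kept.
    row_counts.len()
}
```

With row counts `[100, 50, 200]` and K = 120, the first two row groups cover the limit, so only the third would be pruned. The decision uses row counts and ordering only, which matches the commit's point that no threshold values or column types are involved.
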
Adds fetch field to DynamicFilterPhysicalExpr (set by SortExec) so
the parquet reader knows the TopK K value.
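
As a rough picture of how a fetch value could travel from the sort operator to the scan through a shared filter, here is a stand-in sketch. `SharedTopKFilter`, `set_fetch`, `fetch`, and `handoff` are hypothetical names for illustration; they are not the actual `DynamicFilterPhysicalExpr` API, whose exact shape is not shown on this page.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for a filter object shared between the sort
/// operator and the scan. Not DataFusion's `DynamicFilterPhysicalExpr`.
#[derive(Debug, Default)]
struct SharedTopKFilter {
    /// TopK limit (K); `None` until the sort side sets it.
    fetch: RwLock<Option<usize>>,
}

impl SharedTopKFilter {
    /// Called by the sort side once it knows K.
    fn set_fetch(&self, k: usize) {
        *self.fetch.write().unwrap() = Some(k);
    }

    /// Called by the scan side to learn K, if it has been set.
    fn fetch(&self) -> Option<usize> {
        *self.fetch.read().unwrap()
    }
}

/// Simulates the handoff: the sort side sets K, the scan side reads it
/// through its own handle to the same shared filter.
fn handoff(k: usize) -> Option<usize> {
    let sort_side = Arc::new(SharedTopKFilter::default());
    let scan_side = Arc::clone(&sort_side);
    sort_side.set_fetch(k);
    scan_side.fetch()
}
```

Keeping the value optional lets the scan fall back to reading every row group when no TopK limit applies.
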
Keeps stats init for the no-WHERE sort pushdown case (20-58x speedup)
as a complementary optimization that also helps cross-file pruning
via the shared DynamicFilter.
1 parent a7b4e25 commit b122ae2
4 files changed
Lines changed: 62 additions & 0 deletions
File tree
- datafusion
- datasource-parquet/src
- physical-expr/src/expressions
- physical-plan/src/sorts
Lines changed: 12 additions & 0 deletions