Commit 6d3a846
## Which issue does this PR close?
- Part of #18489
- Closes #20180
- Closes #15524
- Replaces #20665
## Rationale for this change
I [want DataFusion to be the fastest parquet engine on
ClickBench](#18489). One of
the queries where DataFusion is significantly slower is Query 29, which
has a very strange pattern of many aggregate functions whose arguments
are offset by a constant:
https://github.com/apache/datafusion/blob/0ca9d6586a43c323525b2e299448e0f1af4d6195/benchmarks/queries/clickbench/queries/q29.sql#L4
This is not a pattern I have ever seen in a real query, but it seems
like the engine currently at the top of the ClickBench leaderboard has a
special case for this pattern. ClickHouse probably does too. See
- duckdb/duckdb#15017
- Discussion on #15524
Thus I reluctantly conclude that we should have one too.
## What changes are included in this PR?
This is an alternative to my first attempt:
- #20665
In particular, since this is such a ClickBench-specific rule, I wanted
to
1. Minimize the downstream API / upgrade impact (i.e. not change existing
APIs)
2. Optimize performance for the case where this rewrite will not apply
(most of the time)
This PR:
1. Adds a rewrite `SUM(expr + scalar)` --> `SUM(expr) +
scalar*COUNT(expr)`
2. Adds tests for the rewrite
Note there are quite a few other ideas on #15524 for potentially making
this more general, but I am
going with the simple approach of making it work for the use case we have in
hand (ClickBench).
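
To illustrate why the rewrite `SUM(expr + scalar)` --> `SUM(expr) + scalar*COUNT(expr)` preserves results, including when NULLs are present, here is a minimal sketch. It uses Python's built-in `sqlite3` module rather than DataFusion, and the helper `both_forms` and the table `t` are hypothetical names chosen for this example. It only checks the algebraic identity on small integer inputs; it says nothing about overflow or floating point behavior, and it is not the optimizer rule itself.

```python
import sqlite3


def both_forms(values, scalar):
    """Return (original, rewritten) results of SUM(x + scalar) over `values`.

    `values` may contain None, which SQLite stores as NULL.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])

    # Original form: the addition is evaluated once per row.
    original = conn.execute(
        "SELECT SUM(x + ?) FROM t", (scalar,)
    ).fetchone()[0]

    # Rewritten form: one SUM and one COUNT, combined once at the end.
    # COUNT(x) skips NULLs, matching the rows that SUM(x + scalar) adds up.
    rewritten = conn.execute(
        "SELECT SUM(x) + ? * COUNT(x) FROM t", (scalar,)
    ).fetchone()[0]

    conn.close()
    return original, rewritten


if __name__ == "__main__":
    print(both_forms([1920, None, 1366, 1024], 3))  # NULL row is ignored by both forms
    print(both_forms([None, None], 3))              # all NULL: both forms yield NULL
```

When a query contains many such aggregates over the same `expr` with different constants, every rewritten aggregate refers to the same `SUM(expr)` and `COUNT(expr)`, so an engine that shares common aggregates can compute just those two instead of one aggregate per constant.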
## Are these changes tested?
Yes, new tests are added
## Are there any user-facing changes?
Faster performance
🚀
```
│ QQuery 29 │ 1012.63 ms │ 139.02 ms │ +7.28x faster │
```
1 parent 5db04b8 commit 6d3a846
9 files changed
Lines changed: 574 additions & 44 deletions
File tree
- datafusion
- expr/src
- logical_plan
- functions-aggregate/src
- optimizer/src/simplify_expressions
- sqllogictest/test_files