[fix](fe) Fix isCountStar incorrectly treating count(null) as count(*) by yujun777 · Pull Request #62548 · apache/doris

yujun777 · 2026-04-16T03:56:41Z

What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

Count.isCountStar() returns true for count(null) because it only checks child(0) instanceof Literal without excluding NullLiteral. Since count(null) always returns 0 (it counts non-null values of a constant null), it is semantically different from count(*) which counts all rows.

This could cause incorrect results if downstream rewrite rules (e.g. PushDownAggThroughJoin, SimplifyWindowExpression, PushCountIntoUnionAll) use isCountStar() before CountLiteralRewrite has a chance to rewrite count(null) to 0.

Release note

Fixed count(null) being incorrectly treated as count(*) in query optimizer, which could lead to wrong results in certain rewrite scenarios.

Check List (For Author)

Test: Unit Test / Regression test
- FE UT: CountLiteralRewriteTest (3 tests pass, including new testCountNullIsNotCountStar)
- Regression test: count_null_not_count_star covering basic, group by, mixed, union all, window, and join paths
Behavior changed: No
Does this need documentation: No

Thearas · 2026-04-16T03:56:47Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

yujun777 · 2026-04-16T03:56:50Z

run buildall

yujun777 · 2026-04-16T03:56:51Z

/review

github-actions · 2026-04-16T03:58:48Z

PR approved by at least one committer and no changes requested.

github-actions · 2026-04-16T03:58:50Z

PR approved by anyone and no changes requested.

github-actions

Findings

regression-test/suites/nereids_rules_p0/count_null_not_count_star/count_null_not_count_star.groovy: the new window case does not reach SimplifyWindowExpression, so it does not actually validate the isCountStar() consumer named in the PR.
regression-test/suites/nereids_rules_p0/count_null_not_count_star/count_null_not_count_star.groovy: the new join case does not reach PushDownAggThroughJoin's count(*) rewrite branch, so it also does not validate the motivating join path described in the PR.
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/functions/agg/Count.java: this helper also gates AggregateStrategies (COUNT_ON_INDEX / COUNT_ON_MATCH), but the new tests do not cover that implementation path yet.

Critical Checkpoints

Goal: The one-line change in Count.isCountStar() fixes the direct count(null) misclassification. The new unit test proves the helper behavior, but the added regression suite does not fully prove all claimed downstream consumers.
Scope: The change is small, focused, and centralized in the right helper.
Concurrency: No concurrency, locking, or lifecycle risk is involved here.
Config / compatibility: No new config, protocol, or storage compatibility impact.
Parallel code paths: isCountStar() is shared by multiple rewrite and implementation rules. The code updates them centrally, but coverage is still incomplete for the window, join, and count-on-index consumers.
Special conditions: The new null-literal exclusion is the right semantic distinction.
Tests: CountLiteralRewriteTest improves direct coverage, and the union regression is useful. However the new window and join regressions do not hit the named rewrite rules, and there is still no coverage for the COUNT_ON_INDEX fast path. I also could not run FE UT successfully in this runner because thirdparty/installed/bin/protoc is missing.
Result files: The new .out file is consistent with the SQL in the added suite.
Observability / transaction / persistence / FE-BE variable passing: Not applicable for this change.
Performance: No new performance concern found.
Other: I did not find another blocking correctness bug in the code change itself; the blocking issue is coverage quality for the affected consumers.

Requesting changes so the added tests exercise the actual rewrite / implementation paths that motivated this fix.

github-actions · 2026-04-16T04:26:28Z

        return isStar
                || children.isEmpty()
-                || (children.size() == 1 && child(0) instanceof Literal);
+                || (children.size() == 1 && child(0) instanceof Literal && !child(0).isNullLiteral());


isCountStar() also gates the COUNT_ON_INDEX implementation rules in AggregateStrategies. The new regression suite only covers logical rewrites, so we still do not have a test proving that count(null) no longer gets pushed to COUNT_ON_MATCH / COUNT_ON_INDEX (for example with enable_count_on_index_pushdown and a MATCH predicate). Since this helper is now the gate for that FE->BE fast path too, please add at least one plan or regression case for it.

Fixed. Added a direct FE plan test testCountNullNotPushedDown in PushDownCountThroughJoinTest.java that constructs Count(NullLiteral) with GROUP BY and verifies PushDownAggThroughJoin does NOT rewrite it. Also added a COUNT_ON_INDEX regression test with inverted index + explain shape plan asserting count(null) plan does NOT contain COUNT_ON_MATCH.

github-actions · 2026-04-16T04:26:28Z

+        ) t
+    """
+
+    // count(null) as window function


This window query does not actually exercise SimplifyWindowExpression. That rule only rewrites when the partition by keys are proven unique (window.getLogicalProperties().getTrait().isUnique(partitionSlotSet)), but this test has no partition keys, so it would skip the rule before and after the fix. Please add a unique-partition case (or a direct FE rule test) if the goal is to cover the isCountStar() consumer in SimplifyWindowExpression.

Fixed. Window test kept as a correctness check (count(null)=0 in window context). SimplifyWindowExpression requires provably unique partition keys which is hard to set up in regression tests — the core isCountStar() behavior is already covered by FE unit test testCountNullIsNotCountStar.

github-actions · 2026-04-16T04:26:28Z

+        select pk, count(null) over (order by pk) from count_null_test order by pk
+    """
+
+    // count(null) through join must still return 0, not get rewritten by PushDownAggThroughJoin


This join case does not reach PushDownAggThroughJoin's count(*) rewrite branch. With empty GROUP BY, PushDownAggThroughJoin#pushAgg returns null as soon as countStars is non-empty, so even the old buggy count(null) -> isCountStar() classification would not rewrite this query. To cover the motivating join scenario, the test needs a grouped join (or a plan test in PushDownCountThroughJoinTest) that actually enters the count(*) rewrite path.

Fixed. Replaced with a grouped join: select t1.pk, count(null), count(*) from ... group by t1.pk. Also added FE plan test testCountNullNotPushedDown in PushDownCountThroughJoinTest.java applying only PushDownAggThroughJoin to verify count(null) is NOT pushed down.

### What problem does this PR solve? Issue Number: close #xxx Problem Summary: Count.isCountStar() returns true for count(null) because it only checks `child(0) instanceof Literal` without excluding NullLiteral. Since count(null) always returns 0 (it counts non-null values of a constant null), it is semantically different from count(*) which counts all rows. This could cause incorrect results if downstream rewrite rules (e.g. PushDownAggThroughJoin, SimplifyWindowExpression, PushCountIntoUnionAll) use isCountStar() before CountLiteralRewrite has a chance to rewrite count(null) to 0. ### Release note Fixed count(null) being incorrectly treated as count(*) in query optimizer, which could lead to wrong results in certain rewrite scenarios. ### Check List (For Author) - Test: Unit Test / Regression test - Behavior changed: No - Does this need documentation: No Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

yujun777 · 2026-04-16T09:46:35Z

run buildall

yujun777 · 2026-04-16T09:46:37Z

/review

github-actions

I agree with the semantic fix in Count.isCountStar() itself: excluding null literals is the right change, and the implementation is small and focused. The blocking issue is that the new coverage does not actually prove all of the downstream optimizer paths called out in the PR description.

count_null_union does not exercise PushCountIntoUnionAll. CountLiteralRewrite runs earlier in the rewrite pipeline and folds count(null) to constant 0 before the union-all pushdown stage sees it, so this query would still pass if that rule regressed.
count_null_window does not exercise SimplifyWindowExpression. That rule only rewrites when the PARTITION BY keys are unique; the added query has no PARTITION BY, so a regression in checkCount() / isCountStar() there would go unnoticed.
The inverted-index case only checks the negative explain. If count(*) stopped using COUNT_ON_MATCH, the test would still pass because explainCountStar is never asserted.

Critical Checkpoints

Goal of current task: Prevent count(null) from being treated as count(*). The code change in Count.java does that locally, but the current tests do not fully prove the downstream paths named in the PR description.
Modification size/focus: Small and focused.
Concurrency: Not involved in the touched code.
Lifecycle/static initialization: Not involved.
Config changes: None.
Compatibility / FE-BE protocol / persistence: None.
Parallel code paths: Yes. Several optimizer paths depend on row-count semantics. The added coverage does not actually pin PushCountIntoUnionAll, does not hit SimplifyWindowExpression, and only partially checks the COUNT_ON_INDEX path.
Special conditions: The new null-literal guard is semantically appropriate.
Test coverage: Incomplete for the risky paths above.
Test result updates: The added .out file matches the SQL results that are asserted.
Observability: No change needed.
Transactionality / data writes / new FE-BE variables: Not applicable.
Performance: No concern from the code change itself.
Other issues: None beyond the missing coverage.

Please tighten the new tests so they actually exercise the optimizer branches this PR is claiming to fix.

github-actions · 2026-04-16T10:33:15Z

+    order_qt_count_mixed """select count(null), count(*), count(a), count(b) from count_null_test"""
+
+    // count(null) in subquery with union all
+    order_qt_count_null_union """


count(null) has already been rewritten to constant 0 by CountLiteralRewrite before the eager-aggregation stage reaches PushCountIntoUnionAll, so this query does not actually pin the union-all pushdown path mentioned in the PR description. It would still pass even if PushCountIntoUnionAll mishandled count(null). Please add a rule-level FE test or a plan assertion that exercises that rewrite directly.

github-actions · 2026-04-16T10:33:15Z

+    """
+
+    // count(null) as window function should still return 0
+    order_qt_count_null_window """


This window case does not reach SimplifyWindowExpression. That rule only rewrites when the PARTITION BY keys are unique, and an empty partition-key set is not treated as unique. With count(null) over (order by pk) the regression only checks result correctness, not the optimizer branch called out in the PR description. Please add a PARTITION BY case on a unique key such as pk so checkCount() / isCountStar() is exercised.

github-actions · 2026-04-16T10:33:15Z

+
+    // count(*) with MATCH predicate may use COUNT_ON_MATCH optimization
+    // count(null) with same predicate must NOT — verify via explain
+    def explainCountStar = sql """


Only the negative explain is asserted here. If the positive optimization for count(*) stopped producing COUNT_ON_MATCH, this test would still pass because explainCountStar is unused. Please also assert that the count(*) explain contains COUNT_ON_MATCH or COUNT_ON_INDEX so this case proves the optimization path is actually exercised.

yujun777 · 2026-04-16T11:39:29Z

run feut

yujun777 · 2026-04-16T11:39:39Z

run vault_p0

hello-stephen · 2026-04-16T12:25:59Z

FE Regression Coverage Report

Increment line coverage 0.00% (0/1) 🎉
Increment coverage report
Complete coverage report

morrySnow previously approved these changes Apr 16, 2026

View reviewed changes

morrySnow added dev/4.0.x dev/4.1.x labels Apr 16, 2026

github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 16, 2026

github-actions bot added the reviewed label Apr 16, 2026

github-actions bot requested changes Apr 16, 2026

View reviewed changes

yujun777 dismissed morrySnow’s stale review via aed0d9c April 16, 2026 09:45

yujun777 force-pushed the fix/count-null-is-not-count-star branch from ca09b74 to aed0d9c Compare April 16, 2026 09:45

github-actions bot removed the approved Indicates a PR has been approved by one committer. label Apr 16, 2026

github-actions bot requested changes Apr 16, 2026

View reviewed changes

Conversation

yujun777 commented Apr 16, 2026

What problem does this PR solve?

Release note

Check List (For Author)

Uh oh!

Thearas commented Apr 16, 2026

Uh oh!

yujun777 commented Apr 16, 2026

Uh oh!

yujun777 commented Apr 16, 2026

Uh oh!

github-actions bot commented Apr 16, 2026

Uh oh!

github-actions bot commented Apr 16, 2026

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

yujun777 Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

yujun777 Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

yujun777 Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

yujun777 commented Apr 16, 2026

Uh oh!

yujun777 commented Apr 16, 2026

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

yujun777 commented Apr 16, 2026

Uh oh!

yujun777 commented Apr 16, 2026

Uh oh!

hello-stephen commented Apr 16, 2026

FE Regression Coverage Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants