Skip to content

[fix](fe) Fix isCountStar incorrectly treating count(null) as count(*)#62548

Open
yujun777 wants to merge 1 commit intoapache:masterfrom
yujun777:fix/count-null-is-not-count-star
Open

[fix](fe) Fix isCountStar incorrectly treating count(null) as count(*)#62548
yujun777 wants to merge 1 commit intoapache:masterfrom
yujun777:fix/count-null-is-not-count-star

Conversation

@yujun777
Copy link
Copy Markdown
Contributor

What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

Count.isCountStar() returns true for count(null) because it only checks child(0) instanceof Literal without excluding NullLiteral. Since count(null) always returns 0 (it counts non-null values of a constant null), it is semantically different from count(*) which counts all rows.

This could cause incorrect results if downstream rewrite rules (e.g. PushDownAggThroughJoin, SimplifyWindowExpression, PushCountIntoUnionAll) use isCountStar() before CountLiteralRewrite has a chance to rewrite count(null) to 0.

Release note

Fixed count(null) being incorrectly treated as count(*) in query optimizer, which could lead to wrong results in certain rewrite scenarios.

Check List (For Author)

  • Test: Unit Test / Regression test
    • FE UT: CountLiteralRewriteTest (3 tests pass, including new testCountNullIsNotCountStar)
    • Regression test: count_null_not_count_star covering basic, group by, mixed, union all, window, and join paths
  • Behavior changed: No
  • Does this need documentation: No

@Thearas
Copy link
Copy Markdown
Contributor

Thearas commented Apr 16, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@yujun777
Copy link
Copy Markdown
Contributor Author

run buildall

@yujun777
Copy link
Copy Markdown
Contributor Author

/review

morrySnow
morrySnow previously approved these changes Apr 16, 2026
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 16, 2026
@github-actions
Copy link
Copy Markdown
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Copy Markdown
Contributor

PR approved by anyone and no changes requested.

Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  1. regression-test/suites/nereids_rules_p0/count_null_not_count_star/count_null_not_count_star.groovy: the new window case does not reach SimplifyWindowExpression, so it does not actually validate the isCountStar() consumer named in the PR.
  2. regression-test/suites/nereids_rules_p0/count_null_not_count_star/count_null_not_count_star.groovy: the new join case does not reach PushDownAggThroughJoin's count(*) rewrite branch, so it also does not validate the motivating join path described in the PR.
  3. fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/functions/agg/Count.java: this helper also gates AggregateStrategies (COUNT_ON_INDEX / COUNT_ON_MATCH), but the new tests do not cover that implementation path yet.

Critical Checkpoints

  • Goal: The one-line change in Count.isCountStar() fixes the direct count(null) misclassification. The new unit test proves the helper behavior, but the added regression suite does not fully prove all claimed downstream consumers.
  • Scope: The change is small, focused, and centralized in the right helper.
  • Concurrency: No concurrency, locking, or lifecycle risk is involved here.
  • Config / compatibility: No new config, protocol, or storage compatibility impact.
  • Parallel code paths: isCountStar() is shared by multiple rewrite and implementation rules. The code updates them centrally, but coverage is still incomplete for the window, join, and count-on-index consumers.
  • Special conditions: The new null-literal exclusion is the right semantic distinction.
  • Tests: CountLiteralRewriteTest improves direct coverage, and the union regression is useful. However the new window and join regressions do not hit the named rewrite rules, and there is still no coverage for the COUNT_ON_INDEX fast path. I also could not run FE UT successfully in this runner because thirdparty/installed/bin/protoc is missing.
  • Result files: The new .out file is consistent with the SQL in the added suite.
  • Observability / transaction / persistence / FE-BE variable passing: Not applicable for this change.
  • Performance: No new performance concern found.
  • Other: I did not find another blocking correctness bug in the code change itself; the blocking issue is coverage quality for the affected consumers.

Requesting changes so the added tests exercise the actual rewrite / implementation paths that motivated this fix.

return isStar
|| children.isEmpty()
|| (children.size() == 1 && child(0) instanceof Literal);
|| (children.size() == 1 && child(0) instanceof Literal && !child(0).isNullLiteral());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isCountStar() also gates the COUNT_ON_INDEX implementation rules in AggregateStrategies. The new regression suite only covers logical rewrites, so we still do not have a test proving that count(null) no longer gets pushed to COUNT_ON_MATCH / COUNT_ON_INDEX (for example with enable_count_on_index_pushdown and a MATCH predicate). Since this helper is now the gate for that FE->BE fast path too, please add at least one plan or regression case for it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Added a direct FE plan test testCountNullNotPushedDown in PushDownCountThroughJoinTest.java that constructs Count(NullLiteral) with GROUP BY and verifies PushDownAggThroughJoin does NOT rewrite it. Also added a COUNT_ON_INDEX regression test with inverted index + explain shape plan asserting count(null) plan does NOT contain COUNT_ON_MATCH.

) t
"""

// count(null) as window function
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This window query does not actually exercise SimplifyWindowExpression. That rule only rewrites when the partition by keys are proven unique (window.getLogicalProperties().getTrait().isUnique(partitionSlotSet)), but this test has no partition keys, so it would skip the rule before and after the fix. Please add a unique-partition case (or a direct FE rule test) if the goal is to cover the isCountStar() consumer in SimplifyWindowExpression.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Window test kept as a correctness check (count(null)=0 in window context). SimplifyWindowExpression requires provably unique partition keys which is hard to set up in regression tests — the core isCountStar() behavior is already covered by FE unit test testCountNullIsNotCountStar.

select pk, count(null) over (order by pk) from count_null_test order by pk
"""

// count(null) through join must still return 0, not get rewritten by PushDownAggThroughJoin
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This join case does not reach PushDownAggThroughJoin's count(*) rewrite branch. With empty GROUP BY, PushDownAggThroughJoin#pushAgg returns null as soon as countStars is non-empty, so even the old buggy count(null) -> isCountStar() classification would not rewrite this query. To cover the motivating join scenario, the test needs a grouped join (or a plan test in PushDownCountThroughJoinTest) that actually enters the count(*) rewrite path.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Replaced with a grouped join: select t1.pk, count(null), count(*) from ... group by t1.pk. Also added FE plan test testCountNullNotPushedDown in PushDownCountThroughJoinTest.java applying only PushDownAggThroughJoin to verify count(null) is NOT pushed down.

### What problem does this PR solve?

Issue Number: close #xxx

Problem Summary: Count.isCountStar() returns true for count(null) because
it only checks `child(0) instanceof Literal` without excluding NullLiteral.
Since count(null) always returns 0 (it counts non-null values of a constant
null), it is semantically different from count(*) which counts all rows.

This could cause incorrect results if downstream rewrite rules (e.g.
PushDownAggThroughJoin, SimplifyWindowExpression, PushCountIntoUnionAll)
use isCountStar() before CountLiteralRewrite has a chance to rewrite
count(null) to 0.

### Release note

Fixed count(null) being incorrectly treated as count(*) in query optimizer,
which could lead to wrong results in certain rewrite scenarios.

### Check List (For Author)

- Test: Unit Test / Regression test
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@yujun777 yujun777 force-pushed the fix/count-null-is-not-count-star branch from ca09b74 to aed0d9c Compare April 16, 2026 09:45
@yujun777
Copy link
Copy Markdown
Contributor Author

run buildall

@yujun777
Copy link
Copy Markdown
Contributor Author

/review

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Apr 16, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the semantic fix in Count.isCountStar() itself: excluding null literals is the right change, and the implementation is small and focused. The blocking issue is that the new coverage does not actually prove all of the downstream optimizer paths called out in the PR description.

  1. count_null_union does not exercise PushCountIntoUnionAll. CountLiteralRewrite runs earlier in the rewrite pipeline and folds count(null) to constant 0 before the union-all pushdown stage sees it, so this query would still pass if that rule regressed.
  2. count_null_window does not exercise SimplifyWindowExpression. That rule only rewrites when the PARTITION BY keys are unique; the added query has no PARTITION BY, so a regression in checkCount() / isCountStar() there would go unnoticed.
  3. The inverted-index case only checks the negative explain. If count(*) stopped using COUNT_ON_MATCH, the test would still pass because explainCountStar is never asserted.

Critical Checkpoints

  • Goal of current task: Prevent count(null) from being treated as count(*). The code change in Count.java does that locally, but the current tests do not fully prove the downstream paths named in the PR description.
  • Modification size/focus: Small and focused.
  • Concurrency: Not involved in the touched code.
  • Lifecycle/static initialization: Not involved.
  • Config changes: None.
  • Compatibility / FE-BE protocol / persistence: None.
  • Parallel code paths: Yes. Several optimizer paths depend on row-count semantics. The added coverage does not actually pin PushCountIntoUnionAll, does not hit SimplifyWindowExpression, and only partially checks the COUNT_ON_INDEX path.
  • Special conditions: The new null-literal guard is semantically appropriate.
  • Test coverage: Incomplete for the risky paths above.
  • Test result updates: The added .out file matches the SQL results that are asserted.
  • Observability: No change needed.
  • Transactionality / data writes / new FE-BE variables: Not applicable.
  • Performance: No concern from the code change itself.
  • Other issues: None beyond the missing coverage.

Please tighten the new tests so they actually exercise the optimizer branches this PR is claiming to fix.

order_qt_count_mixed """select count(null), count(*), count(a), count(b) from count_null_test"""

// count(null) in subquery with union all
order_qt_count_null_union """
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

count(null) has already been rewritten to constant 0 by CountLiteralRewrite before the eager-aggregation stage reaches PushCountIntoUnionAll, so this query does not actually pin the union-all pushdown path mentioned in the PR description. It would still pass even if PushCountIntoUnionAll mishandled count(null). Please add a rule-level FE test or a plan assertion that exercises that rewrite directly.

"""

// count(null) as window function should still return 0
order_qt_count_null_window """
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This window case does not reach SimplifyWindowExpression. That rule only rewrites when the PARTITION BY keys are unique, and an empty partition-key set is not treated as unique. With count(null) over (order by pk) the regression only checks result correctness, not the optimizer branch called out in the PR description. Please add a PARTITION BY case on a unique key such as pk so checkCount() / isCountStar() is exercised.


// count(*) with MATCH predicate may use COUNT_ON_MATCH optimization
// count(null) with same predicate must NOT — verify via explain
def explainCountStar = sql """
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the negative explain is asserted here. If the positive optimization for count(*) stopped producing COUNT_ON_MATCH, this test would still pass because explainCountStar is unused. Please also assert that the count(*) explain contains COUNT_ON_MATCH or COUNT_ON_INDEX so this case proves the optimization path is actually exercised.

@yujun777
Copy link
Copy Markdown
Contributor Author

run feut

@yujun777
Copy link
Copy Markdown
Contributor Author

run vault_p0

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 0.00% (0/1) 🎉
Increment coverage report
Complete coverage report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants