Skip to content

fix(aggregate): show aliased expr in explain#21739

Open
kumarUjjawal wants to merge 3 commits intoapache:mainfrom
kumarUjjawal:fix/aliased_expr_explain
Open

fix(aggregate): show aliased expr in explain#21739
kumarUjjawal wants to merge 3 commits intoapache:mainfrom
kumarUjjawal:fix/aliased_expr_explain

Conversation

@kumarUjjawal
Copy link
Copy Markdown
Contributor

@kumarUjjawal kumarUjjawal commented Apr 20, 2026

Which issue does this PR close?

Rationale for this change

Physical explain output only showed the alias for aliased aggregates. That made it hard to understand the plan, especially when the aggregate had a filter, explicit RESPECT NULLS, or a custom UDAF display.

What changes are included in this PR?

  • Show the full aggregate expression in physical explain for user-written aggregate aliases.
  • Keep internal aliases like count(*) compact in physical explain.
  • Replace the old hidden metadata approach with an explicit is_internal flag on Alias.
  • Preserve that flag through planner rewrites, tree rewrites, and proto round-trip.
  • Add tests for aliased aggregate explain output, including:
    • normal aliased aggregates
    • quoted aliases
    • explicit RESPECT NULLS
    • custom human display
    • count(*)
    • nested internal alias display
  • Add an upgrade note for the public Alias API change.

Are these changes tested?

Yes

Are there any user-facing changes?

Yes.

  • Physical explain output is clearer for aliased aggregate expressions.
  • Alias now has a new is_internal field.

This is a public API change for users who build or pattern match Alias directly. The upgrade guide has been updated with the needed changes.

@github-actions github-actions bot added physical-expr Changes to the physical-expr crates core Core DataFusion crate physical-plan Changes to the physical-plan crate labels Apr 20, 2026
@kumarUjjawal
Copy link
Copy Markdown
Contributor Author

cc @pepijnve

@kumarUjjawal kumarUjjawal marked this pull request as draft April 20, 2026 07:56
@kumarUjjawal
Copy link
Copy Markdown
Contributor Author

It has much more cases than I realized initially 😢

@kumarUjjawal
Copy link
Copy Markdown
Contributor Author

kumarUjjawal commented Apr 20, 2026

I am working on another approach which is looking better than this

I updated the code with new changes along with pr body

@github-actions github-actions bot added documentation Improvements or additions to documentation sql SQL Planner logical-expr Logical plan and expressions optimizer Optimizer rules proto Related to proto crate functions Changes to functions implementation labels Apr 20, 2026
…plain

# Conflicts:
#	datafusion/optimizer/src/decorrelate_lateral_join.rs
@kumarUjjawal kumarUjjawal marked this pull request as ready for review April 20, 2026 12:34
@kumarUjjawal
Copy link
Copy Markdown
Contributor Author

I tried two approaches for this.

The first approach was by hiding the bit in alias metadata. It fixed the display problem, but it also mixed planner-only state with user metadata. That made the behavior harder to reason about and forced extra handling in places like equality, hashing, serialization, and rewrite logic. In practice, a simple display fix startedaffecting alias identity and metadata flow in too many places.

The current approach adds an explicit internal flag to alias. This is a small API break, but it makes the model much clearer: whether an alias is user written or planner generated is now part of the type itself, not hidden in metadata. That keeps the display logic direct, avoids hidden state leaking into unrelated code paths, and makes future maintenance safer because the intent is visible and compiler checked.

Would love to hear your thoughts @pepijnve

@pepijnve
Copy link
Copy Markdown
Contributor

@kumarUjjawal could you give an example of where/when the internal flag is necessary?

@pepijnve
Copy link
Copy Markdown
Contributor

I think I might have found it in your second commit. Where this change was reverted.

image

Wouldn't we want the left version though? The physical plan is actually executing count(1), but you don't see that at all in the physical plan even though the logical plan does show it.

.map(|expr| expr.human_display())
.map(|expr| {
let human_display = expr.human_display();
if human_display.is_empty() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should human_display be Option<String>?

}

#[tokio::test]
async fn test_aggregate_explain_shows_aliased_expression() -> Result<()> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These might be more concise as SLTs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate documentation Improvements or additions to documentation functions Changes to functions implementation logical-expr Logical plan and expressions optimizer Optimizer rules physical-expr Changes to the physical-expr crates physical-plan Changes to the physical-plan crate proto Related to proto crate sql SQL Planner

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Aliased aggregation expressions not visible in physical explain output

2 participants