Skip to content

Fix PushdownSort dropping LIMIT when eliminating SortExec#21744

Merged
alamb merged 3 commits intoapache:mainfrom
spiceai:sgrebnov/0420-topK-ensure-limit
Apr 25, 2026
Merged

Fix PushdownSort dropping LIMIT when eliminating SortExec#21744
alamb merged 3 commits intoapache:mainfrom
spiceai:sgrebnov/0420-topK-ensure-limit

Conversation

@sgrebnov
Copy link
Copy Markdown
Member

Which issue does this PR close?

When PushdownSort removes a SortExec because a source returns Exact (guaranteeing ordering), any fetch (LIMIT) on the SortExec is silently dropped if the underlying plan does not support with_fetch().

For example, ProjectionExec supports try_pushdown_sort (delegating to its child) but does not implement with_fetch(). A plan like SortExec(fetch=10) → ProjectionExec → source that gets sort-eliminated loses the limit.

What changes are included in this PR?

In the Exact branch of PushdownSort, when the eliminated SortExec carried a fetch:

  1. Try with_fetch() on the pushed-down source first
  2. If with_fetch() returns None, fall back to wrapping with GlobalLimitExec

Are these changes tested?

Yes. Three new unit tests:

  • test_sort_pushdown_exact_no_fetch_no_limit — Exact elimination without fetch: no limit wrapper added
  • test_sort_pushdown_exact_preserves_fetch_with_global_limit — Exact elimination with fetch, source does NOT support with_fetch(): GlobalLimitExec wrapper added
  • test_sort_pushdown_exact_preserves_fetch_with_source_support — Exact elimination with fetch, source supports with_fetch(): limit pushed into source directly

Are there any user-facing changes?

No.

Copy link
Copy Markdown
Member

@xudong963 xudong963 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, would be better if there are some slt tests

Also, do you want to include it into new minor release?

Comment thread datafusion/physical-optimizer/src/pushdown_sort.rs
Comment thread datafusion/physical-optimizer/src/pushdown_sort.rs Outdated
@alamb alamb added performance Make DataFusion faster and removed performance Make DataFusion faster labels Apr 22, 2026
Copy link
Copy Markdown
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sgrebnov and @martin-g and @xudong963

Comment thread datafusion/physical-optimizer/src/pushdown_sort.rs
@sgrebnov
Copy link
Copy Markdown
Member Author

LGTM, would be better if there are some slt tests

@xudong963 — There's an existing SLT test (sort_pushdown.slt Test 1.3) that already covers the Exact pushdown + LIMIT case where the source supports with_fetch(). The specific fallback path this PR fixes (GlobalLimitExec/LocalLimitExec wrapping when the source doesn't support with_fetch()) can't be triggered through SLT because all built-in data sources support with_fetch(). Adding a custom test-only table provider to the SLT harness just for this seems like overkill, so I'd propose keeping the unit tests with the configurable TestScan. WDYT?

@alamb alamb added this pull request to the merge queue Apr 25, 2026
@alamb
Copy link
Copy Markdown
Contributor

alamb commented Apr 25, 2026

Thanks everyone. If it turns out we can make some slt tests perhaps we can add them as a follow on PR

Merged via the queue into apache:main with commit 6f1040b Apr 25, 2026
35 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate optimizer Optimizer rules

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants