Background
In #21580, DynamicFilterPhysicalExpr was given a fetch: Option<usize> field that stores the TopK K value. This value is copied from SortExec.fetch at filter creation time in SortExec::create_filter().
Problem
If a future optimizer rule calls SortExec::with_fetch() on its own (without recreating the filter), DynamicFilterPhysicalExpr.fetch would go stale: the filter would keep the old K while the plan node reports the new one. Currently this doesn't happen because the filter is created and fetch is set in the same method, but the coupling is fragile.
Raised by @xudong963 in #21580 (comment)
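To make the failure mode concrete, here is a minimal sketch using simplified stand-in types. `Sort`, `DynamicFilter`, and `with_fetch_keeping_filter` are hypothetical names for illustration only; they are not the real DataFusion types or signatures, and the real `with_fetch` does not currently behave this way.

```rust
// Hypothetical, simplified stand-ins; NOT the real DataFusion types.
#[derive(Debug, Clone)]
struct DynamicFilter {
    // Snapshot of K, copied once when the filter is created.
    fetch: Option<usize>,
}

#[derive(Debug, Clone)]
struct Sort {
    fetch: Option<usize>,
    filter: Option<DynamicFilter>,
}

impl Sort {
    fn new(fetch: Option<usize>) -> Self {
        let mut sort = Sort { fetch, filter: None };
        sort.filter = Some(sort.create_filter());
        sort
    }

    // Copies the current fetch value into the filter.
    fn create_filter(&self) -> DynamicFilter {
        DynamicFilter { fetch: self.fetch }
    }

    // Hypothetical variant of `with_fetch` that updates only the plan node
    // and reuses the existing filter: the scenario the issue warns about.
    fn with_fetch_keeping_filter(&self, fetch: Option<usize>) -> Self {
        Sort { fetch, filter: self.filter.clone() }
    }
}

// Returns (fetch on the plan node, fetch seen by the filter).
fn stale_fetch_demo() -> (Option<usize>, Option<usize>) {
    let sort = Sort::new(Some(10)).with_fetch_keeping_filter(Some(5));
    let filter_fetch = sort.filter.as_ref().and_then(|f| f.fetch);
    (sort.fetch, filter_fetch)
}
```

In this sketch the plan node ends up with a fetch of 5 while the filter still holds 10, so anything that reads K from the filter would work with the wrong limit.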
Proposed fix
Replace fetch: Option<usize> with a shared reference (e.g., Arc<AtomicUsize>) that reads directly from SortExec.fetch. This way any update to fetch is automatically visible to the parquet reader's stats init and cumulative pruning logic.
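A rough sketch of what the shared value could look like, again with hypothetical stand-in types (`SharedFetch`, `SharedFilter`, `SharedSort`) rather than the real DataFusion structs. Since `AtomicUsize` cannot hold an `Option` directly, this sketch assumes `usize::MAX` is reserved as a "no fetch" sentinel; that encoding is an assumption of the example, not something decided in the issue.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Assumed sentinel meaning "no fetch limit"; a real design may encode None differently.
const NO_FETCH: usize = usize::MAX;

// Hypothetical shared cell holding the TopK K value.
#[derive(Debug, Clone)]
struct SharedFetch(Arc<AtomicUsize>);

impl SharedFetch {
    fn new(fetch: Option<usize>) -> Self {
        SharedFetch(Arc::new(AtomicUsize::new(fetch.unwrap_or(NO_FETCH))))
    }

    fn set(&self, fetch: Option<usize>) {
        self.0.store(fetch.unwrap_or(NO_FETCH), Ordering::Release);
    }

    fn get(&self) -> Option<usize> {
        match self.0.load(Ordering::Acquire) {
            NO_FETCH => None,
            k => Some(k),
        }
    }
}

// Hypothetical stand-in for the dynamic filter: it holds a handle, not a copy.
#[derive(Debug, Clone)]
struct SharedFilter {
    fetch: SharedFetch,
}

// Hypothetical stand-in for the sort node.
#[derive(Debug, Clone)]
struct SharedSort {
    fetch: SharedFetch,
    filter: SharedFilter,
}

impl SharedSort {
    fn new(fetch: Option<usize>) -> Self {
        let fetch = SharedFetch::new(fetch);
        // The filter clones the Arc handle, so both sides read the same cell.
        let filter = SharedFilter { fetch: fetch.clone() };
        SharedSort { fetch, filter }
    }

    // Updates K in place; the existing filter observes the new value.
    fn set_fetch(&self, fetch: Option<usize>) {
        self.fetch.set(fetch);
    }
}

// Returns (fetch on the plan node, fetch seen by the filter).
fn shared_fetch_demo() -> (Option<usize>, Option<usize>) {
    let sort = SharedSort::new(Some(10));
    sort.set_fetch(Some(5));
    (sort.fetch.get(), sort.filter.fetch.get())
}
```

One design point to weigh: because the cell is shared, any other plan node that cloned the same handle would also see the update, which may or may not be desirable if `with_fetch()` is expected to return an independent plan.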
Related
fetch on DynamicFilterPhysicalExpr