# Add metric category filtering for EXPLAIN ANALYZE #21160

2010YOUY01 merged 8 commits into apache:main
Conversation
```rust
    ///
    /// **Non-deterministic** — varies across runs even on the same hardware.
    Timing,
}
```
Perhaps we can add another variant for all uncategorized metrics, so we can do:

```sql
set datafusion.explain.analyze_categories = 'rows, bytes, uncategorized' -- Only exclude the `Timing` metrics category
```

Introduces `MetricCategory` (Rows, Bytes, Timing) so that EXPLAIN ANALYZE output can be narrowed to only deterministic metrics, which is especially useful in sqllogictest (.slt) files, where timing values would otherwise require `<slt:ignore>` markers everywhere. Each `Metric` now optionally declares a category via `MetricBuilder::with_category()`. Well-known builder methods (`output_rows`, `elapsed_compute`, …) set the category automatically; custom counters/gauges default to "always included". A new session config `datafusion.explain.analyze_categories` accepts `all` (default), `none`, or a comma-separated list of `rows`, `bytes`, `timing` to control which categories appear.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use a parquet table with multiple row groups and a TopK ORDER BY LIMIT query that triggers DynamicFilter pushdown. This makes the slt examples much more realistic: they show pruning metrics, row-group statistics, and the resolved DynamicFilter predicate. Add a 'timing' category example that shows only elapsed_compute and metadata_load_time (with `<slt:ignore>`, since they are non-deterministic).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Opt in every custom counter/gauge created by DataFusion's own operators (parquet, file_stream, joins, aggregates, topk, unnest, buffer) so that category filtering works cleanly out of the box. For example `bytes_scanned` → Bytes, `pushdown_rows_pruned` → Rows, `peak_mem_used` → Bytes, `row_replacements` → Rows, etc.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
```diff
 #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
-pub enum ExplainAnalyzeLevel {
-    /// Show a compact view containing high-level metrics
+pub enum MetricType {
```
@2010YOUY01 this is now a bigger diff because I refactored MetricType into here. It removes the redundancy with ExplainAnalyzeLevel (they were the same enum). I really wanted to have MetricCategory in this module (or inside datafusion-common, so that it's available to this module) so that we can parse the strings directly into the enum variants and avoid carrying around `Vec<String>`s. I think this also applies to MetricType. Now both code paths are parallel, there is less duplicated code, and validation happens earlier.

One thing to confirm is that we are okay with having both MetricType and MetricCategory. They sound somewhat redundant, but I think they are actually orthogonal, unless we can say something like "all timing metrics are Dev", in which case one is a superset of the other. I'd like your input here, since you added MetricType.
I agree the new category tags are necessary
2010YOUY01 left a comment:
Thanks, this feature LGTM overall. I have one suggestion in the above comment: consider allowing a metric to have multiple associated categories. Curious to hear your thoughts.
```rust
    ///
    /// See [`MetricCategory`] for details on the determinism properties
    /// of each category.
    pub fn with_category(mut self, category: MetricCategory) -> Self {
```

I'm wondering if some metrics may need multiple categories.
Currently Timing/Bytes/Rows classify metrics by output type, but we could also group them by functional aspects like Pruning or Spilling. That would enable more flexible filtering, e.g.:

```sql
set datafusion.explain.analyze_categories = ['Timing', 'Spilling']
```

to show metrics matching both tags.

If this direction is plausible, should we consider storing multiple categories (e.g. `Vec<MetricCategory>`) instead, to be more flexible for future changes?
Could we add that as a followup feature? It wouldn't be breaking I think.
That makes sense, if it won't need breaking changes.
## Which issue does this PR close?

- Closes #.

## Rationale for this change

Due to a conflict between the parallel commits in #21211 and #21160, CI is failing; this PR fixes the issue.

## What changes are included in this PR?

## Are these changes tested?

## Are there any user-facing changes?
## Which issue does this PR close?

- Closes #.

## Rationale for this change

[#21160](#21160) added `datafusion.explain.analyze_categories`, which lets `EXPLAIN ANALYZE` emit only deterministic metric categories (e.g. `'rows'`). That unlocked a long-standing blocker on porting tests out of `datafusion/core/tests/physical_optimizer/filter_pushdown.rs`: previously these tests had to assert on execution state via `insta` snapshots over hand-wired `ExecutionPlan` trees and mock `TestSource` data, which kept them expensive to read, expensive to update, and impossible to test from the user-facing SQL path.

With `analyze_categories = 'rows'`, the `predicate=DynamicFilter [ ... ]` text on a parquet scan is stable across runs, so the same invariants can now be expressed as plain `EXPLAIN ANALYZE` SQL in sqllogictest, where they are easier to read, easier to update, and exercise the full SQL → logical optimizer → physical optimizer → execution pipeline rather than a single optimizer rule in isolation.

## What changes are included in this PR?

24 end-to-end filter-pushdown tests are ported out of `filter_pushdown.rs` and deleted. The helpers `run_aggregate_dyn_filter_case` and `run_projection_dyn_filter_case` (and their supporting structs) are deleted along with the tests that used them. The 24 synchronous `#[test]` optimizer-rule-in-isolation tests are untouched; they stay in Rust because they specifically exercise `FilterPushdown::new()` / `OptimizationTest` over a hand-built plan.

### `datafusion/sqllogictest/test_files/push_down_filter_parquet.slt`

New tests covering:

- TopK dynamic filter pushdown integration (100k-row parquet, `max_row_group_size = 128`, asserting on `pushdown_rows_matched = 128` / `pushdown_rows_pruned = 99.87 K`)
- TopK single-column and multi-column (compound-sort) dynamic filter shapes
- HashJoin CollectLeft dynamic filter with `struct(a, b) IN (SET) ([...])` content
- Nested hash joins propagating filters to both inner scans
- Parent `WHERE` filter splitting across the two sides of a HashJoin
- TopK above HashJoin, with both dynamic filters ANDed on the probe scan
- Dynamic filter flowing through a `GROUP BY` sitting between a HashJoin and the probe scan
- TopK projection rewrite: reorder, prune, expression, alias shadowing
- NULL-bearing build-side join keys
- `LEFT JOIN` and `LEFT SEMI JOIN` dynamic filter pushdown
- HashTable strategy (`hash_lookup`) via `hash_join_inlist_pushdown_max_size = 1`, on both string and integer multi-column keys

### `datafusion/sqllogictest/test_files/push_down_filter_regression.slt`

New tests covering:

- Aggregate dynamic filter baseline: `MIN(a)`, `MAX(a)`, `MIN(a), MAX(a)`, `MIN(a), MAX(b)`, mixed `MIN/MAX` with an unsupported expression input, all-NULL input (filter stays `true`), `MIN(a+1)` (no filter emitted)
- `WHERE` filter on a grouping column pushes through `AggregateExec`
- `HAVING count(b) > 5` filter stays above the aggregate
- End-to-end aggregate dynamic filter actually pruning a multi-file parquet scan

The aggregate baseline tests run under `analyze_level = summary` + `analyze_categories = 'none'` so that metrics render empty and only the `predicate=DynamicFilter [ ... ]` content remains; the filter text is deterministic even though the pruning counts are subject to parallel-execution scheduling.

### What stayed in Rust

Ten async tests now carry a short `// Not portable to sqllogictest: …` header explaining why. In short, they either:

- Hand-wire `PartitionMode::Partitioned` or a `RepartitionExec` boundary that SQL never constructs for the sizes of data these tests use
- Assert via debug-only APIs (`HashJoinExec::dynamic_filter_for_test().is_used()`, `ExecutionPlan::apply_expressions()` + `downcast_ref::<DynamicFilterPhysicalExpr>`) that are not observable from SQL
- Target the specific stacked-`FilterExec` shape (#20109 regression) that the logical optimizer collapses before physical planning

## Are these changes tested?

Yes; the ported tests _are_ the tests. Each ported slt case was generated with `cargo test -p datafusion-sqllogictest --test sqllogictests -- <file> --complete`, then re-run twice back-to-back without `--complete` to confirm determinism. The remaining Rust `filter_pushdown` tests continue to pass (`cargo test -p datafusion --test core_integration filter_pushdown` → 47 passed, 0 failed). `cargo clippy --tests -D warnings` and `cargo fmt --all` are clean.

## Test plan

- [x] `cargo test -p datafusion-sqllogictest --test sqllogictests -- push_down_filter`
- [x] `cargo test -p datafusion --test core_integration filter_pushdown`
- [x] `cargo clippy -p datafusion --tests -- -D warnings`
- [x] `cargo fmt --all`

## Are there any user-facing changes?

No. This is a test-only refactor.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Summary

- Adds `MetricCategory` enum (`Rows`, `Bytes`, `Timing`) classifying metrics by what they measure and, critically, their **determinism**: rows/bytes are deterministic given the same plan+data; timing varies across runs.
- Each `Metric` can now declare its category via `MetricBuilder::with_category()`. Well-known builder methods (`output_rows`, `elapsed_compute`, `output_bytes`, etc.) set the category automatically. Custom counters/gauges default to "always included".
- New session config `datafusion.explain.analyze_categories` accepts `all` (default), `none`, or comma-separated `rows`, `bytes`, `timing`.
- This is orthogonal to the existing `analyze_level` (summary/dev), which controls verbosity.

## Motivation

Running `EXPLAIN ANALYZE` in `.slt` tests currently requires liberal use of `<slt:ignore>` for every non-deterministic timing metric. With this change, a test can simply:

```sql
SET datafusion.explain.analyze_categories = 'rows';
EXPLAIN ANALYZE SELECT ...;
-- output contains only row-count metrics: fully deterministic, no <slt:ignore> needed
```

In particular, for dynamic filters we have relatively complex integration tests that exist mostly to assert the plan shapes and the state of the dynamic filters after the plan has been executed, for example apache#21059. With this change I think most of those can be moved to SLT tests. I've also wanted to e.g. make assertions about pruning effectiveness without having timing information included.

## Test plan

- [x] New Rust integration test `explain_analyze_categories` covering all combos (rows, none, all, rows+bytes)
- [x] New `.slt` tests in `explain_analyze.slt` for `rows`, `none`, `rows,bytes`, and `rows` with dev level
- [x] Existing `explain_analyze` integration tests pass (24/24)
- [x] Proto roundtrip test updated and passing
- [x] `information_schema` slt updated for new config entry
- [x] Full `core_integration` suite passes (918 tests)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extends DataFusion's `EXPLAIN` to accept a Postgres-style parenthesized option list alongside the existing keyword form, on dialects that enable it (the default `GenericDialect`, `PostgreSqlDialect`, `DuckDbDialect`, etc.). This surfaces the metric-category and verbosity knobs introduced in PR apache#21160 (currently only reachable via `SET`) directly in the statement, matching Postgres's one-liner ergonomics:

```sql
EXPLAIN (ANALYZE, VERBOSE, METRICS 'rows,bytes', LEVEL dev) SELECT ...
```

Options recognized: `ANALYZE`, `VERBOSE`, `FORMAT`, `METRICS`, `LEVEL`, `TIMING`, `SUMMARY`, `COSTS`. Statement-level values override the corresponding session config. Postgres-only options that DataFusion does not model (`BUFFERS`, `WAL`, `SETTINGS`, `GENERIC_PLAN`, `MEMORY`) return a clear unsupported-option error rather than silently accepting them. The legacy keyword form (`EXPLAIN ANALYZE VERBOSE FORMAT tree ...`) is unchanged.

The parser delegates to sqlparser's `parse_utility_options()` under the dialect gate; a new `ExplainStatementOptions` struct in `datafusion-common` normalizes both forms into a single representation that flows through `explain_to_plan` into the `Analyze` / `Explain` logical plan nodes. `handle_analyze` / `handle_explain` in the physical planner prefer statement-level overrides over session config before constructing `AnalyzeExec` / `ExplainExec`. Proto serialization of the new fields is left as a follow-up (TODO comments in `datafusion/proto/src/logical_plan/mod.rs`); fields default to `None` on the other side, matching prior behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>