Skip to content

feat: persist __time min/max in V10 ProjectionMetadata#19398

Open
clintropolis wants to merge 2 commits intoapache:masterfrom
clintropolis:v10-segment-save-min-max
Open

feat: persist __time min/max in V10 ProjectionMetadata#19398
clintropolis wants to merge 2 commits intoapache:masterfrom
clintropolis:v10-segment-save-min-max

Conversation

@clintropolis
Copy link
Copy Markdown
Member

Description

changes:

  • add nullable minTime/maxTime Long fields to ProjectionMetadata alongside the existing numRows
  • track min/max __time in IndexMergerBase.mergeIndexesAndWriteColumns via timestampSelector.getLong() on each row; surface via IndexMergeResult (null when zero rows are walked)
  • IndexMergerV10 wires the merge result's values into the base-table ProjectionMetadata; aggregate projections leave the new fields null for now, though we can easily add this in future work if useful
  • tracking is done regardless of segment sort order, non-time-sorted segments (DimensionsSpec.forceSegmentSortByTime = false) also store these values accurately (just with less utility since we can't use the information to skip rows, though we could use them to know if there are no rows within the time range)
  • write-side only: fields are persisted but not yet consumed. A follow-up will add a partial V10-aware TimeBoundaryInspector that reads them

changes:
* add nullable `minTime`/`maxTime` Long fields to `ProjectionMetadata` alongside the existing `numRows`
* track min/max __time in `IndexMergerBase.mergeIndexesAndWriteColumns` via `timestampSelector.getLong()` on each row; surface via `IndexMergeResult` (null when zero rows are walked)
* `IndexMergerV10` wires the merge result's values into the base-table `ProjectionMetadata`; aggregate projections leave the new fields null for now, though we can easily add this in future work if useful
* tracking is done regardless of segment sort order, non-time-sorted segments (`DimensionsSpec.forceSegmentSortByTime` = false) also store these values accurately (just with less utility since we can't use the information to skip rows, though we could use them to know if there are no rows within the time range)
* write-side only: fields are persisted but not yet consumed. A follow-up will add a partial V10-aware `TimeBoundaryInspector` that reads them
Copy link
Copy Markdown
Member

@FrankChen021 FrankChen021 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed the code for correctness, edge cases, concurrency, and integration risks; no issues found.


This is an automated review by Codex GPT-5

Copy link
Copy Markdown
Member

@FrankChen021 FrankChen021 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed the code for correctness, edge cases, concurrency, and integration risks; no issues found.


This is an automated review by Codex GPT-5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants