feat: Wire format and data model for partial-projection load rules #19409
clintropolis wants to merge 2 commits into apache:master
changes:
* add `PartialProjectionLoadSpec`, a `LoadSpec` wrapper for partial-projection loads with lazy delegate materialization
* add `PartialLoadSpecModule` and register it in `CoreInjectorBuilder` so historicals always have the wrapper
* add `supportsPartialLoad` capability flag to `SegmentLoadingCapabilities`; historicals advertise true, coordinator defaults to false for unknown servers
* add optional `fingerprint` and `loadedBytes` wire fields to `SegmentChangeRequestLoad` for historical partial-load announcements (NON_NULL include)
* add `PartialLoadProfile` generic data model (wrappedLoadSpec, fingerprint, loadedBytes) with `forRequest` / `forLoaded` / `forFullFallback` factories
* add `SegmentActionHandler.replicateSegmentPartially` default throwing `UnsupportedOperationException`
* wire `PartialLoadRule.run()` to route through `replicateSegmentPartially` when the matcher resolves; fall back to `replicateSegment` for FULL_LOAD-on-cannot-match
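The capability default in the list above (historicals advertise true, unknown servers are treated as false) can be illustrated with a small sketch. This is not Druid code: `CapabilityRegistrySketch` and its methods are hypothetical names, and the real `SegmentLoadingCapabilities` plumbing is not reproduced here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical coordinator-side registry; names and structure are assumptions for illustration.
class CapabilityRegistrySketch
{
  private final Map<String, Boolean> partialLoadSupport = new HashMap<>();

  // Record what a server advertised about itself.
  void recordAdvertisement(String serverName, boolean supportsPartialLoad)
  {
    partialLoadSupport.put(serverName, supportsPartialLoad);
  }

  // Servers that never advertised the flag are conservatively treated as not supporting partial loads.
  boolean supportsPartialLoad(String serverName)
  {
    return partialLoadSupport.getOrDefault(serverName, false);
  }
}
```

Defaulting to false keeps wrapped load specs away from servers that may not be able to deserialize them.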
     * {@code StrategicSegmentAssigner} overrides this to do fingerprint-aware replica counting.
     */
    default void replicateSegmentPartially(
        DataSegment segment,
    DataSegment segment = new DataSegment(
        "ds",
        Intervals.of("2024-01-01/2024-02-01"),
        "v1",
        Map.of("type", "local"),
        List.of("d"),
        List.of("m"),
        NoneShardSpec.instance(),
        IndexIO.CURRENT_VERSION_ID,
        1
    );
FrankChen021 left a comment
| Severity | Findings |
|---|---|
| P0 | 0 |
| P1 | 1 |
| P2 | 0 |
| P3 | 0 |
| Total | 1 |
This is an automated review by Codex GPT-5
    if (result != null) {
      // Matcher resolved: route through the partial-load handler. The wrappedLoadSpec map carries scheme-specific
      // data that the historical-side wrapper deserializes.
      handler.replicateSegmentPartially(
[P1] Matched partial rules now call an unimplemented production handler
This changes matched partial-load rules from the prior full-load behavior to `handler.replicateSegmentPartially(...)`, but the production `StrategicSegmentAssigner` does not override that method; the only implementation is the new `SegmentActionHandler` default, which throws `UnsupportedOperationException`. Since `RunRules` passes a `StrategicSegmentAssigner` directly and does not catch this per rule, any configured partial-load rule whose matcher resolves for a segment can abort the coordinator rules duty instead of loading or falling back. Either keep the previous full-load path until assignment is implemented, or add a production override that degrades safely.
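The second option, an override that degrades safely, might look roughly like the sketch below. It is a self-contained illustration only: `HandlerSketch` and `SafeAssignerSketch` are hypothetical stand-ins for `SegmentActionHandler` and `StrategicSegmentAssigner`, segments are reduced to string ids, and the real method signatures are assumed rather than copied from the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for SegmentActionHandler; the signatures are simplified assumptions.
interface HandlerSketch
{
  void replicateSegment(String segmentId);

  // Mirrors the default described in the PR: throws unless an implementation overrides it.
  default void replicateSegmentPartially(String segmentId, Map<String, Object> wrappedLoadSpec)
  {
    throw new UnsupportedOperationException("partial replication is not implemented");
  }
}

// Hypothetical production handler that falls back to a full load instead of throwing.
class SafeAssignerSketch implements HandlerSketch
{
  final List<String> fullLoads = new ArrayList<>();

  @Override
  public void replicateSegment(String segmentId)
  {
    fullLoads.add(segmentId);
  }

  @Override
  public void replicateSegmentPartially(String segmentId, Map<String, Object> wrappedLoadSpec)
  {
    // Until fingerprint-aware assignment exists, ignore the wrapper and load the whole segment.
    replicateSegment(segmentId);
  }
}
```

With an override of this shape, a matched partial-load rule would keep the earlier full-load behavior rather than letting the exception propagate out of the rules duty.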
Description
As a follow-up to #19374, this PR lays the wire-format groundwork for partial load rules. Coordinators can now send wrapped load specs to historicals carrying scheme-specific partial-load metadata, historicals can advertise whether they understand the wrapper family, and a generic `PartialLoadProfile` data model threads the metadata through the load queue. The actual coordinator assignment logic (fingerprint-aware replica counting, load-queue plumbing, inventory tracking, and the historical-side partial-load path) is intentionally deferred to follow-up PRs.

changes:
* add `PartialProjectionLoadSpec`, a `LoadSpec` wrapper for partial-projection loads with lazy delegate materialization
* add `PartialLoadSpecModule` and register it in `CoreInjectorBuilder` so historicals always have the wrapper
* add `supportsPartialLoad` capability flag to `SegmentLoadingCapabilities`; historicals advertise true, coordinator defaults to false for unknown servers
* add optional `fingerprint` and `loadedBytes` wire fields to `SegmentChangeRequestLoad` for historical partial-load announcements (NON_NULL include)
* add `PartialLoadProfile` generic data model (wrappedLoadSpec, fingerprint, loadedBytes) with `forRequest` / `forLoaded` / `forFullFallback` factories
* add `SegmentActionHandler.replicateSegmentPartially` default throwing `UnsupportedOperationException`
* wire `PartialLoadRule.run()` to route through `replicateSegmentPartially` when the matcher resolves; fall back to `replicateSegment` for FULL_LOAD-on-cannot-match
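To make the `PartialLoadProfile` item concrete, here is one way such a model could be shaped. Only the three field names and the three factory names come from the description above; the class name `PartialLoadProfileSketch`, the factory parameters, which fields each factory populates, and the `isPartial` helper are all assumptions made for illustration.

```java
import java.util.Map;

// Hypothetical sketch; field and factory names follow the PR description, everything else is assumed.
final class PartialLoadProfileSketch
{
  final Map<String, Object> wrappedLoadSpec; // assumed: set only on coordinator-to-historical requests
  final String fingerprint;                  // assumed: identifies which subset of the segment is loaded
  final Long loadedBytes;                    // assumed: known only once a historical has loaded the data

  private PartialLoadProfileSketch(Map<String, Object> wrappedLoadSpec, String fingerprint, Long loadedBytes)
  {
    this.wrappedLoadSpec = wrappedLoadSpec;
    this.fingerprint = fingerprint;
    this.loadedBytes = loadedBytes;
  }

  // Assumed shape of a request: wrapper plus fingerprint, no byte count yet.
  static PartialLoadProfileSketch forRequest(Map<String, Object> wrappedLoadSpec, String fingerprint)
  {
    return new PartialLoadProfileSketch(wrappedLoadSpec, fingerprint, null);
  }

  // Assumed shape of an announcement: fingerprint plus the bytes actually loaded.
  static PartialLoadProfileSketch forLoaded(String fingerprint, long loadedBytes)
  {
    return new PartialLoadProfileSketch(null, fingerprint, loadedBytes);
  }

  // Assumed shape of the full-load fallback: no partial metadata at all.
  static PartialLoadProfileSketch forFullFallback()
  {
    return new PartialLoadProfileSketch(null, null, null);
  }

  // Hypothetical helper: a profile is treated as partial when it carries a fingerprint.
  boolean isPartial()
  {
    return fingerprint != null;
  }
}
```

Leaving unset fields null in this sketch lines up with the NON_NULL include noted for the wire fields, where absent values are simply omitted from the serialized form.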