feat: Wire format and data model for partial-projection load rules #19409

Open
clintropolis wants to merge 2 commits into apache:master from clintropolis:partial-load-rules-coordinator-stuff

@clintropolis
Member

Description

As a follow-up to #19374, this PR lays the wire-format groundwork for partial load rules. Coordinators can now send historicals wrapped load specs that carry scheme-specific partial-load metadata, historicals can advertise whether they understand the wrapper family, and a generic `PartialLoadProfile` data model threads the metadata through the load queue. The actual coordinator assignment logic (fingerprint-aware replica counting, load-queue plumbing, inventory tracking, and the historical-side partial-load path) is intentionally deferred to follow-up PRs.

changes:

* add `PartialProjectionLoadSpec`, a `LoadSpec` wrapper for partial-projection loads with lazy delegate materialization
* add `PartialLoadSpecModule` and register it in `CoreInjectorBuilder` so historicals always have the wrapper
* add `supportsPartialLoad` capability flag to `SegmentLoadingCapabilities`; historicals advertise true, coordinator defaults to false for unknown servers
* add optional `fingerprint` and `loadedBytes` wire fields to `SegmentChangeRequestLoad` for historical partial-load announcements (NON_NULL include)
* add `PartialLoadProfile` generic data model (wrappedLoadSpec, fingerprint, loadedBytes) with `forRequest` / `forLoaded` / `forFullFallback` factories
* add `SegmentActionHandler.replicateSegmentPartially` default throwing `UnsupportedOperationException`
* wire `PartialLoadRule.run()` to route through `replicateSegmentPartially` when the matcher resolves; fall back to `replicateSegment` for FULL_LOAD-on-cannot-match
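
To make the `PartialLoadProfile` item above more concrete, here is a minimal sketch of what such a data model could look like. It is not the class from this PR: the three field names and the three factory names come from the description, while the class name `PartialLoadProfileSketch`, the factory parameters, the `isFullLoad()` helper, and the use of `null` for absent fields are all assumptions made for illustration.

```java
import java.util.Map;

// Hypothetical sketch only: field and factory names follow the PR description,
// but every signature and null convention here is an assumption.
final class PartialLoadProfileSketch
{
  private final Map<String, Object> wrappedLoadSpec; // scheme-specific wrapper payload, or null
  private final String fingerprint;                  // identifies the partial load, or null
  private final Long loadedBytes;                    // bytes actually loaded, or null if unknown

  private PartialLoadProfileSketch(Map<String, Object> wrappedLoadSpec, String fingerprint, Long loadedBytes)
  {
    // Defensive immutable copy so later mutation of the caller's map is not visible here.
    this.wrappedLoadSpec = wrappedLoadSpec == null ? null : Map.copyOf(wrappedLoadSpec);
    this.fingerprint = fingerprint;
    this.loadedBytes = loadedBytes;
  }

  // Assumed use: what a coordinator attaches to an outgoing load request.
  static PartialLoadProfileSketch forRequest(Map<String, Object> wrappedLoadSpec, String fingerprint)
  {
    return new PartialLoadProfileSketch(wrappedLoadSpec, fingerprint, null);
  }

  // Assumed use: what a historical reports after completing a partial load.
  static PartialLoadProfileSketch forLoaded(String fingerprint, long loadedBytes)
  {
    return new PartialLoadProfileSketch(null, fingerprint, loadedBytes);
  }

  // Assumed use: marker for "no partial load applies, load the whole segment".
  static PartialLoadProfileSketch forFullFallback()
  {
    return new PartialLoadProfileSketch(null, null, null);
  }

  // Hypothetical helper: true only for the full-fallback shape.
  boolean isFullLoad()
  {
    return wrappedLoadSpec == null && fingerprint == null;
  }

  Map<String, Object> getWrappedLoadSpec()
  {
    return wrappedLoadSpec;
  }

  String getFingerprint()
  {
    return fingerprint;
  }

  Long getLoadedBytes()
  {
    return loadedBytes;
  }
}
```

Keeping `fingerprint` and `loadedBytes` nullable in the sketch mirrors the description's note that the corresponding wire fields are optional and omitted when absent.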
   * {@code StrategicSegmentAssigner} overrides this to do fingerprint-aware replica counting.
   */
  default void replicateSegmentPartially(
      DataSegment segment,
Comment on lines +86 to +96
DataSegment segment = new DataSegment(
    "ds",
    Intervals.of("2024-01-01/2024-02-01"),
    "v1",
    Map.of("type", "local"),
    List.of("d"),
    List.of("m"),
    NoneShardSpec.instance(),
    IndexIO.CURRENT_VERSION_ID,
    1
);
Comment on lines +111 to +121
DataSegment segment = new DataSegment(
    "ds",
    Intervals.of("2024-01-01/2024-02-01"),
    "v1",
    Map.of("type", "local"),
    List.of("d"),
    List.of("m"),
    NoneShardSpec.instance(),
    IndexIO.CURRENT_VERSION_ID,
    1
);
Comment on lines +139 to +149
DataSegment segment = new DataSegment(
    "ds",
    Intervals.of("2024-01-01/2024-02-01"),
    "v1",
    Map.of("type", "local"),
    List.of("d"),
    List.of("m"),
    NoneShardSpec.instance(),
    IndexIO.CURRENT_VERSION_ID,
    1
);

FrankChen021 left a comment

| Severity | Findings |
| --- | --- |
| P0 | 0 |
| P1 | 1 |
| P2 | 0 |
| P3 | 0 |
| Total | 1 |

This is an automated review by Codex GPT-5

if (result != null) {
  // Matcher resolved: route through the partial-load handler. The wrappedLoadSpec map carries scheme-specific
  // data that the historical-side wrapper deserializes.
  handler.replicateSegmentPartially(

[P1] Matched partial rules now call an unimplemented production handler

This changes matched partial-load rules from the prior full-load behavior to `handler.replicateSegmentPartially(...)`, but the production `StrategicSegmentAssigner` does not override that method; the only implementation is the new `SegmentActionHandler` default, which throws `UnsupportedOperationException`. Since `RunRules` passes a `StrategicSegmentAssigner` directly and does not catch this per rule, any configured partial-load rule whose matcher resolves for a segment can abort the coordinator rules duty instead of loading or falling back. Either keep the previous full-load path until assignment is implemented, or add a production override that degrades safely.
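
One possible reading of "degrades safely" is a default that falls back to the full-load path instead of throwing. The sketch below illustrates that idea under simplifying assumptions: it uses stand-in types (a `String` segment id and a tier-to-replica-count map) rather than Druid's real `DataSegment` and handler signatures, which are not shown in this thread, and every name ending in `Sketch` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for SegmentActionHandler; the real interface takes Druid types.
interface SegmentActionHandlerSketch
{
  void replicateSegment(String segmentId, Map<String, Integer> tierToReplicaCount);

  // Safe default: until an assigner implements partial loading, ignore the
  // wrapped load spec and request a normal full load rather than throwing.
  default void replicateSegmentPartially(
      String segmentId,
      Map<String, Integer> tierToReplicaCount,
      Map<String, Object> wrappedLoadSpec
  )
  {
    replicateSegment(segmentId, tierToReplicaCount);
  }
}

// Minimal handler that implements only the full-load path and records each call.
final class RecordingHandlerSketch implements SegmentActionHandlerSketch
{
  final List<String> fullLoads = new ArrayList<>();

  @Override
  public void replicateSegment(String segmentId, Map<String, Integer> tierToReplicaCount)
  {
    fullLoads.add(segmentId);
  }
}
```

With this shape, a handler that never overrides `replicateSegmentPartially` still loads the segment in full, so a matched rule cannot abort the whole rules run. The trade-off is that a silent fallback can hide the fact that partial loading is not yet active, which is an argument for the other option of keeping the previous full-load path in the rule itself.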
