Skip to content

[VQueues] Move scheduler status types to worker-api#4638

Merged
AhmedSoliman merged 2 commits intomainfrom
pr4638
Apr 24, 2026
Merged

[VQueues] Move scheduler status types to worker-api#4638
AhmedSoliman merged 2 commits intomainfrom
pr4638

Conversation

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 31e4a357a4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread crates/storage-query-datafusion/src/vqueue_meta/table.rs Outdated
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 23, 2026

Test Results

  8 files  ± 0    8 suites  ±0   5m 36s ⏱️ + 3m 38s
 53 tests + 3   53 ✅ + 3  0 💤 ±0  0 ❌ ±0 
233 runs  +15  233 ✅ +15  0 💤 ±0  0 ❌ ±0 

Results for commit e312a76. ± Comparison against base commit 553ca52.

This pull request removes 1 and adds 4 tests. Note that renamed tests count towards both.
dev.restate.sdktesting.tests.Custom ‑ run(CustomTestConfig, URI, URI)[1]
dev.restate.sdktesting.tests.KafkaIngress ‑ handleEventInCounterService(URI, int, Client)
dev.restate.sdktesting.tests.KafkaIngress ‑ handleEventInEventHandler(URI, int, Client)
dev.restate.sdktesting.tests.UpgradeWithInFlightInvocation ‑ inFlightInvocation(Client, URI)
dev.restate.sdktesting.tests.UpgradeWithNewInvocation ‑ executesNewInvocationWithLatestServiceRevisions(Client, URI)

♻️ This comment has been updated with latest results.

Copy link
Copy Markdown
Contributor

@muhamadazmy muhamadazmy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢

Copy link
Copy Markdown
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. +1 for merging :-)

Comment on lines -93 to -94
/// The vqueue is waiting to acquire a lock of a VO.
BlockedOnLock,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this SchedulingStatus redundant given that we have ResourceKind::Lock?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in subsequent PR

Comment on lines -95 to -98
/// The vqueue is waiting for concurrency tokens. Concurrency tokens are released
/// when currently running items are completed or (in some cases) when running items
/// are parked.
WaitingConcurrencyTokens,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here.

…time accounting

## sys_vqueue_meta

- New partitioned datafusion table surfacing per-vqueue identity, lifecycle flags/timestamps, EMA durations, and per-stage counts from `VQueueMeta`.
- New `ScanVQueueMetaTable` trait in `storage-api`, implemented on `PartitionStore` at `Priority::Low`; decode uses `VQueueMetaRef` via bilrost `decode_borrowed` for zero-copy string columns.

## Wait-time accounting

- `Stats` rewritten around a single `Option<(WaitBucket, Instant)>` + `EnumMap` accumulator. *At most one open wait segment* is now a structural invariant, fixing over-attribution on resource transitions.
- One `set_wait()` entry point; `finalize()` / `snapshot()` share one builder; `reset()` flushes open segments so preempted items still contribute to node-wide counters.
- `ResourceKind -> WaitBucket` mapping is exhaustive — new variants are a compile error until a bucket is assigned.
- Node-level invoker throttling is one bucket whether observed as `BlockedOn` or as a token-bucket delay.

## EMAs

- Two new `VQueueStatistics` fields sampled on **every** `Inbox -> Running` transition (retries included): `avg_blocked_on_concurrency_rules_ms`, `avg_blocked_on_invoker_throttling_ms`.
- Carried through `MoveMetrics`; surfaced as two `Duration` columns on `sys_vqueue_meta`.
- Aspirational per-entry wait-time todo in `EntryStatistics` removed — vqueue-level EMA preferred over per-entry signal that decays in relevance.

## Naming

- `WaitStats` fields uniformly use `blocked_on_*_ms`; metric constants/strings follow `VQUEUE_<X>_WAIT_MS` / `restate.vqueue.scheduler.<x>_wait_ms.total`; `WaitBucket` and `ResourceKind` variants use matching names.
- Pre-existing metric-string / describe-text inconsistencies fixed.
- Bilrost tag numbers preserved on renamed fields — wire-compatible.
This is to allow datafusion storage engine to access these types without direct dependency on restate-vqueues.
@AhmedSoliman AhmedSoliman merged commit 576361b into main Apr 24, 2026
39 of 40 checks passed
@AhmedSoliman AhmedSoliman deleted the pr4638 branch April 24, 2026 12:08
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 24, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants