[VQueues] Provisions to add user limits to datafusion#4615
[VQueues] Provisions to add user limits to datafusion#4615AhmedSoliman wants to merge 5 commits intomainfrom
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1d7a6a97ce
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| sys_vqueues_sort_order(), | ||
| remote_scanner_manager.create_distributed_scanner(NAME, local_scanner), |
There was a problem hiding this comment.
Remove misleading sort-order metadata from sys_vqueues
The provider declares sys_vqueues_sort_order() as a guaranteed physical ordering, but VQueuesScanner::for_each_row scans per-stage batches and runs multiple stage scans concurrently (stages.chunks(...) + try_join_all(...)), so rows from different stages can interleave and are not globally ordered by partition_key,id,stage,.... Because DataFusion uses declared ordering to optimize/suppress sorts, this can produce incorrectly ordered results for plans that rely on that metadata (for example ORDER BY and merge-based plans).
Useful? React with 👍 / 👎.
|
|
||
| use datafusion::arrow::datatypes::DataType; | ||
|
|
||
| define_sort_order!(sys_vqueue_stats(partition_key, scope, lock_name)); |
There was a problem hiding this comment.
Align sys_vqueue_stats sort order with actual scan order
sys_vqueue_stats advertises (partition_key, scope, lock_name) as its physical sort order, but rows are produced by for_each_vqueue_meta over metadata keys keyed by VQueueId, not by scope/lock_name. This means the optimizer can assume an ordering that the stream does not satisfy, which risks incorrect sort-elision decisions; the declared order should match the real key traversal (or be omitted).
Useful? React with 👍 / 👎.
2b4e42b to
c4fb7a0
Compare
9ae5df0 to
81b807c
Compare
Expose vqueue entries as a single DataFusion table with stage-aware scanning. When the query filters `stage`, only matching stage key kinds are scanned; without a stage filter, all inbox stages are scanned and merged. Also project the latest entry metadata for observability (status plus EntryStatistics timestamps and counters), and add targeted tests for stage predicate extraction and sys_vqueues stage filtering behavior.
Stack created with Sapling. Best reviewed with ReviewStack.