| 1 | +# Smart Category Merging for DLQ Resubmission |
| 2 | + |
| 3 | +## The Problem |
| 4 | + |
| 5 | +When you work with an Azure Service Bus dead-letter queue (DLQ) interactively, its messages are grouped into categories by their `Subject` (label) and `DeadLetterReason`. This works well when error messages are static: |
| 6 | + |
| 7 | +| # | Label | Dead Letter Reason | Count | |
| 8 | +|---|----------------------|---------------------------|-------| |
| 9 | +| 1 | OrderProcessor | MaxDeliveryCountExceeded | 47 | |
| 10 | +| 2 | PaymentHandler | TimeoutExceeded | 23 | |
| 11 | + |
| 12 | +But many real-world error messages contain parameterized values — GUIDs, IDs, timestamps, names, sequence numbers. Each unique value creates its own category: |
| 13 | + |
| 14 | +| # | Label | Dead Letter Reason | Count | |
| 15 | +|---|------------------------------------------------------------------------|---------------------------|-------| |
| 16 | +| 1 | Could not create user with ID 3cefe1dd-91a0-490d-adfe-dc569472f6e9 | MaxDeliveryCountExceeded | 1 | |
| 17 | +| 2 | Could not create user with ID aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee | MaxDeliveryCountExceeded | 1 | |
| 18 | +| 3 | Could not create user with ID 11111111-2222-3333-4444-555555555555 | MaxDeliveryCountExceeded | 1 | |
| 19 | +| ... | ... (hundreds more) | | | |
| 20 | + |
| 21 | +This makes interactive category selection useless — you'd have to select hundreds of individual entries that are logically the same error. |
| 22 | + |
| 23 | +## The Solution: `--merge-similar` |
| 24 | + |
| 25 | +The `--merge-similar` flag uses dynamic clustering based on the longest common subsequence (LCS) of tokens to detect parameterized values and merge similar categories: |
| 26 | + |
| 27 | +``` |
| 28 | +dotnet run -- resubmit-dlq -n mynamespace.servicebus.windows.net -q myqueue -i --merge-similar |
| 29 | +``` |
| 30 | + |
| 31 | +| # | Label | Dead Letter Reason | Count | |
| 32 | +|---|------------------------------------------|---------------------------|-------| |
| 33 | +| 1 | Could not create user with ID * | MaxDeliveryCountExceeded | 247 | |
| 34 | +| 2 | PaymentHandler | TimeoutExceeded | 23 |
| 35 | + |
| 36 | +Now selecting category 1 resubmits all 247 messages, regardless of which specific GUID appeared in each. |
| 37 | + |
| 38 | +## Algorithm: LCS-Based Greedy Clustering |
| 39 | + |
| 40 | +### Core Idea |
| 41 | + |
| 42 | +Instead of hardcoded regex patterns, the algorithm **compares actual category data** to dynamically detect parameterized values. A **template** is defined by its **frame** — an ordered list of literal tokens common to all members. A `*` wildcard sits implicitly between, before, and after each frame token, matching any number of tokens (including zero). |
| 43 | + |
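| 44 | +Example: the frame `[Error, for, user, in, region]` is displayed as the template `"Error * for user * in region *"` (a `*` is shown only at gaps where some member has extra tokens, see Template Rendering) and matches: |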
| 45 | +- `"Error 123 for user Bob in region us-east"` |
| 46 | +- `"Error 456 for user 'Alice Smith' in region eu-west-1"` |
| 47 | + |
| 48 | +This handles **variable-length parameters** (e.g., multi-word names) that fixed-pattern approaches cannot. |
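
To make the matching rule concrete, here is a minimal Python sketch of "frame with implicit wildcards" matching. It is an illustration only, not the code in `CategoryMerger.cs`; the function name `matches_frame` is hypothetical. A message matches when the frame is a subsequence of its whitespace tokens, so any number of tokens may sit between frame tokens.

```python
def matches_frame(frame, message):
    """Return True if every frame token appears in the message, in order.

    Tokens between (or around) the frame tokens are absorbed by the
    implicit wildcards, so they may be zero or many.
    """
    remaining = iter(message.split())
    # `tok in remaining` advances the iterator until it finds `tok`,
    # so each frame token must occur after the previous one.
    return all(tok in remaining for tok in frame)


frame = ["Error", "for", "user", "in", "region"]
print(matches_frame(frame, "Error 123 for user Bob in region us-east"))
print(matches_frame(frame, "Error 456 for user 'Alice Smith' in region eu-west-1"))
print(matches_frame(frame, "Error 1 in region us-east for user Bob"))
```

The third call fails because the frame tokens appear out of order, which a subsequence check rejects even though all of them are present.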
| 49 | + |
| 50 | +### Algorithm Steps |
| 51 | + |
| 52 | +1. **Tokenize** each category's Label and DeadLetterReason by whitespace |
| 53 | +2. **Sort** categories by count descending (high-frequency categories form better initial templates), then by token count descending |
| 54 | +3. **Greedy clustering**: For each category C: |
| 55 | + - For each existing template T, compute `LCS(T.labelFrame, C.labelTokens)` and `LCS(T.reasonFrame, C.reasonTokens)` |
| 56 | + - Score each field: `score = lcsLen / max(frameLen, tokensLen)` |
| 57 | + - Match if **both** `labelScore >= 0.5` **and** `reasonScore >= 0.5` |
| 58 | + - Pick the best-scoring match above threshold |
| 59 | + - If matched: shrink T's frames to the new LCS, add C to T's group |
| 60 | + - If not matched: create a new singleton template from C |
| 61 | +4. **Post-processing**: |
| 62 | + - Templates with 1 member → no merging, emit as-is |
| 63 | + - Safety rule: frame must have ≥ 1 token (combined label + reason) to merge |
| 64 | +5. **Render** display templates by analyzing gap positions across all members |
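
Steps 1 to 3 above can be sketched in Python as follows. This is a hedged illustration under stated assumptions, not the C# implementation: categories are modeled as `(label, reason, count)` tuples, the secondary sort uses the combined label and reason token count, and "best-scoring" is taken to mean the highest sum of the two field scores. All names (`Template`, `cluster`, `lcs`, `score`) are hypothetical. Post-processing and rendering are omitted.

```python
from dataclasses import dataclass, field

THRESHOLD = 0.5  # per-field threshold from the steps above


def lcs(a, b):
    """Return one longest common subsequence of token lists a and b."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    out, i, j = [], 0, 0
    while i < m and j < n:  # walk the table to recover the tokens
        if a[i] == b[j]:
            out.append(a[i])
            i, j = i + 1, j + 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


def score(common, frame, tokens):
    longest = max(len(frame), len(tokens))
    return 1.0 if longest == 0 else len(common) / longest


@dataclass
class Template:
    label_frame: list
    reason_frame: list
    members: list = field(default_factory=list)


def cluster(categories):
    """Greedily cluster (label, reason, count) tuples into templates."""
    ordered = sorted(
        categories,
        key=lambda c: (-c[2], -(len(c[0].split()) + len(c[1].split()))),
    )
    templates = []
    for label, reason, count in ordered:
        lt, rt = label.split(), reason.split()
        best = None
        for t in templates:
            lc, rc = lcs(t.label_frame, lt), lcs(t.reason_frame, rt)
            ls, rs = score(lc, t.label_frame, lt), score(rc, t.reason_frame, rt)
            # Both fields must clear the threshold; ties keep the first template.
            if ls >= THRESHOLD and rs >= THRESHOLD:
                if best is None or ls + rs > best[0]:
                    best = (ls + rs, t, lc, rc)
        if best is None:
            templates.append(Template(lt, rt, [(label, reason, count)]))
        else:
            _, t, lc, rc = best
            t.label_frame, t.reason_frame = lc, rc  # shrink frames to the LCS
            t.members.append((label, reason, count))
    return templates
```

Because frames only ever shrink to an LCS with the incoming member, each template's frame stays a subsequence of every member it has absorbed.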
| 65 | + |
| 66 | +### Scoring |
| 67 | + |
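| 68 | +Using `max(frameLen, tokensLen)` as the denominator means the LCS must cover at least the threshold fraction of both the frame and the candidate string, so a short frame cannot absorb a much longer message (or vice versa). With the threshold of **0.5**, the shared tokens must make up at least half of the longer of the two sequences. |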
| 69 | + |
| 70 | +For empty token sequences (both sides empty), the score is 1.0 (perfect match). If only one side is empty, the LCS is empty and the formula gives a score of 0. |
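
The effect of the `max` denominator is easiest to see on a lopsided pair. The Python sketch below is illustrative only (the names `lcs_len` and `field_score` are hypothetical); it computes the per-field score for one balanced pair and one pair where a one-token frame meets an eight-token message.

```python
def lcs_len(a, b):
    """Length of the longest common subsequence, using one rolling DP row."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def field_score(frame, tokens):
    longest = max(len(frame), len(tokens))
    if longest == 0:
        return 1.0  # both sides empty: treated as a perfect match
    return lcs_len(frame, tokens) / longest


frame = "User 'Bob' is not valid".split()             # 5 tokens
candidate = "User 'John Smith' is not valid".split()  # 6 tokens
print(field_score(frame, candidate))  # 4 shared tokens / max(5, 6): accepted

short_frame = ["Error"]
long_candidate = "Error while saving order 42 for customer 7".split()
print(field_score(short_frame, long_candidate))  # 1 / 8: rejected
```

Had the denominator been the shorter length instead, the second pair would score 1.0 and the single token `Error` would pull unrelated messages into one template.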
| 71 | + |
| 72 | +### Frame Shrinking |
| 73 | + |
| 74 | +When we shrink a frame (take the LCS of the old frame with the new member), the new frame is a subsequence of both the old frame and the new member. The old frame was already a subsequence of every previous member, so by transitivity the new frame is a subsequence of all members, old and new. |
| 75 | + |
| 76 | +### Template Rendering |
| 77 | + |
| 78 | +Given a frame and all member token sequences: |
| 79 | + |
| 80 | +1. For each member, align the frame against the member's tokens (greedy subsequence alignment) |
| 81 | +2. For each gap between consecutive frame tokens (and before/after), check if **any** member has extra tokens |
| 82 | +3. Insert `*` at gaps where content exists |
| 83 | + |
| 84 | +Example: frame `[User, is, not, valid]`, members `["User 'John Smith' is not valid", "User 'Bob' is not valid"]`: |
| 85 | +- Gap "User" → "is": both members have extra tokens → insert `*` |
| 86 | +- All other gaps: empty → no `*` |
| 87 | +- **Template: `"User * is not valid"`** |
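
The three rendering steps can be sketched in Python like this. It is a minimal illustration that assumes the frame is already a subsequence of every member (which frame shrinking guarantees); `gap_flags` and `render` are hypothetical names, not functions from the codebase.

```python
def gap_flags(frame, tokens):
    """Greedily align frame against tokens.

    Returns len(frame) + 1 booleans: flag k is True when the member has
    extra tokens in the gap before frame token k; the last flag covers
    the gap after the final frame token.
    """
    flags = [False] * (len(frame) + 1)
    pos = 0
    for k, tok in enumerate(frame):
        start = pos
        while tokens[pos] != tok:  # assumes frame is a subsequence of tokens
            pos += 1
        flags[k] = pos > start
        pos += 1
    flags[len(frame)] = pos < len(tokens)
    return flags


def render(frame, members):
    """Build the display template, inserting * where any member has extra tokens."""
    any_gap = [False] * (len(frame) + 1)
    for member in members:
        for k, has_extra in enumerate(gap_flags(frame, member.split())):
            any_gap[k] = any_gap[k] or has_extra
    parts = []
    for k, tok in enumerate(frame):
        if any_gap[k]:
            parts.append("*")
        parts.append(tok)
    if any_gap[len(frame)]:
        parts.append("*")
    return " ".join(parts)


print(render(
    ["User", "is", "not", "valid"],
    ["User 'John Smith' is not valid", "User 'Bob' is not valid"],
))  # prints: User * is not valid
```

The greedy alignment always takes the leftmost occurrence of each frame token, so when a token repeats inside a member, the gap may be attributed to a different position than another alignment would choose.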
| 88 | + |
| 89 | +### Examples |
| 90 | + |
| 91 | +| Input categories | Result | |
| 92 | +|---|---| |
| 93 | +| `"User 'John Smith' is not valid"`, `"User 'Bob' is not valid"` | `"User * is not valid"` | |
| 94 | +| `"Error 1 for user 'John Smith' in region us-east"`, `"Error 2 for user 'Bob' in region eu-west"` | `"Error * for user * in region *"` | |
| 95 | +| `"Could not create user with ID <guid1>"`, `"...with ID <guid2>"` | `"Could not create user with ID *"` | |
| 96 | +| `"OrderProcessor"`, `"PaymentHandler"` | Kept separate (no common tokens) | |
| 97 | + |
| 98 | +### Performance |
| 99 | + |
| 100 | +- LCS computation: O(m × n) per comparison via DP |
| 101 | +- Total: O(K × N × L²) where K = templates (~20), N = categories (~1000), L = avg tokens (~10) |
| 102 | +- At those sizes this is roughly 2 million DP cell updates, which should take on the order of milliseconds for typical DLQ data |
| 103 | + |
| 104 | +## Merge + Expand Pattern |
| 105 | + |
| 106 | +The key architectural insight is that merging is **only a display/grouping concern**. The downstream message filtering (`SnapshotForCategories`) still uses exact-match `DlqCategoryKey` values. |
| 107 | + |
| 108 | +The flow: |
| 109 | + |
| 110 | +1. **Merge**: Cluster categories by LCS similarity, sum counts, and build a `MergeMap` (merged key → set of original keys) |
| 111 | +2. **Display**: Show merged categories to the user in the interactive table |
| 112 | +3. **Select**: User picks merged categories by index number |
| 113 | +4. **Expand**: Convert selected merged keys back to all original keys via `ExpandKeys` |
| 114 | +5. **Filter**: Pass expanded original keys to the existing `SnapshotForCategories` method |
| 115 | + |
| 116 | +``` |
| 117 | +Original categories Merged categories User selects |
| 118 | +┌─────────────────────┐ ┌────────────────────┐ merged #1 |
| 119 | +│ Error with ID abc │────▸│ Error with ID * │─────────┐ |
| 120 | +│ Error with ID def │────▸│ (count: 3) │ │ |
| 121 | +│ Error with ID ghi │────▸│ │ │ |
| 122 | +└─────────────────────┘ └────────────────────┘ │ |
| 123 | + ▼ |
| 124 | + ExpandKeys() Filter with |
| 125 | + ┌────────────────────┐ original keys |
| 126 | + │ {abc, def, ghi} │───▸ exact match |
| 127 | + └────────────────────┘ |
| 128 | +``` |
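
The expand step in the diagram can be sketched in Python as below. This is an assumption-laden illustration: keys are modeled as `(label, reason)` tuples standing in for `DlqCategoryKey`, the merge map is a plain dict, and a key that was never merged is assumed to expand to itself. The real `ExpandKeys` and `SnapshotForCategories` may behave differently.

```python
REASON = "MaxDeliveryCountExceeded"

# merged key -> set of original keys (hypothetical stand-in for MergeMap)
merge_map = {
    ("Error with ID *", REASON): {
        ("Error with ID abc", REASON),
        ("Error with ID def", REASON),
        ("Error with ID ghi", REASON),
    },
}


def expand_keys(merge_map, selected):
    """Turn the user's selected (possibly merged) keys into original keys."""
    expanded = set()
    for key in selected:
        # Assumption: an unmerged key is passed through unchanged.
        expanded |= merge_map.get(key, {key})
    return expanded


selected = [("Error with ID *", REASON), ("PaymentHandler", "TimeoutExceeded")]
original_keys = expand_keys(merge_map, selected)

# Downstream filtering stays exact-match on original keys.
messages = [
    ("Error with ID abc", REASON),
    ("Error with ID zzz", REASON),
    ("PaymentHandler", "TimeoutExceeded"),
]
to_resubmit = [m for m in messages if m in original_keys]
print(to_resubmit)
```

Keeping the filter exact-match means a message that arrives after the snapshot with a new ID (`zzz` here) is not swept up just because it would fit the merged template.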
| 129 | + |
| 130 | +## Files |
| 131 | + |
| 132 | +- `CategoryMerger.cs` — Static utility with LCS-based `Merge` algorithm |
| 133 | +- `CategoryMergeResult.cs` — Result record with `ExpandKeys` method |
| 134 | +- `StreamDlqCategories.cs` — Threading the `MergeSimilar` flag through command → snapshot |
| 135 | +- `ResubmitDlqCommandHandler.cs` — Wiring up key expansion in the interactive flow |
| 136 | +- `ResubmitDlqCliCommand.cs` — The `--merge-similar` CLI option |