
Commit bf8ea07

Dead-letter smart categorization (#8)
* Implemented smart category merging
* Implemented integration tests
* CLI integration tests
* Moved the doc to articles
1 parent a52ee59 commit bf8ea07

21 files changed

Lines changed: 1616 additions & 216 deletions


ServiceBusToolset.slnx

Lines changed: 2 additions & 0 deletions
@@ -16,6 +16,8 @@
   </Folder>
   <Folder Name="/tests/">
     <Project Path="tests/ServiceBusToolset.Application.Tests/ServiceBusToolset.Application.Tests.csproj"/>
+    <Project Path="tests/ServiceBusToolset.IntegrationTesting/ServiceBusToolset.IntegrationTesting.csproj"/>
     <Project Path="tests/ServiceBusToolset.Integration.Tests/ServiceBusToolset.Integration.Tests.csproj"/>
+    <Project Path="tests/ServiceBusToolset.CLI.Integration.Tests/ServiceBusToolset.CLI.Integration.Tests.csproj"/>
   </Folder>
 </Solution>
Lines changed: 136 additions & 0 deletions
@@ -0,0 +1,136 @@
# Smart Category Merging for DLQ Resubmission

## The Problem

When working interactively with Azure Service Bus dead-letter queues (DLQs), messages are grouped into categories by their `Subject` (label) and `DeadLetterReason`. This works well when error messages are static:

| # | Label                | Dead Letter Reason        | Count |
|---|----------------------|---------------------------|-------|
| 1 | OrderProcessor       | MaxDeliveryCountExceeded  | 47    |
| 2 | PaymentHandler       | TimeoutExceeded           | 23    |

Many real-world error messages, however, contain parameterized values such as GUIDs, IDs, timestamps, names, and sequence numbers. Each unique value creates its own category:

| #   | Label                                                                  | Dead Letter Reason        | Count |
|-----|------------------------------------------------------------------------|---------------------------|-------|
| 1   | Could not create user with ID 3cefe1dd-91a0-490d-adfe-dc569472f6e9     | MaxDeliveryCountExceeded  | 1     |
| 2   | Could not create user with ID aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee     | MaxDeliveryCountExceeded  | 1     |
| 3   | Could not create user with ID 11111111-2222-3333-4444-555555555555     | MaxDeliveryCountExceeded  | 1     |
| ... | ... (hundreds more)                                                    |                           |       |

This makes interactive category selection impractical: you would have to select hundreds of individual entries that are logically the same error.

## The Solution: `--merge-similar`

The `--merge-similar` flag uses dynamic clustering based on the longest common subsequence (LCS) to detect parameterized values and merge similar categories:

```
dotnet run -- resubmit-dlq -n mynamespace.servicebus.windows.net -q myqueue -i --merge-similar
```

With the flag set, the interactive table shows merged categories:

| # | Label                                    | Dead Letter Reason        | Count |
|---|------------------------------------------|---------------------------|-------|
| 1 | Could not create user with ID *          | MaxDeliveryCountExceeded  | 247   |
| 2 | OrderProcessor                           | TimeoutExceeded           | 23    |

Selecting category 1 now resubmits all 247 messages, regardless of which specific GUID appeared in each.

## Algorithm: LCS-Based Greedy Clustering

### Core Idea

Instead of relying on hardcoded regex patterns, the algorithm **compares actual category data** to detect parameterized values dynamically. A **template** is defined by its **frame**: an ordered list of literal tokens common to all members. A `*` wildcard sits implicitly before, between, and after the frame tokens, and matches any number of tokens (including zero).

Example: frame `[Error, for, user, in, region]` produces the template `"Error * for user * in region *"`, matching:
- `"Error 123 for user Bob in region us-east"`
- `"Error 456 for user 'Alice Smith' in region eu-west-1"`

This handles **variable-length parameters** (e.g., multi-word names), which fixed-pattern approaches cannot.
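
To make the frame idea concrete, here is a minimal sketch of checking whether a tokenized string fits a frame. It is illustrative only: `FrameMatchSketch` and `Matches` are hypothetical names, not part of the toolset, and tokenization is assumed to have happened already.

```csharp
public static class FrameMatchSketch
{
    // True when every frame token occurs in `tokens` in the same order,
    // with any number of other tokens (the implicit `*` gaps) in between.
    public static bool Matches(string[] frame, string[] tokens)
    {
        var next = 0; // index of the next frame token still to be found
        foreach (var token in tokens)
        {
            if (next < frame.Length && token == frame[next])
            {
                next++;
            }
        }

        return next == frame.Length;
    }
}
```

With the frame `[Error, for, user, in, region]`, both example strings above match, while a string that never contains `Error` does not. An empty frame matches everything, which is why the algorithm needs the safety rule described in the steps.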

### Algorithm Steps

1. **Tokenize** each category's Label and DeadLetterReason by whitespace
2. **Sort** categories by count descending (high-frequency categories form better initial templates), then by token count descending
3. **Greedy clustering**: For each category C:
   - For each existing template T, compute `LCS(T.labelFrame, C.labelTokens)` and `LCS(T.reasonFrame, C.reasonTokens)`
   - Score each field: `score = lcsLen / max(frameLen, tokensLen)`
   - Match if **both** `labelScore >= 0.5` **and** `reasonScore >= 0.5`
   - Pick the best-scoring match above threshold
   - If matched: shrink T's frames to the new LCS, add C to T's group
   - If not matched: create a new singleton template from C
4. **Post-processing**:
   - Templates with 1 member → no merging, emit as-is
   - Safety rule: frame must have ≥ 1 token (combined label + reason) to merge
5. **Render** display templates by analyzing gap positions across all members
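
Steps 1 to 4 can be sketched as follows. This is not the code in `CategoryMerger.cs`: the `Category` and `Cluster` records are hypothetical stand-ins, the secondary sort uses the combined label and reason token count, and the "best-scoring" match is ranked by the sum of the two field scores. All three choices are assumptions, since the article does not spell them out.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

public static class GreedyClusteringSketch
{
    // Hypothetical input and output shapes; the real types are not shown in this article.
    public sealed record Category(string Label, string Reason, int Count);
    public sealed record Cluster(string[] LabelFrame, string[] ReasonFrame, int Count);

    private sealed class Template
    {
        public string[] LabelFrame = Array.Empty<string>();
        public string[] ReasonFrame = Array.Empty<string>();
        public int Count;
    }

    // Splits on spaces for brevity; the article tokenizes "by whitespace".
    private static string[] Tokenize(string text) =>
        text.Split(' ', StringSplitOptions.RemoveEmptyEntries);

    // O(m * n) dynamic-programming LCS that returns the common token subsequence.
    public static string[] Lcs(string[] a, string[] b)
    {
        var dp = new int[a.Length + 1, b.Length + 1];
        for (var i = a.Length - 1; i >= 0; i--)
            for (var j = b.Length - 1; j >= 0; j--)
                dp[i, j] = a[i] == b[j]
                    ? dp[i + 1, j + 1] + 1
                    : Math.Max(dp[i + 1, j], dp[i, j + 1]);

        var result = new List<string>();
        for (int i = 0, j = 0; i < a.Length && j < b.Length;)
        {
            if (a[i] == b[j]) { result.Add(a[i]); i++; j++; }
            else if (dp[i + 1, j] >= dp[i, j + 1]) i++;
            else j++;
        }
        return result.ToArray();
    }

    private static double Score(int lcsLen, int frameLen, int tokensLen)
    {
        var longer = Math.Max(frameLen, tokensLen);
        return longer == 0 ? 1.0 : (double)lcsLen / longer;
    }

    public static List<Cluster> Merge(IEnumerable<Category> categories, double threshold = 0.5)
    {
        var templates = new List<Template>();
        var ordered = categories
            .Select(c => (Category: c, Label: Tokenize(c.Label), Reason: Tokenize(c.Reason)))
            .OrderByDescending(x => x.Category.Count)
            .ThenByDescending(x => x.Label.Length + x.Reason.Length); // assumption: combined count

        foreach (var (category, label, reason) in ordered)
        {
            Template? best = null;
            var bestScore = double.MinValue;
            var (bestLabel, bestReason) = (label, reason);

            foreach (var t in templates)
            {
                var l = Lcs(t.LabelFrame, label);
                var r = Lcs(t.ReasonFrame, reason);
                var labelScore = Score(l.Length, t.LabelFrame.Length, label.Length);
                var reasonScore = Score(r.Length, t.ReasonFrame.Length, reason.Length);
                var combined = labelScore + reasonScore; // assumption: rank candidates by the sum
                var eligible = labelScore >= threshold && reasonScore >= threshold
                               && l.Length + r.Length >= 1; // safety rule: non-empty combined frame
                if (eligible && combined > bestScore)
                {
                    (best, bestScore, bestLabel, bestReason) = (t, combined, l, r);
                }
            }

            if (best is null)
            {
                best = new Template(); // no match: start a singleton template
                templates.Add(best);
            }
            best.LabelFrame = bestLabel;   // the shrunken LCS, or the full tokens for a new template
            best.ReasonFrame = bestReason;
            best.Count += category.Count;
        }

        return templates.Select(t => new Cluster(t.LabelFrame, t.ReasonFrame, t.Count)).ToList();
    }
}
```

Fed three `Could not create user with ID <value>` categories plus `OrderProcessor` and `PaymentHandler`, this sketch yields three clusters: the two static labels stay separate, and the three parameterized ones collapse into a single cluster whose label frame is `Could not create user with ID`.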

### Scoring

Using `max(frameLen, tokensLen)` as the denominator means the LCS must cover a sufficient share of both the frame and the candidate string: a short frame that happens to occur inside a much longer string still scores low. The threshold of **0.5** means the LCS must contain at least half the tokens of the longer of the two sequences.

When both token sequences are empty, the score is defined as 1.0 (a perfect match).
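
The scoring rule is small enough to write out directly. The following is a sketch under the definitions above; `ScoreSketch`, `Score`, and `IsMatch` are hypothetical names chosen for illustration.

```csharp
using System;

public static class ScoreSketch
{
    // score = lcsLen / max(frameLen, tokensLen), with the empty-vs-empty case defined as 1.0.
    public static double Score(int lcsLength, int frameLength, int tokenCount)
    {
        var longer = Math.Max(frameLength, tokenCount);
        return longer == 0 ? 1.0 : (double)lcsLength / longer;
    }

    // A category joins a template only when both fields reach the threshold.
    public static bool IsMatch(double labelScore, double reasonScore, double threshold = 0.5) =>
        labelScore >= threshold && reasonScore >= threshold;
}
```

For the frame `[User, is, not, valid]` against the six-token string `User 'John Smith' is not valid`, the LCS has 4 tokens and the score is 4 / 6 ≈ 0.67, so the field matches. A one-token frame `[Error]` against an eight-token string scores 1 / 8 = 0.125 and is rejected, even though the frame is fully contained in the string; dividing by the shorter length instead would have given 1.0.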

### Frame Shrinking

Shrinking a frame means replacing it with the LCS of the old frame and the new member's tokens. The new frame is therefore a subsequence of the old frame, and the old frame was a subsequence of every previous member, so by transitivity the new frame is still a subsequence of all previous members. Because it is an LCS with the new member, it is also a subsequence of the new member.

### Template Rendering

Given a frame and all member token sequences:

1. For each member, align the frame against the member's tokens (greedy subsequence alignment)
2. For each gap between consecutive frame tokens (and before/after), check if **any** member has extra tokens
3. Insert `*` at gaps where content exists

Example: frame `[User, is, not, valid]`, members `["User 'John Smith' is not valid", "User 'Bob' is not valid"]`:
- Gap "User" → "is": both members have extra tokens → insert `*`
- All other gaps: empty → no `*`
- **Template: `"User * is not valid"`**
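
The rendering steps can be sketched as below. This is an illustration rather than the project's implementation: `TemplateRenderSketch` and `Render` are hypothetical names, and the sketch assumes the frame really is a subsequence of every member, which the frame-shrinking argument guarantees.

```csharp
using System.Collections.Generic;

public static class TemplateRenderSketch
{
    // Gap g sits before frame[g]; the last gap (index frame.Length) sits after the final frame token.
    public static string Render(string[] frame, IEnumerable<string[]> members)
    {
        var gapHasContent = new bool[frame.Length + 1];

        foreach (var tokens in members)
        {
            var next = 0; // greedy alignment: match each frame token at its earliest position
            foreach (var token in tokens)
            {
                if (next < frame.Length && token == frame[next])
                {
                    next++;
                }
                else
                {
                    gapHasContent[next] = true; // an extra token falls into the current gap
                }
            }
        }

        var parts = new List<string>();
        for (var g = 0; g <= frame.Length; g++)
        {
            if (gapHasContent[g]) parts.Add("*");
            if (g < frame.Length) parts.Add(frame[g]);
        }

        return string.Join(" ", parts);
    }
}
```

For the example above, only the gap between `User` and `is` receives extra tokens, so the result is `User * is not valid`. A gap is wildcarded as soon as one member has content there, which is why members with nothing in that gap still fit the rendered template.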

### Examples

| Input categories | Result |
|---|---|
| `"User 'John Smith' is not valid"`, `"User 'Bob' is not valid"` | `"User * is not valid"` |
| `"Error 1 for user 'John Smith' in region us-east"`, `"Error 2 for user 'Bob' in region eu-west"` | `"Error * for user * in region *"` |
| `"Could not create user with ID <guid1>"`, `"...with ID <guid2>"` | `"Could not create user with ID *"` |
| `"OrderProcessor"`, `"PaymentHandler"` | Kept separate (no common tokens) |
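
### Performance

- LCS computation: O(m × n) per comparison via dynamic programming, where m and n are the token counts of the two sequences
- Total: O(K × N × L²) where K = templates (~20), N = categories (~1000), L = avg tokens (~10)
- With those typical values this is roughly 20 × 1000 × 10² = 2 × 10⁶ table-cell updates per field, which should complete in milliseconds for typical DLQ data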

## Merge + Expand Pattern

The key architectural insight is that merging is **only a display/grouping concern**. The downstream message filtering (`SnapshotForCategories`) still uses exact-match `DlqCategoryKey` values.

The flow:

1. **Merge**: Cluster categories by LCS similarity, sum counts, and build a `MergeMap` (merged key → set of original keys)
2. **Display**: Show merged categories to the user in the interactive table
3. **Select**: User picks merged categories by index number
4. **Expand**: Convert selected merged keys back to all original keys via `ExpandKeys`
5. **Filter**: Pass expanded original keys to the existing `SnapshotForCategories` method

```
Original categories         Merged categories       User selects
┌─────────────────────┐     ┌────────────────────┐  merged #1
│ Error with ID abc   │────▸│ Error with ID *    │─────────┐
│ Error with ID def   │────▸│ (count: 3)         │         │
│ Error with ID ghi   │────▸│                    │         │
└─────────────────────┘     └────────────────────┘         │

                            ExpandKeys()            Filter with
                            ┌────────────────────┐  original keys
                            │ {abc, def, ghi}    │───▸ exact match
                            └────────────────────┘
```
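
The merge and expand steps can be sketched with a simplified key type. `Key`, `BuildMergeMap`, and `Expand` are hypothetical names for illustration; `Key` stands in for `DlqCategoryKey`, whose definition is not shown in this article, and `Expand` is written to behave like the `ExpandKeys` method added in this commit.

```csharp
using System.Collections.Generic;

public static class MergeExpandSketch
{
    // Hypothetical stand-in for DlqCategoryKey; records give value equality for set lookups.
    public sealed record Key(string Label, string Reason);

    // Merge step: remember which original keys each merged key represents.
    public static Dictionary<Key, HashSet<Key>> BuildMergeMap(
        IEnumerable<(Key Merged, Key Original)> assignments)
    {
        var map = new Dictionary<Key, HashSet<Key>>();
        foreach (var (merged, original) in assignments)
        {
            if (!map.TryGetValue(merged, out var originals))
            {
                originals = new HashSet<Key>();
                map[merged] = originals;
            }
            originals.Add(original);
        }
        return map;
    }

    // Expand step: merged keys fan out to their originals; keys that were
    // never merged are absent from the map and pass through unchanged.
    public static HashSet<Key> Expand(
        IReadOnlyDictionary<Key, HashSet<Key>> mergeMap, IEnumerable<Key> selected)
    {
        var expanded = new HashSet<Key>();
        foreach (var key in selected)
        {
            if (mergeMap.TryGetValue(key, out var originals))
            {
                expanded.UnionWith(originals);
            }
            else
            {
                expanded.Add(key);
            }
        }
        return expanded;
    }
}
```

Because the expanded set contains only original keys, the existing exact-match filter needs no knowledge of templates or wildcards; merging can be switched on or off without touching the filtering path.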

## Files

- `CategoryMerger.cs`: Static utility with the LCS-based `Merge` algorithm
- `CategoryMergeResult.cs`: Result record with the `ExpandKeys` method
- `StreamDlqCategories.cs`: Threads the `MergeSimilar` flag from the command through to the snapshot
- `ResubmitDlqCommandHandler.cs`: Wires up key expansion in the interactive flow
- `ResubmitDlqCliCommand.cs`: Defines the `--merge-similar` CLI option
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
namespace ServiceBusToolset.Application.DeadLetters.Common;

public sealed record CategoryMergeResult(IReadOnlyList<DlqCategory> MergedCategories,
                                         IReadOnlyDictionary<DlqCategoryKey, IReadOnlySet<DlqCategoryKey>> MergeMap)
{
    /// <summary>
    /// Expands merged category keys back to all original keys they represent,
    /// enabling exact-match filtering downstream.
    /// </summary>
    public HashSet<DlqCategoryKey> ExpandKeys(IReadOnlySet<DlqCategoryKey> mergedKeys)
    {
        var expanded = new HashSet<DlqCategoryKey>();

        foreach (var mergedKey in mergedKeys)
        {
            if (MergeMap.TryGetValue(mergedKey, out var originals))
            {
                foreach (var original in originals)
                {
                    expanded.Add(original);
                }
            }
            else
            {
                expanded.Add(mergedKey);
            }
        }

        return expanded;
    }
}
