
[SDK] Compaction does not trigger for BYOK Anthropic — model token budget not resolved #1072

@KennyTurtle

Description


Summary

Two related problems with infinite sessions compaction and Anthropic models:

  1. BYOK providers: Compaction never triggers — the CLI can't resolve max_context_window_tokens for BYOK providers, so compaction is silently skipped.
  2. Catalog models: The token budget is hardcoded to 128K for all models (test/harness/replayingCapiProxy.ts:1088), which is wrong for Anthropic Claude models with their 200K context windows.

Repro (confirmed 2026-04-13)

SDK v0.2.2, model: claude-sonnet-4-20250514 via BYOK (CAPI relay)

Thresholds: 5% background, 30% blocking
Expected trigger: ~6,400 tokens (5% of 128K) or ~10,000 (5% of 200K)

  Message 1:  input=62       Message 6:  input=6,354
  Message 2:  input=1,047    Message 7:  input=7,817
  Message 3:  input=2,198    Message 8:  input=9,361
  Message 4:  input=3,516    Message 9:  input=10,870
  Message 5:  input=4,847    Message 10: input=12,390

  Total input: 58,462 tokens
  Compaction events: 0  ← NEVER triggered

No session.compaction_start or session.compaction_complete events received, even at 12K+ input tokens per call (far exceeding any 5% threshold).
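For reference, the expected trigger points fall out of simple arithmetic. The helper below is illustrative only (the function name and return shape are not the SDK's API); it just shows that message 10's input size exceeds the background threshold for either candidate context window:

```typescript
// Illustrative helper (not part of the SDK): compute the token counts at
// which background and blocking compaction should trigger for a given
// context window, using the thresholds from this repro (5% / 30%).
function compactionTriggers(
  contextWindowTokens: number,
  backgroundThreshold = 0.05, // 5% background
  blockingThreshold = 0.3, // 30% blocking
): { background: number; blocking: number } {
  return {
    background: Math.floor(contextWindowTokens * backgroundThreshold),
    blocking: Math.floor(contextWindowTokens * blockingThreshold),
  };
}

// With the hardcoded 128K budget, background compaction should start at
// 6,400 input tokens; with Claude's real 200K window, at 10,000. Message
// 10's 12,390 input tokens exceeds both, so zero compaction events means
// the budget was never resolved at all, not that a threshold was miscomputed.
console.log(compactionTriggers(128_000)); // { background: 6400, blocking: 38400 }
console.log(compactionTriggers(200_000)); // { background: 10000, blocking: 60000 }
```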

SDK code references

Where the budget is hardcoded (CLI layer)

| File | Line | What |
| --- | --- | --- |
| test/harness/replayingCapiProxy.ts | 1088 | `max_context_window_tokens: 128000` for ALL models |

SDK type flow (passes through, no provider logic)

| File | Line | What |
| --- | --- | --- |
| python/copilot/generated/rpc.py | 182 | `max_context_window_tokens: float` |
| python/copilot/client.py | 332 | `max_context_window_tokens: int` |
| go/types.go | 823 | `MaxContextWindowTokens int` |
| nodejs/src/types.ts | 1577 | `max_context_window_tokens: number` |

Compaction threshold config

| File | Line | What |
| --- | --- | --- |
| go/types.go | 463 | `BackgroundCompactionThreshold *float64` |
| python/copilot/session.py | 785 | `background_compaction_threshold` field |

Impact on Dracarys

  • Long Anthropic sessions will eventually hit 200K context limit and stall — compaction should prevent this
  • For catalog-routed models, compaction wastes ~58K tokens of usable context (triggers at 80% of 128K instead of 80% of 200K)
  • Our PR Expose cwd/Repo information of a session #413 (infinite sessions support) is blocked on this for Anthropic providers

Suggested fix

  1. BYOK: Accept max_context_window_tokens on ProviderConfig. Use provider-type defaults:
    • anthropic → 200,000
    • openai → 128,000
  2. Catalog: Populate from actual model metadata (branch mackinnonbuck/byok-provider-token-limits partially addresses this)
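A minimal sketch of the BYOK half of the fix. The `ProviderConfig` shape, field names, and `resolveContextWindowTokens` helper below are assumptions based on the suggested fix, not the SDK's real types:

```typescript
// Illustrative sketch only: ProviderConfig shape and the defaults table are
// assumptions, not the SDK's actual API.
interface ProviderConfig {
  type: "anthropic" | "openai" | string;
  // Optional explicit override, as proposed in fix (1).
  maxContextWindowTokens?: number;
}

// Provider-type defaults, consulted only when the config omits a value.
const DEFAULT_CONTEXT_WINDOW_TOKENS: Record<string, number> = {
  anthropic: 200_000,
  openai: 128_000,
};

function resolveContextWindowTokens(config: ProviderConfig): number {
  // An explicit config value always wins.
  if (config.maxContextWindowTokens !== undefined) {
    return config.maxContextWindowTokens;
  }
  const fallback = DEFAULT_CONTEXT_WINDOW_TOKENS[config.type];
  if (fallback === undefined) {
    // Unknown provider type: fail loudly rather than silently skipping
    // compaction, which is the current (broken) behavior.
    throw new Error(`No context window default for provider "${config.type}"`);
  }
  return fallback;
}
```

Under these assumptions, claude-sonnet-4 via a BYOK Anthropic relay would resolve to 200,000 tokens instead of leaving the budget unresolved and compaction silently disabled.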
