feat(infra): LiteLLM unified gateway for multi-provider LLM routing #20
Closed
moralespanitz wants to merge 1 commit into feat/first-mention-tll-on-main from
Conversation
Adds an opt-in LiteLLM proxy sidecar under docker/litellm/ so AtomicMemory can route LLM calls to Anthropic, OpenAI, Microsoft Foundry / Azure, AWS Bedrock, or Google Gemini through a single OpenAI-compatible endpoint. Provider swap is config-only — no new code path in src/services/llm.ts.

Why

- Today llm.ts already supports `LLM_PROVIDER=openai-compatible` with `LLM_API_URL` + `LLM_API_KEY`. Pointing that lane at a LiteLLM proxy reuses the existing seam and keeps cost-telemetry, AUDN-timeout, and retry behavior unchanged.
- A single config.yaml replaces per-provider client wiring across the research harness and any future deployment, so we add a provider by appending one model_list entry instead of touching TypeScript.

What ships

- docker/litellm/litellm-config.yaml — model_list entries for Anthropic (Haiku 4.5, Sonnet 4.6), OpenAI (gpt-5-chat, gpt-4o-mini), Foundry (gpt-5-chat via azure/), Bedrock (Claude Sonnet), and Gemini (1.5-pro). Provider keys are resolved via os.environ/VAR_NAME at request time (a sketch of the entry shape follows below).
- docker/litellm/docker-compose.litellm.yml — pinned compose service on port 4000 with explicit `name: atomicmemory-litellm` so the project never collides with another `litellm/`-named compose stack.
- docker/litellm/README.md — quick start, env-var table per provider, cost-telemetry caveats.
- docker/litellm/.env.example — credential template.

No src/ changes; the existing openai-compatible lane already accepts LLM_API_URL + LLM_API_KEY (config.ts → llm.ts → OpenAICompatibleLLM).

Smoke

- Anthropic Haiku 4.5 via the proxy: 200 OK, 1.95 s, 23 in / 14 out tokens, ~$0.00009. Output coherent.
- Foundry / Bedrock / Gemini / OpenAI: model aliases load cleanly at proxy startup (`Set models:` lists all 7); no live calls without credentials.
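For reviewers new to LiteLLM's config format, a minimal sketch of what one model_list entry looks like. The alias name matches the one quoted later in this PR; the rest follows LiteLLM's documented schema and is not copied from the shipped file:

```yaml
# Illustrative entry in the shape of litellm-config.yaml (not a verbatim excerpt).
# The os.environ/VAR_NAME syntax tells LiteLLM to read the key from the
# environment at request time, so no credential is baked into the file.
model_list:
  - model_name: anthropic/claude-haiku-4-5    # alias callers pass as "model"
    litellm_params:
      model: anthropic/claude-haiku-4-5       # upstream provider/model route
      api_key: os.environ/ANTHROPIC_API_KEY
```

The os.environ/ indirection is what lets .env.example ship as a plain credential template: the YAML never changes per deployment, only the environment does.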
Summary
Adds a self-contained LiteLLM proxy under docker/litellm/ so the AtomicMemory stack can route to Anthropic, OpenAI, Foundry/Azure, Bedrock, or Gemini via a single OpenAI-compatible HTTP endpoint at localhost:4000.

The integration is config-only on the AtomicMemory side — core's existing LLM_PROVIDER=openai-compatible lane already supports OPENAI_BASE_URL overrides. No code changes are required in src/services/llm.ts.

What's added
- docker/litellm/litellm-config.yaml — provider routes for 5 providers (Anthropic, OpenAI, Foundry, Bedrock, Gemini), env-var-driven keys, model aliases like anthropic/claude-haiku-4-5, azure/gpt-5-chat, gemini/gemini-1.5-pro.
- docker/litellm/docker-compose.litellm.yml — runs ghcr.io/berriai/litellm:main-stable mounted with the config; exposes port 4000 (sketched below).
- docker/litellm/.env.example — template for the four credential env vars (ANTHROPIC, OPENAI, GEMINI, AZURE — Bedrock placeholder).
- docker/litellm/README.md — quick-start: env vars, docker compose up, smoke test.
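A sketch of what that compose service plausibly looks like. The image tag, port, and project name are stated in this PR; the mount path, command, and env_file wiring are assumptions based on how the LiteLLM image is normally run:

```yaml
# Hypothetical reconstruction of docker-compose.litellm.yml; only the image,
# port, and project name come from the PR, the rest is assumed.
name: atomicmemory-litellm
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-stable
    command: ["--config", "/app/config.yaml"]   # litellm CLI flag to load the mounted config
    ports:
      - "4000:4000"
    volumes:
      - ./litellm-config.yaml:/app/config.yaml:ro
    env_file:
      - .env   # provider keys the config references via os.environ/VAR_NAME
```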
Behavior changes for callers

None — purely additive infrastructure. The dev server's existing direct-Anthropic path keeps working. To opt into LiteLLM routing:
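A sketch of that opt-in, assuming it is just the env vars named in the first comment pointed at the proxy endpoint:

```bash
# Assumed opt-in wiring; the variable names come from the PR
# (LLM_PROVIDER / LLM_API_URL / LLM_API_KEY), the values are placeholders.
export LLM_PROVIDER=openai-compatible
export LLM_API_URL=http://localhost:4000
export LLM_API_KEY=sk-placeholder   # real value depends on the proxy's auth config
```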
Test plan
- docker compose -f docker/litellm/docker-compose.litellm.yml up -d
- curl http://localhost:4000/health/liveness returns 200
- bash atomicmemory-research/memory-research/litellm-setup/smoke-test.sh
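Beyond the liveness probe, a hedged sketch of a manual end-to-end request. The model alias comes from the config described above; an Authorization header would only be needed if the proxy is configured with a master key, which this PR does not state:

```bash
# Manual smoke call against the proxy's OpenAI-compatible chat endpoint.
curl -s http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "anthropic/claude-haiku-4-5",
        "messages": [{"role": "user", "content": "ping"}]
      }'
```

A 200 with a coherent completion mirrors the Anthropic Haiku smoke result reported in the first comment.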