Using the SDKs
You use your LLM SDK unchanged. You point it at the Anyray gateway (:8787) by setting one environment variable. The gateway is multi-provider, so the same gateway serves every SDK.
The only thing that changes: the base URL
| SDK / runtime | Env var you set |
|---|---|
| OpenAI SDK | OPENAI_BASE_URL = the Anyray gateway (http://<anyray-host>:8787) |
| Anthropic SDK | ANTHROPIC_BASE_URL = the Anyray gateway |
| Bedrock / Vertex / Azure OpenAI | point at the gateway — it speaks these providers natively |
| Claude Code, agents, jobs | inherit the same env from their environment |
You set this explicitly: in the pod/Deployment spec, your config management, a shell
profile, or a CI secret. There is no org CA to trust, no TLS-MITM, and no HTTPS_PROXY.
Your SDK is just pointed at a different URL. (Zero-touch admission-webhook injection is
on the roadmap.)
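Because the base URL is set explicitly rather than injected, a job may want to confirm at startup that it is configured to go through the gateway. A minimal sketch, assuming only the variable names from the table above; `check_gateway_env` is a hypothetical helper, not part of any SDK or of Anyray:

```python
import os
from urllib.parse import urlparse

# Env var names come from the table above; the helper itself is hypothetical.
BASE_URL_VARS = ("OPENAI_BASE_URL", "ANTHROPIC_BASE_URL")


def check_gateway_env(env=None):
    """Return {var: base_url} for each base-URL variable that is set.

    Raises ValueError if a variable is set but is not an http(s) URL with a host.
    Variables that are unset or empty are omitted from the result.
    """
    env = os.environ if env is None else env
    found = {}
    for name in BASE_URL_VARS:
        value = env.get(name)
        if not value:
            continue
        parsed = urlparse(value)
        if parsed.scheme not in ("http", "https") or not parsed.hostname:
            raise ValueError(f"{name}={value!r} is not an http(s) URL")
        found[name] = value
    return found
```

Calling `check_gateway_env()` with no arguments reads the real process environment. An empty result means the SDKs would fall back to their default provider endpoints rather than the gateway.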
:::tip On a laptop? Let anyray-connect do it
For local coding tools (Claude Code, Cursor, Windsurf) and your shell/SDK env, the
anyray-connect CLI writes these base URLs and a placeholder key for you in one command —
npx anyray-connect --gateway <gateway>. It's idempotent and reversible. See
Connect your tools. The manual env vars below remain the fallback (and the
right path for pods/CI).
:::
What your code looks like
Nothing special — the SDK reads the base URL from the environment:
```python
# OpenAI SDK, unchanged. Point OPENAI_BASE_URL at the Anyray gateway.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_BASE_URL
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this changelog…"}],
)
```

```python
# Anthropic SDK, unchanged. Point ANTHROPIC_BASE_URL at the Anyray gateway.
from anthropic import Anthropic

client = Anthropic()  # picks up ANTHROPIC_BASE_URL
msg = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain this stack trace…"}],
)
```
Optional: provider selection and attribution headers
The gateway accepts two optional headers:

- `x-anyray-provider`: which provider to route to (e.g. `openai`, `anthropic`, `bedrock`, `google-vertex-ai`, `azure-openai`) when it isn't implied by the SDK or base URL.
- `x-anyray-metadata`: content-free attribution, sent as a JSON object with keys such as `user`, `team`, and `session`. The spend store uses it to break spend down by user and team. It carries no prompt content.
```bash
curl "http://<anyray-host>:8787/v1/chat/completions" \
  -H "Authorization: Bearer $PROVIDER_KEY" \
  -H "Content-Type: application/json" \
  -H "x-anyray-provider: openai" \
  -H 'x-anyray-metadata: {"user":"alice","team":"platform"}' \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"hi"}]}'
```
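The same two headers can be built from Python. A sketch, assuming only the header names and the JSON shape shown in the curl example; `anyray_headers` is a hypothetical helper, not an Anyray API:

```python
import json


def anyray_headers(provider=None, **attribution):
    """Build the optional Anyray headers as a plain dict.

    `provider` fills x-anyray-provider. Keyword arguments (for example
    user, team, session) are JSON-encoded into x-anyray-metadata.
    Attribution values that are None are dropped.
    """
    headers = {}
    if provider:
        headers["x-anyray-provider"] = provider
    metadata = {k: v for k, v in attribution.items() if v is not None}
    if metadata:
        # Compact separators keep the header value short and on one line.
        headers["x-anyray-metadata"] = json.dumps(metadata, separators=(",", ":"))
    return headers


headers = anyray_headers("openai", user="alice", team="platform")
```

With the OpenAI and Anthropic Python SDKs, a dict like this can be passed as `default_headers=` when constructing the client, or as `extra_headers=` on a single call.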
Streaming, tools, and response shape
The Anyray gateway owns provider transport, so streamed responses still arrive incrementally, tool/function calls work, and the response shape matches what your SDK expects. The optimizer hook only transforms the request (params, prompt, tools) or serves a cache hit.
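To make the streaming case concrete: OpenAI-style chat completion streams arrive as server-sent events, one `data: {...}` line per chunk, ending with `data: [DONE]`. The sketch below reassembles the text from such lines. It assumes the gateway relays those events unchanged, and the sample lines are illustrative, not captured gateway output. In practice the SDK does this parsing for you when you pass `stream=True`.

```python
import json


def collect_stream_text(sse_lines):
    """Join the content deltas from OpenAI-style chat completion SSE lines."""
    parts = []
    for raw in sse_lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts)


# Illustrative sample, not real gateway output.
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"lo"}}]}',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))
```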
Running locally
Bring up the whole stack from the repo root, then point your SDK at the gateway:
```bash
cp .env.example .env && docker compose up -d   # gateway on :8787
export OPENAI_BASE_URL=http://localhost:8787
```
See Gateways → Anyray's own gateway. (An alternative LiteLLM-based stack is documented in Gateways → LiteLLM.)
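Before pointing an SDK at the local stack, it can help to confirm that something is listening on the gateway port. A minimal sketch using only the standard library; `gateway_is_listening` is a hypothetical helper, and a successful TCP connect only shows the port is open, not that the gateway is healthy:

```python
import socket


def gateway_is_listening(host="localhost", port=8787, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```

Run `gateway_is_listening()` after `docker compose up -d`; a `False` result usually means the containers are still starting or the port mapping differs from `:8787`.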
Opting a request out
If your org wants certain workloads (e.g. evals) to always go to a frontier model, that's a configuration choice the admin makes; see Org Admin → Configure. As a developer you don't need to special-case anything: the fail-safe already protects reasoning-heavy work.