Using the SDKs

You use your LLM SDK unchanged: set one environment variable to point it at the Anyray gateway (port 8787). The gateway is multi-provider, so a single gateway serves every SDK.

The only thing that changes: the base URL

| SDK / runtime | Env var you set |
| --- | --- |
| OpenAI SDK | `OPENAI_BASE_URL` = the Anyray gateway (`http://<anyray-host>:8787`) |
| Anthropic SDK | `ANTHROPIC_BASE_URL` = the Anyray gateway |
| Bedrock / Vertex / Azure OpenAI | point at the gateway; it speaks these providers natively |
| Claude Code, agents, jobs | inherit the same env from their environment |

You set this explicitly: in the pod/Deployment spec, your config management, a shell profile, or a CI secret. There is no org CA to trust, no TLS-MITM, and no HTTPS_PROXY. Your SDK is just pointed at a different URL. (Zero-touch admission-webhook injection is on the roadmap.)
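Because the base URL is set outside your code, a missing or mistyped variable means the SDK quietly falls back to the provider's default endpoint. One way to catch that at startup is a small preflight check. This is a minimal sketch that assumes only the two variable names from the table above; the function name `check_gateway_env` and its arguments are hypothetical and not part of Anyray:

```python
import os
from urllib.parse import urlsplit

# Variable names come from the table above; everything else is illustrative.
BASE_URL_VARS = ("OPENAI_BASE_URL", "ANTHROPIC_BASE_URL")


def check_gateway_env(expected_host, expected_port=8787, env=None,
                      variables=BASE_URL_VARS):
    """Return a list of problems; an empty list means each listed variable
    is set and points at the expected gateway host and port."""
    env = os.environ if env is None else env
    problems = []
    for var in variables:
        value = env.get(var)
        if not value:
            problems.append(f"{var} is not set; the SDK will use the provider default")
            continue
        try:
            parts = urlsplit(value)
            host, port = parts.hostname, parts.port
        except ValueError:
            # Raised for malformed URLs, e.g. a non-numeric port.
            host, port = None, None
        if host != expected_host or port != expected_port:
            problems.append(f"{var}={value} does not point at {expected_host}:{expected_port}")
    return problems
```

A service that only uses one SDK could pass `variables=("OPENAI_BASE_URL",)` and exit early if the returned list is non-empty.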

:::tip On a laptop? Let anyray-connect do it
For local coding tools (Claude Code, Cursor, Windsurf) and your shell/SDK env, the anyray-connect CLI writes these base URLs and a placeholder key for you in one command: `npx anyray-connect --gateway <gateway>`. It's idempotent and reversible. See Connect your tools. The manual env vars below remain the fallback (and the right path for pods/CI).
:::

What your code looks like

Nothing special — the SDK reads the base URL from the environment:

# OpenAI SDK — unchanged. Point OPENAI_BASE_URL at the Anyray gateway.
from openai import OpenAI
client = OpenAI()                       # picks up OPENAI_BASE_URL
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this changelog…"}],
)

# Anthropic SDK — unchanged. Point ANTHROPIC_BASE_URL at the Anyray gateway.
from anthropic import Anthropic
client = Anthropic()                    # picks up ANTHROPIC_BASE_URL
msg = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain this stack trace…"}],
)

Optional: provider selection and attribution headers

The gateway accepts a couple of optional headers:

`x-anyray-provider`: which provider to route to (e.g. openai, anthropic, bedrock, google-vertex-ai, azure-openai) when it isn't implied by the SDK or base URL.
`x-anyray-metadata`: content-free attribution as a JSON object with keys such as user, team, and session, which the spend store uses to break spend down by user and team. It carries no prompt content.

curl http://<anyray-host>:8787/v1/chat/completions \
  -H "Authorization: Bearer $PROVIDER_KEY" \
  -H "Content-Type: application/json" \
  -H "x-anyray-provider: openai" \
  -H 'x-anyray-metadata: {"user":"alice","team":"platform"}' \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"hi"}]}'
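The same two headers can be built in Python rather than by hand. The sketch below is illustrative only: the helper name `anyray_headers` is hypothetical, and it assumes the gateway expects the metadata header as a compact JSON object, as in the curl example above.

```python
import json


def anyray_headers(provider=None, **attribution):
    """Build the optional gateway headers as a plain dict.

    `attribution` should hold only content-free identifiers such as
    user, team, or session; never prompt text.
    """
    headers = {}
    if provider:
        headers["x-anyray-provider"] = provider
    # Drop empty values so the metadata object stays minimal.
    meta = {key: value for key, value in attribution.items() if value}
    if meta:
        # Compact separators keep the header value short and on one line.
        headers["x-anyray-metadata"] = json.dumps(meta, separators=(",", ":"))
    return headers


print(anyray_headers("openai", user="alice", team="platform"))
```

Both the OpenAI and Anthropic Python SDKs accept a `default_headers` argument on the client constructor and `extra_headers` on individual calls, so the returned dict can be attached once per client or per request.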

Streaming, tools, and response shape

The Anyray gateway owns provider transport, so streamed responses still stream, tool/function calls work, and the response shape matches what your SDK expects. The optimizer hook only transforms the request (params, prompt, tools) or serves a cache hit.
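Since the response shape is unchanged, code that consumes a streamed chat completion should need no gateway-specific handling. A minimal sketch for the OpenAI Python SDK's chunk shape (`chunk.choices[0].delta.content`) follows; the helper name `collect_stream_text` is hypothetical, and `demo` needs the `openai` package plus `OPENAI_BASE_URL` pointing at a running gateway:

```python
def collect_stream_text(chunks):
    """Join the text deltas from an iterable of chat-completion chunks."""
    parts = []
    for chunk in chunks:
        # Some chunks (for example a trailing usage chunk) carry no choices.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # role-only or tool-call deltas have content=None
            parts.append(delta)
    return "".join(parts)


def demo():
    # Imported lazily so the helper above works without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # picks up OPENAI_BASE_URL
    stream = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "hi"}],
        stream=True,
    )
    print(collect_stream_text(stream))
```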

Running locally

Bring up the whole stack from the repo root, then point your SDK at the gateway:

cp .env.example .env && docker compose up -d   # gateway on :8787
export OPENAI_BASE_URL=http://localhost:8787

See Gateways → Anyray's own gateway. (An alternative LiteLLM-based stack is documented in Gateways → LiteLLM.)
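Before sending real requests to a local stack, it can help to confirm that something is listening on the gateway port. The sketch below only opens a TCP connection; it does not call any gateway endpoint, and the function name `gateway_reachable` is hypothetical:

```python
import socket
from urllib.parse import urlsplit


def gateway_reachable(base_url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds.

    This only shows that something is listening; it does not prove the
    listener is the gateway or that it is healthy.
    """
    parts = urlsplit(base_url)
    if not parts.hostname:
        # For example a value missing its scheme, like "localhost:8787".
        return False
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the setup above, `gateway_reachable("http://localhost:8787")` would be the natural call once `docker compose up -d` has finished.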

Opting a request out

If your org wants certain workloads (e.g. evals) to always go to a frontier model, that is a configuration choice the admin makes; see Org Admin → Configure. As a developer you don't need to special-case anything: the fail-safe already protects reasoning-heavy work.
