Skip to main content

Developers overview

If you're an employee who uses AI at work — running Claude Code or another coding assistant, building agents, or writing apps and jobs that call LLMs — this section answers one question: what changes for me when my org runs Anyray? Short version: your tools keep working, nobody's watching what you do, and you help your org afford more AI, not less.

The short answer: almost nothing. You don't change code and you don't change SDKs — only the base URL changes (it points at your org's Anyray gateway). And you keep getting answers good enough for the task, because the optimizer fails open whenever it's down or slow.

:::tip Anyray is on your side — not surveillance, not a throttle

  • It never shows your prompts or responses. Content is encrypted at rest by default and never displayed in any UI; only content-free metadata (who/team, model, provider, tokens, cost, latency) is used for spend attribution. It's all self-hosted — nothing leaves your org. Nobody is reading your conversations or scoring you individually.
  • It won't slow you down. The optimizer hook runs in a tight time budget (800 ms) and fails open — if anything's off, it gets out of the way and your request is forwarded unchanged.
  • It won't give you a worse answer. The gateway owns model routing; when it's unsure (or the optimizer is unreachable) your request goes to the requested/frontier model, unchanged. (Automated cheap/frontier routing is roadmap.)
  • The upside is yours too. When AI is affordable, orgs expand access instead of restricting it. Anyray is what keeps your tools sustainable. :::

What you keep doing

  • Call the OpenAI / Anthropic SDK (or run Claude Code, agents, batch jobs) exactly as before.
  • Use your normal request shapes, models, tools, and streaming.

What changes behind the scenes

  • Your SDK's base URL is pointed at your org's Anyray gateway (:8787) — typically by setting OPENAI_BASE_URL / ANTHROPIC_BASE_URL, or in one step with the anyray-connect CLI for local tools. The gateway is multi-provider (OpenAI, Anthropic, Bedrock, Vertex, Azure OpenAI), so the same gateway handles whichever SDK you use.
  • On the way through, the gateway calls the optimizer's pre-call hook (/v1/optimize); your request may have wasteful parameters trimmed, its prompt compressed, unused tools pruned, or be served from semantic cache (default off). The gateway then calls the provider and runs the post-call hook.
  • Model routing is the gateway's job — it picks the provider/model and handles fallbacks. (Automated classify-then-downgrade routing is roadmap.) If the optimizer is unreachable or slow, the hook fails open and your request is forwarded unchanged.
your code ──▶ Anyray gateway (:8787) ──▶ provider
(base URL) optimizer hook (/v1/optimize),
fail-open when it's down

The guarantee you care about

The worst case for you is "this request was forwarded unchanged." If the optimizer is down or unsure, you simply get the model you asked for — never a degraded result you didn't opt into.

  • Connect your tools — point Claude Code, Cursor, Windsurf, and your shell/SDK env at the gateway in one command with anyray-connect.
  • Using the SDKs — what the zero-touch redirect looks like in practice.
  • How requests are optimized — the decision your request goes through.
  • FAQ — quick answers (streaming, tools, latency, opting out).