Gateways overview
Anyray's optimizer is gateway-neutral — it reaches your traffic through a
gateway on the request path via an adapter that speaks
Optimizer Protocol v1 (/v1/optimize,
/v1/optimize-response, /v1/cache). There are two ways the gateway can show up:
- Default (implemented) — Anyray's own gateway.
Anyray's own gateway is the running default. It's a Portkey
fork, so it's multi-provider (openai, anthropic, bedrock, google-vertex-ai,
azure-openai, …), OpenAI-compatible, on :8787. Its optimizer adapter is built in
(
gateway/src/services/optimizer.ts). - Roadmap — your existing gateway via an adapter. Already run LiteLLM, Kong, Portkey,
Cloudflare, or Envoy? The plan is a thin Anyray adapter for it. We depend on the
host's open-source (MIT) core — we never fork it for these adapter targets. LiteLLM
has a reference adapter in
optimizer/adapters/; Kong, Envoy, Cloudflare, and Portkey are stubs.
org's workers ──▶ GATEWAY ──▶ providers
(transport)
│ Optimizer Protocol v1 (/v1/optimize …)
▼
ANYRAY OPTIMIZER
▲
① gateway/ Anyray's own multi-provider proxy (DEFAULT — implemented)
② optimizer/adapters/<host>/ thin adapter (LiteLLM reference; rest roadmap)
What a host gateway keeps doing
For an adapter target, the host gateway remains responsible for providers, wire translation, Bedrock signing, streaming, API keys, and built-in caching. The Anyray adapter only maps the host's hook to the optimizer and applies the decision — we never fork the host's core. (Anyray's own gateway owns all of this itself.)
Target gateways & status
| Gateway | Adapter hook | Routing/transform | canShortCircuit (serve cache hit from hook) | Status |
|---|---|---|---|---|
| Anyray's own gateway | in-repo gateway/src/services/optimizer.ts (TS fetch) | multi-provider | ✅ (owns the response path) | implemented — default |
| LiteLLM | async_pre_call_hook + async_post_call_success_hook | request transform | ➖ via host's built-in cache | reference adapter (optimizer/adapters/); packaging roadmap |
| Kong AI Gateway | custom plugin (Lua/Go) | request transform | ✅ (plugin can return a response) | roadmap stub |
| Envoy | ext_proc (gRPC) | request transform | ✅ (processor can return a response) | roadmap stub |
| Portkey | hooks / guardrails | request transform | ~ (to verify) | roadmap stub (Anyray's gateway is forked from Portkey) |
| Cloudflare AI Gateway | managed config | ❌ (no rewrite) | partial (Cloudflare cache) | roadmap stub |
:::info Status
The only implemented gateway today is Anyray's own multi-provider gateway. LiteLLM
has a runnable reference adapter in
optimizer/adapters/;
Kong, Envoy, Cloudflare, and Portkey are roadmap stubs.
:::
A note on capability differences
The capability that matters across adapters is canShortCircuit: some hosts can serve a
cache hit directly from the hook (Anyray's own gateway, Kong, Envoy), so a cacheHit from
/v1/optimize is returned without calling the provider. LiteLLM's pre-call hook cannot
return a cached success, so it can only transform the request and delegates cache hits to
its built-in cache. See the protocol.
Adding a new gateway
- Read the Optimizer Protocol v1 and the spec
(
optimizer/PROTOCOL.md). - Create the adapter under
optimizer/adapters/<host>/(polyglot — Python, Lua/Go, …). - Wire the host's pre-call hook →
POST {OPTIMIZER}/v1/optimize; apply the request transform, or (ifcanShortCircuit) serve thecacheHit. Fail-open: on any error, pass through unchanged. - Wire the host's post-call hook →
POST {OPTIMIZER}/v1/optimize-response(andPOST {OPTIMIZER}/v1/cachefor write-back). Spend is recorded by the gateway's own content-free spend store — there is no separatemeter()call.