Gateways overview

Anyray's optimizer is gateway-neutral — it reaches your traffic through a gateway on the request path via an adapter that speaks Optimizer Protocol v1 (/v1/optimize, /v1/optimize-response, /v1/cache). There are two ways the gateway can show up:

- Default (implemented): Anyray's own gateway. It is a Portkey fork, so it is multi-provider (openai, anthropic, bedrock, google-vertex-ai, azure-openai, …), OpenAI-compatible, and listens on :8787. Its optimizer adapter is built in (gateway/src/services/optimizer.ts).
- Roadmap: your existing gateway via an adapter. Already run LiteLLM, Kong, Portkey, Cloudflare, or Envoy? The plan is a thin Anyray adapter for it. We depend on the host's open-source (MIT) core and never fork it for these adapter targets. LiteLLM has a reference adapter in optimizer/adapters/; Kong, Envoy, Cloudflare, and Portkey are stubs.

   org's workers  ──▶  GATEWAY  ──▶  providers
                       (transport)
                           │ Optimizer Protocol v1 (/v1/optimize …)
                           ▼
                       ANYRAY OPTIMIZER
                           ▲
           ① gateway/                   Anyray's own multi-provider proxy  (DEFAULT — implemented)
           ② optimizer/adapters/<host>/ thin adapter                       (LiteLLM reference; rest roadmap)

What a host gateway keeps doing

For an adapter target, the host gateway remains responsible for providers, wire translation, Bedrock signing, streaming, API keys, and built-in caching. The Anyray adapter only maps the host's hook to the optimizer and applies the decision — we never fork the host's core. (Anyray's own gateway owns all of this itself.)

Target gateways & status

| Gateway | Adapter hook | Routing/transform | `canShortCircuit` (serve cache hit from hook) | Status |
| --- | --- | --- | --- | --- |
| Anyray's own gateway | in-repo `gateway/src/services/optimizer.ts` (TS fetch) | multi-provider | ✅ (owns the response path) | implemented, default |
| LiteLLM | `async_pre_call_hook` + `async_post_call_success_hook` | request transform | ➖ via host's built-in cache | reference adapter (`optimizer/adapters/`); packaging roadmap |
| Kong AI Gateway | custom plugin (Lua/Go) | request transform | ✅ (plugin can return a response) | roadmap stub |
| Envoy | `ext_proc` (gRPC) | request transform | ✅ (processor can return a response) | roadmap stub |
| Portkey | hooks / guardrails | request transform | ~ (to verify) | roadmap stub (Anyray's gateway is forked from Portkey) |
| Cloudflare AI Gateway | managed config | ❌ (no rewrite) | partial (Cloudflare cache) | roadmap stub |

:::info Status
The only implemented gateway today is Anyray's own multi-provider gateway. LiteLLM has a runnable reference adapter in optimizer/adapters/; Kong, Envoy, Cloudflare, and Portkey are roadmap stubs.
:::

A note on capability differences

The capability that matters across adapters is canShortCircuit: some hosts can serve a cache hit directly from the hook (Anyray's own gateway, Kong, Envoy), so a cacheHit from /v1/optimize is returned without calling the provider. LiteLLM's pre-call hook cannot return a cached success, so it can only transform the request and delegates cache hits to its built-in cache. See the protocol.
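
As a rough illustration of how an adapter might branch on this capability, here is a minimal sketch. The decision shape (a dict with optional `cacheHit` and `request` keys) and the function name `apply_decision` are assumptions made for this example; the actual response format is whatever the protocol spec defines.

```python
from typing import Any, Dict, Tuple


def apply_decision(
    original_request: Dict[str, Any],
    decision: Dict[str, Any],
    can_short_circuit: bool,
) -> Tuple[str, Dict[str, Any]]:
    """Turn an optimizer decision into an action for the host gateway.

    Returns ("respond", cached_response) when the host may serve the cache
    hit from its hook, otherwise ("forward", request_for_the_provider).
    The "cacheHit" and "request" keys are assumed field names.
    """
    cache_hit = decision.get("cacheHit")
    if cache_hit is not None and can_short_circuit:
        # Hosts that can return a response from the hook skip the provider.
        return "respond", cache_hit
    # Hosts that cannot short-circuit only apply the request transform;
    # serving a cache hit is left to the host's built-in cache.
    return "forward", decision.get("request", original_request)
```

With this shape, the same decision yields a direct response on a host where `can_short_circuit` is true and a forwarded (possibly transformed) request on a host where it is false.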

Adding a new gateway

1. Read the Optimizer Protocol v1 and the spec (optimizer/PROTOCOL.md).
2. Create the adapter under optimizer/adapters/<host>/ (polyglot: Python, Lua/Go, …).
3. Wire the host's pre-call hook → POST {OPTIMIZER}/v1/optimize; apply the request transform, or (if canShortCircuit) serve the cacheHit. Fail-open: on any error, pass through unchanged.
4. Wire the host's post-call hook → POST {OPTIMIZER}/v1/optimize-response (and POST {OPTIMIZER}/v1/cache for write-back). Spend is recorded by the gateway's own content-free spend store; there is no separate meter() call.
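
The hook wiring described above could be sketched roughly as follows, for a host that only applies the request transform. This is not the reference adapter: the `OPTIMIZER_URL` environment variable and its default, the `{"request": ...}` / `{"request": ..., "response": ...}` payload shapes, the `request` key in the decision, and the function names are all assumptions for illustration. Failing open in the post-call hook is also an assumption that mirrors the pre-call rule.

```python
import json
import os
import urllib.request
from typing import Any, Callable, Dict

# Assumption: base URL of the optimizer; variable name and default are hypothetical.
OPTIMIZER = os.environ.get("OPTIMIZER_URL", "http://localhost:9000")

PostFn = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def post_json(path: str, payload: Dict[str, Any], timeout: float = 2.0) -> Dict[str, Any]:
    """POST a JSON payload to the optimizer and return the decoded JSON body."""
    req = urllib.request.Request(
        OPTIMIZER + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def pre_call(request: Dict[str, Any], post: PostFn = post_json) -> Dict[str, Any]:
    """Pre-call hook: ask /v1/optimize for a transformed request."""
    try:
        decision = post("/v1/optimize", {"request": request})
        transformed = decision.get("request")  # assumed field name
        return transformed if isinstance(transformed, dict) else request
    except Exception:
        # Fail-open: on any error, pass the request through unchanged.
        return request


def post_call(
    request: Dict[str, Any],
    response: Dict[str, Any],
    post: PostFn = post_json,
) -> None:
    """Post-call hook: report the response, then write it back to the cache."""
    for path in ("/v1/optimize-response", "/v1/cache"):
        try:
            post(path, {"request": request, "response": response})
        except Exception:
            # Assumed policy: optimizer errors never reach the caller.
            pass
```

Passing the transport in as `post` keeps the hooks testable without a running optimizer; a real adapter would call these from the host's own hook entry points.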
