Gateways overview

Anyray's optimizer is gateway-neutral — it reaches your traffic through a gateway on the request path via an adapter that speaks Optimizer Protocol v1 (/v1/optimize, /v1/optimize-response, /v1/cache). There are two ways the gateway can show up:

  1. Default (implemented): Anyray's own gateway. It is a Portkey fork, so it is multi-provider (openai, anthropic, bedrock, google-vertex-ai, azure-openai, …), OpenAI-compatible, and listens on :8787. Its optimizer adapter is built in (gateway/src/services/optimizer.ts).
  2. Roadmap: your existing gateway via an adapter. Already run LiteLLM, Kong, Portkey, Cloudflare, or Envoy? The plan is a thin Anyray adapter for it. For these adapter targets we depend on the host's open-source (MIT) core and never fork it. LiteLLM has a reference adapter in optimizer/adapters/; Kong, Envoy, Cloudflare, and Portkey are stubs.

org's workers ──▶ GATEWAY ──▶ providers
                (transport)
                     │  Optimizer Protocol v1 (/v1/optimize …)
                     ▼
             ANYRAY OPTIMIZER

① gateway/                     Anyray's own multi-provider proxy (DEFAULT, implemented)
② optimizer/adapters/<host>/   thin adapter (LiteLLM reference; rest roadmap)

What a host gateway keeps doing

For an adapter target, the host gateway remains responsible for providers, wire translation, Bedrock signing, streaming, API keys, and built-in caching. The Anyray adapter only maps the host's hook to the optimizer and applies the decision — we never fork the host's core. (Anyray's own gateway owns all of this itself.)

Target gateways & status

| Gateway | Adapter hook | Routing/transform | canShortCircuit (serve cache hit from hook) | Status |
| --- | --- | --- | --- | --- |
| Anyray's own gateway | in-repo gateway/src/services/optimizer.ts (TS fetch) | multi-provider | ✅ (owns the response path) | implemented (default) |
| LiteLLM | async_pre_call_hook + async_post_call_success_hook | request transform | ➖ via host's built-in cache | reference adapter (optimizer/adapters/); packaging roadmap |
| Kong AI Gateway | custom plugin (Lua/Go) | request transform | ✅ (plugin can return a response) | roadmap stub |
| Envoy | ext_proc (gRPC) | request transform | ✅ (processor can return a response) | roadmap stub |
| Portkey | hooks / guardrails | request transform | ~ (to verify) | roadmap stub (Anyray's gateway is forked from Portkey) |
| Cloudflare AI Gateway | managed config | ❌ (no rewrite) | partial (Cloudflare cache) | roadmap stub |

:::info Status
The only implemented gateway today is Anyray's own multi-provider gateway. LiteLLM has a runnable reference adapter in optimizer/adapters/; Kong, Envoy, Cloudflare, and Portkey are roadmap stubs.
:::

A note on capability differences

The capability that matters across adapters is canShortCircuit: some hosts can serve a cache hit directly from the hook (Anyray's own gateway, Kong, Envoy), so a cacheHit from /v1/optimize is returned without calling the provider. LiteLLM's pre-call hook cannot return a cached success, so it can only transform the request and delegates cache hits to its built-in cache. See the protocol.
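
The branch described above can be sketched as a small pure function. This is an illustration only: the source names `cacheHit` and `canShortCircuit`, but the `OptimizeDecision` shape, the `request` field, and the `applyDecision` and `Action` names are assumptions made for this example, not the protocol's actual schema (see optimizer/PROTOCOL.md for that).

```typescript
type Json = Record<string, unknown>;

// Hypothetical decision shape. Only `cacheHit` is named in this page;
// the `request` field for the transformed request is an assumption.
interface OptimizeDecision {
  cacheHit?: Json; // cached response, if the optimizer found one
  request?: Json;  // rewritten request, if the optimizer transformed it
}

// What the adapter should do next with the host's hook.
type Action =
  | { kind: "serve"; response: Json }   // return the cache hit, skip the provider
  | { kind: "forward"; request: Json }; // call the provider as usual

function applyDecision(
  original: Json,
  decision: OptimizeDecision,
  canShortCircuit: boolean,
): Action {
  // Hosts that can answer from the hook serve the cache hit directly.
  if (canShortCircuit && decision.cacheHit !== undefined) {
    return { kind: "serve", response: decision.cacheHit };
  }
  // Otherwise only the request transform applies; any cache hit is left
  // to the host's built-in cache.
  return { kind: "forward", request: decision.request ?? original };
}
```

With `canShortCircuit` set to `false` (the LiteLLM case in the table), the same decision falls through to the forward branch, so the adapter never tries to return a response from a hook that cannot carry one.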

Adding a new gateway

  1. Read the Optimizer Protocol v1 and the spec (optimizer/PROTOCOL.md).
  2. Create the adapter under optimizer/adapters/<host>/ (polyglot — Python, Lua/Go, …).
  3. Wire the host's pre-call hook → POST {OPTIMIZER}/v1/optimize; apply the request transform, or (if canShortCircuit) serve the cacheHit. Fail-open: on any error, pass through unchanged.
  4. Wire the host's post-call hook → POST {OPTIMIZER}/v1/optimize-response (and POST {OPTIMIZER}/v1/cache for write-back). Spend is recorded by the gateway's own content-free spend store — there is no separate meter() call.
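
Steps 3 and 4 can be sketched as a thin, fail-open wrapper around the three endpoints. Treat this as a minimal sketch under stated assumptions rather than the reference adapter: `makeAdapter`, `safePost`, `fetchPost`, the `{ request, response }` payload shapes, and the 2-second timeout are all hypothetical choices for illustration. Only the endpoint paths and the fail-open rule come from this page.

```typescript
type Json = Record<string, unknown>;
type PostJson = (url: string, body: Json) => Promise<Json>;

// One possible transport using the global fetch (Node 18+ or a browser).
// The timeout value is an arbitrary assumption.
const fetchPost: PostJson = async (url, body) => {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), 2000);
  try {
    const res = await fetch(url, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(body),
      signal: controller.signal,
    });
    if (!res.ok) throw new Error(`optimizer returned ${res.status}`);
    return (await res.json()) as Json;
  } finally {
    clearTimeout(timer);
  }
};

// Fail-open: any error (network, timeout, non-2xx, bad JSON) becomes null.
async function safePost(post: PostJson, url: string, body: Json): Promise<Json | null> {
  try {
    return await post(url, body);
  } catch {
    return null;
  }
}

function makeAdapter(optimizerUrl: string, post: PostJson = fetchPost) {
  return {
    // Pre-call hook: returns the optimizer's decision, or null to signal
    // "pass the request through unchanged".
    preCall(request: Json): Promise<Json | null> {
      return safePost(post, `${optimizerUrl}/v1/optimize`, { request });
    },
    // Post-call hook: report the response, then write back to the cache.
    // Failures are swallowed so the host's response is never blocked.
    async postCall(request: Json, response: Json): Promise<void> {
      await safePost(post, `${optimizerUrl}/v1/optimize-response`, { request, response });
      await safePost(post, `${optimizerUrl}/v1/cache`, { request, response });
    },
  };
}
```

Injecting the transport as a `PostJson` function keeps the hooks testable without a running optimizer, and it keeps all error handling in one place so every endpoint gets the same pass-through behavior.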