Monitoring & the Anyray console
The Anyray console is the admin's window into the system. It's a first-party React/Vite
app (observability/ui/) served on :3000 behind a single admin-key gate
(ANYRAY_ADMIN_TOKEN) — one key, no per-page login. It's where you see, on your own
traffic, what you're spending, who's spending it, and how requests flow — driven by
evidence you can look at, not a promise.
:::info Status
The console and the content-free spend/trace stores behind it are implemented. The Shadow-Mode would-be-savings views (projected savings, would-be decision mix, holdback comparison) are roadmap and marked below.
:::
What it shows
The console pages, all behind the single admin key:
- Dashboard (home) — at-a-glance KPI stat cards for Traces, Cost, Tokens, and Savings, each with a 30-day sparkline, plus two roll-up panels (see below). Every number is computed over the content-free stores — metadata only.
- Traces — a metadata-only list of requests. For rare drill-down, a trace deep-links into the internal Langfuse trace-detail view. Content is shown only when the effective content mode is plaintext.
- Sessions — requests grouped into sessions for context.
- Users — per-user spend / attribution plus each user's monthly token-cap usage (see Operate → spend governance).
- Optimizer — the optimization strategy settings (toggle, reorder, parametrize) and per-user / per-team / per-endpoint / per-model targeting rules, runtime-mutable and audit-logged.
- Providers — manage server-held provider API keys at runtime; audit-logged by provider slug, never key values.
- Routing — the gateway's default routing strategy (single/loadbalance/fallback/conditional) with per-target retry/fallback.
- Connect — on-ramp instructions for the anyray-connect CLI, which points local coding tools at the gateway.
- Pricing — the admin-editable model pricing table (USD per token) that drives cost attribution in the spend store.
- Privacy — the org-wide content mode (encrypted/off/plaintext), togglable at runtime and audit-logged.
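To make the Pricing page's role concrete, here is a minimal sketch of how a per-token price table can turn token counts into a request cost. It is an illustration, not Anyray's implementation: the model names, the prices, the split into input and output prices, and the `request_cost` helper are all assumptions made for this example.

```python
from decimal import Decimal

# Hypothetical pricing table in USD per token, keyed by model name.
# Model names, prices, and the input/output split are illustrative assumptions.
PRICING = {
    "example-small": {"input": Decimal("0.0000005"), "output": Decimal("0.0000015")},
    "example-large": {"input": Decimal("0.000003"), "output": Decimal("0.000015")},
}


def request_cost(model, input_tokens, output_tokens, pricing=PRICING):
    """Return the USD cost of one request, or None if the model has no price."""
    price = pricing.get(model)
    if price is None:
        # An unpriced model is reported as unknown rather than silently costed at zero.
        return None
    return price["input"] * input_tokens + price["output"] * output_tokens


if __name__ == "__main__":
    print(request_cost("example-small", 1000, 200))
    print(request_cost("not-in-table", 1000, 200))
```

Decimal is used instead of float so that many tiny per-token prices sum without binary rounding drift. Returning None for an unpriced model keeps a missing table entry visible.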
Dashboard
The home page is the at-a-glance roll-up over the content-free stores:
- KPI cards — Traces, Cost, Tokens, and Savings, each with a 30-day sparkline so you see the trend, not just the current number.
- Savings panel — tokens-saved-per-day bars and top contributors by optimizer strategy, so you can see which lever is earning its keep (see Billing → optimization savings).
- Performance panel — P50 / P95 latency trend and the slowest endpoints.
All aggregations are metadata only — never prompt or response content.
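As a rough illustration of what the Performance panel summarizes, the sketch below computes P50 and P95 from per-request latencies and ranks endpoints by P95. It assumes the nearest-rank percentile definition and hypothetical record keys (`endpoint`, `latency_ms`); the console's actual aggregation method is not specified here.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("percentile() needs at least one sample")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_summary(latencies_ms):
    """Return the P50 and P95 of a list of latencies in milliseconds."""
    return {"p50": percentile(latencies_ms, 50), "p95": percentile(latencies_ms, 95)}


def slowest_endpoints(records, top_n=3):
    """Rank endpoints by P95 latency, slowest first.

    `records` is a list of dicts with hypothetical keys "endpoint" and "latency_ms".
    """
    by_endpoint = defaultdict(list)
    for record in records:
        by_endpoint[record["endpoint"]].append(record["latency_ms"])
    ranked = sorted(
        ((endpoint, percentile(samples, 95)) for endpoint, samples in by_endpoint.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:top_n]
```

Only latency and endpoint metadata are needed for this roll-up, which is consistent with the metadata-only aggregation described above.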
Spend & users
- Spend over time, broken down by team and user, and by model and provider, sourced from the content-free spend store (GET /admin/spend): who/team, model, provider, tokens, cost, latency, status — never content. The Users page adds each user's monthly token-cap usage.
- Track it as cost per correct answer, not just raw dollars.
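The two ideas above, spend broken down by a dimension and cost per correct answer, can be sketched over rows shaped like the spend store's fields. The key names (`request_id`, `team`, `model`, `cost_usd`) and the response shape of GET /admin/spend are assumptions for illustration. Which answers count as correct comes from your own evaluation, not from the spend store.

```python
from collections import defaultdict


def spend_by(rows, key):
    """Sum cost per value of `key` (for example "team", "user", "model", "provider")."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["cost_usd"]
    return dict(totals)


def cost_per_correct_answer(rows, correct_ids):
    """Total spend divided by the number of requests judged correct.

    `correct_ids` is a set of request ids your own evaluation marked correct.
    Returns None when nothing was correct, since the ratio is undefined.
    """
    total = sum(row["cost_usd"] for row in rows)
    correct = sum(1 for row in rows if row["request_id"] in correct_ids)
    return total / correct if correct else None


# Hypothetical rows; real rows would come from the spend store.
rows = [
    {"request_id": "r1", "team": "search", "model": "example-small", "cost_usd": 0.5},
    {"request_id": "r2", "team": "search", "model": "example-large", "cost_usd": 0.25},
    {"request_id": "r3", "team": "support", "model": "example-small", "cost_usd": 0.25},
]

if __name__ == "__main__":
    print(spend_by(rows, "team"))
    print(cost_per_correct_answer(rows, {"r1", "r3"}))
```

Note that the numerator includes the cost of wrong answers too. A cheaper model that answers correctly less often can therefore come out more expensive on this measure than on raw dollars.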
Would-be decisions (Shadow Mode — roadmap)
- The would-be decision mix (what share would downgrade / hit cache / pass through) and per-strategy projected contribution, so you can see which lever earns what before enabling it.
:::warning Status: roadmap
Shadow Mode and its would-be-savings / would-be-diff views — projected savings vs. realized savings, the would-be decision mix, and holdback comparison — are planned, not yet built. Today the console shows realized spend and traces on live traffic.
:::
Privacy by design
The console is built so monitoring doesn't become surveillance — consistent with how Anyray treats your data:
- Runs entirely on-prem. The console and all its data live in your environment; Anyray is fully self-hosted, and nothing leaves that environment.
- Metadata-first. Spend and traces need no raw content — that's the default view. Traces are metadata-only by default.
- Content encrypted at rest. When content is stored (mode encrypted, the default), it's AES-256-GCM encrypted via ANYRAY_CONTENT_KEY; humans see ciphertext. Decrypting is an offline, authorized-audit action — never exposed in a UI.
- Aggregated by team, not per-employee. It's a cost tool, not a way to watch individuals.
See Security and The data boundary.
How it's built (data model)
The console is a read-only view over stores the gateway already populates:
- the content-free spend store — who/team, model, provider, tokens, cost, latency, status per request (powers Spend); exposed at GET /admin/spend.
- the trace backend — a self-hosted Langfuse (trace store only); the gateway exports per-request traces to it internally (ANYRAY_OBSERVABILITY_BASEURL / ANYRAY_OBSERVABILITY_PUBLIC_KEY / ANYRAY_OBSERVABILITY_SECRET_KEY). Traces are metadata-only by default; the Anyray console is the real UI, with deep-links into Langfuse trace detail for rare drill-down.
Everything is served on-prem behind the single admin key. Read-only; metadata-only by default.
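Since trace export depends on the three ANYRAY_OBSERVABILITY_* variables named above, a small pre-flight check can catch a missing one before you go looking for absent traces. This is a hypothetical helper, not part of Anyray. It assumes all three variables are needed for export, and the URL in the usage lines is a placeholder.

```python
import os

# Variable names are taken from the text above; treating all three as required is an assumption.
REQUIRED_OBSERVABILITY_VARS = (
    "ANYRAY_OBSERVABILITY_BASEURL",
    "ANYRAY_OBSERVABILITY_PUBLIC_KEY",
    "ANYRAY_OBSERVABILITY_SECRET_KEY",
)


def missing_observability_vars(env=None):
    """Return the names of observability variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_OBSERVABILITY_VARS if not env.get(name)]


if __name__ == "__main__":
    # Placeholder values for illustration; only names are reported, never values.
    sample_env = {"ANYRAY_OBSERVABILITY_BASEURL": "http://langfuse.example.internal"}
    print(missing_observability_vars(sample_env))
```

The check reports variable names only, in the same spirit as the audit log that records provider slugs but never key values.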
Bring your own observability (lighter note / roadmap). Because the gateway exports traces internally, plugging Anyray into a different self-hosted observability stack you already run is a natural extension — traces and cost flowing into your existing tooling rather than forcing a new one. Everything stays on-prem; nothing is sent to a third-party SaaS. (First-class OpenTelemetry export to arbitrary backends is roadmap.)
How you use it
- Open http://<host>:3000 and sign in with your ANYRAY_ADMIN_TOKEN.
- Scan the Dashboard — KPI sparklines, the Savings panel, and the Performance panel — for the at-a-glance picture.
- Watch Spend and Users by team and user — where is the money going, and is anyone nearing a token cap?
- Use Traces and Sessions to understand specific requests and optimization decisions.
- Tune the pipeline and targeting rules on the Optimizer page, server-held keys on Providers, routing on Routing, model prices on Pricing, and the content mode on Privacy — all runtime-mutable and audit-logged.
- Keep watching cost-per-correct-answer in production.
(The roadmap flow — run in Shadow Mode, review would-be savings, then compare the optimized cohort against an always-on holdback — is described in Proof, not promises.)
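For the "is anyone nearing a token cap?" question in the steps above, the check reduces to comparing monthly usage against the cap. A minimal sketch follows. The record keys (`user`, `tokens_used`, `monthly_token_cap`) and the 80% threshold are assumptions; the Users page may present this differently.

```python
def users_near_cap(users, threshold=0.8):
    """Return (user, usage_ratio) pairs at or above `threshold`, highest ratio first.

    `users` is a list of dicts with hypothetical keys "user", "tokens_used",
    and "monthly_token_cap". Users with no cap (None or 0) are skipped.
    """
    flagged = []
    for entry in users:
        cap = entry.get("monthly_token_cap")
        if not cap:
            continue  # uncapped user: there is nothing to be "near"
        ratio = entry["tokens_used"] / cap
        if ratio >= threshold:
            flagged.append((entry["user"], ratio))
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    sample = [
        {"user": "alice", "tokens_used": 900, "monthly_token_cap": 1000},
        {"user": "bob", "tokens_used": 100, "monthly_token_cap": 1000},
        {"user": "carol", "tokens_used": 500, "monthly_token_cap": None},
    ]
    for user, ratio in users_near_cap(sample):
        print(f"{user}: {ratio:.0%} of monthly cap")
```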
See also
- Proof, not promises — the (roadmap) trust mechanism
- Operate — day-to-day operation
- The data boundary — why none of this leaves your environment