Gateway API

The companion reference to Gateway. This page documents every public endpoint the gateway exposes, with exact auth, request notes, and error semantics. For the why (what governance adds to a request), read the concept page first; this page is for looking things up.

Base URL: https://gateway.unyform.ai

Authentication at a glance

The gateway uses two distinct auth schemes depending on the endpoint:

| Scheme | Header | Used by | Validated against |
|---|---|---|---|
| BYOK caller key | Authorization: Bearer <key> or x-api-key: <key> | /gw/{gateway_id}/... passthrough | Forwarded verbatim to the upstream provider; never checked by the gateway |
| Gateway key (uny_gw_…) | Authorization: Bearer uny_gw_… | /v1/chat/completions, /v1/cc/* | SHA-256 hash lookup in gateway_api_keys, expiry checked |

Note

On passthrough endpoints the gateway is a governed proxy, not an auth server: it identifies the target gateway from the URL path and forwards your provider credential upstream. It never holds or validates your provider key. On single-tenant and CC endpoints the uny_gw_… key is the gateway's own credential and resolves to exactly one gateway_id.

Error shape

Every gateway-originated error returns the same JSON envelope:

{
  "error": {
    "message": "Gateway is not active",
    "type": "service_unavailable"
  }
}

Policy blocks add a violations array under error:

{
  "error": {
    "message": "Request blocked by policy",
    "type": "policy_violation",
    "violations": [
      { "policy": "Block AWS keys", "message": "...", "severity": "Critical" }
    ]
  }
}
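A client can fold both envelope shapes into one exception type. The following is a minimal Python sketch, not part of any official SDK: `GatewayError` and `parse_gateway_error` are hypothetical names, and the code assumes only the two shapes shown above.

```python
import json


class GatewayError(Exception):
    """Hypothetical client-side wrapper for the gateway's error envelope."""

    def __init__(self, message, type_, violations=None):
        super().__init__(message)
        self.message = message
        self.type = type_
        # Only policy blocks carry violations; default to an empty list.
        self.violations = violations or []


def parse_gateway_error(body: str):
    """Return a GatewayError if body matches the envelope, else None."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict) or "message" not in err or "type" not in err:
        return None
    return GatewayError(err["message"], err["type"], err.get("violations"))
```

Returning None for a body that does not match lets the caller fall back to handling the raw response.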

Warning

On Anthropic-native passthrough, errors that come from Anthropic (not the gateway) are returned with their original status code and body unchanged; the envelope above only wraps errors the gateway raises itself. Check the request-id header to correlate upstream failures.

Error codes

| Status | type | When |
|---|---|---|
| 400 | invalid_request_error | Malformed JSON body, no provider available for the requested model, or Opus + Max OAuth combo (see below) |
| 401 | authentication_error | No Authorization/x-api-key on passthrough, or missing/invalid uny_gw_… bearer on single-tenant/CC endpoints |
| 403 | forbidden / policy_violation | Provider not enabled or passthrough disabled for the provider; or request/response blocked by a Strict policy |
| 404 | not_found | Gateway ID not found; or upstream model not found (forwarded from Anthropic) |
| 502 | api_error | Upstream provider request failed |
| 503 | service_unavailable | Gateway not active, database not configured, or encryption key not configured |
| 504 | api_error | Upstream request timed out |
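One way a client might act on these codes is a small triage table. This is a sketch under assumptions: the action labels and the choice to retry only 502 and 504 are this example's own, not documented gateway behavior.

```python
# Hypothetical mapping from gateway status to a client-side next step.
# The labels are invented for this sketch; the gateway does not return them.
ACTIONS = {
    400: "fix_request",
    401: "fix_credentials",
    403: "check_provider_settings",
    404: "check_gateway_id_or_model",
    502: "retry_later",
    503: "check_gateway_config",
    504: "retry_later",
}


def triage(status: int, error_type: str = "") -> str:
    """Pick a next step for a failed call; 403 is split by error type."""
    if status == 403 and error_type == "policy_violation":
        # A policy block will not go away by changing provider settings.
        return "change_content"
    return ACTIONS.get(status, "unknown")
```

The 403 row is the only one where the type field changes the outcome, which is why the function takes it as a second argument.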

BYOK passthrough

Route your own provider traffic through a gateway. You supply your provider credential; the gateway applies governance (blueprint + vector context injection, input/output policies, audit) and forwards upstream with your key. {gateway_id} is a UUID: the gateway's ID, from Dashboard → Gateways.

All three passthrough endpoints share the same auth and gateway-resolution semantics:

  1. Extract caller auth. Authorization: Bearer is checked first, then x-api-key; Bearer wins if both are present. Neither present → 401.
  2. Load the gateway by ID → 404 if not found.
  3. Gateway must be active → 503 if not.
  4. The target provider must exist, be enabled, and have passthrough enabled on the gateway → 403 otherwise.
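The order of the four checks determines which status a caller sees when several things are wrong at once. The sketch below models that order in Python. It is an illustration, not the gateway's code: the `Gateway` and `Provider` shapes, the function names, and the use of 200 to mean "proceed" are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Provider:  # hypothetical shape
    enabled: bool = True
    passthrough: bool = True


@dataclass
class Gateway:  # hypothetical shape
    active: bool = True
    providers: dict = field(default_factory=dict)


def extract_caller_key(headers: dict) -> Optional[str]:
    """Bearer first, then x-api-key. Header names are assumed lower-cased."""
    auth = headers.get("authorization", "")
    if auth.startswith("Bearer ") and auth[7:].strip():
        return auth[7:].strip()
    return headers.get("x-api-key") or None


def resolve(headers: dict, gateways: dict, gateway_id: str, provider: str) -> int:
    """Return the status the four checks would yield (200 means proceed)."""
    if extract_caller_key(headers) is None:
        return 401
    gw = gateways.get(gateway_id)
    if gw is None:
        return 404
    if not gw.active:
        return 503
    p = gw.providers.get(provider)
    if p is None or not p.enabled or not p.passthrough:
        return 403
    return 200
```

Under this model, a request with no credential gets 401 even when the gateway ID is also wrong, because the auth check runs first.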

POST /gw/{gateway_id}/v1/messages

Native Anthropic Messages protocol: what Claude Code and the official Anthropic SDK speak when ANTHROPIC_BASE_URL points at the gateway. The request body is forwarded to Anthropic largely verbatim; the gateway only mutates system to prepend injected blueprint/vector context and normalizes the model field (stripping CC's client-only [1m]/[2m] suffix).
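The two body mutations described above can be pictured with a short Python sketch. It is an approximation under stated assumptions: the function names are hypothetical, the blank-line separator between injected context and an existing string prompt is a guess, and the gateway's real implementation may differ.

```python
import re

# Matches only the client-side suffixes named above, at the end of the model.
_CC_SUFFIX = re.compile(r"\[(?:1m|2m)\]$")


def normalize_model(model: str) -> str:
    """Strip a trailing [1m] or [2m] suffix; leave other names untouched."""
    return _CC_SUFFIX.sub("", model)


def prepend_system(body: dict, context: str) -> dict:
    """Return a copy of body with context ahead of any existing system prompt."""
    out = dict(body)
    out["model"] = normalize_model(body["model"])
    system = body.get("system")
    if system is None:
        out["system"] = context
    elif isinstance(system, str):
        # Assumed separator; the real gateway may join differently.
        out["system"] = context + "\n\n" + system
    else:
        # system given as a list of content blocks: put a text block first.
        out["system"] = [{"type": "text", "text": context}, *system]
    return out
```

The sketch copies the body rather than editing it in place, so every other field reaches the upstream unchanged.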

Auth: BYOK caller key (x-api-key or Authorization: Bearer), forwarded upstream. The gateway sets anthropic-version (defaulting to its negotiated version if you don't send one) and preserves your SDK fingerprint headers (anthropic-beta, x-stainless-*, user-agent) so abuse-detection upstream sees an unmodified client.

Streaming: Set "stream": true in the body. Anthropic SSE events (message_start, content_block_delta, message_stop, …) are piped back byte-for-byte. Upstream rate-limit headers (anthropic-ratelimit-unified-*, retry-after, request-id, x-should-retry) are forwarded to the caller.
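Because the stream is piped back unmodified, any standard SSE reader should work against it. The following is a minimal Python sketch of one, assuming the caller already has the response as an iterable of text lines; `iter_sse` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, json_payload) for each complete SSE event."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event or "message", json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip(" "))
        # Other lines (for example ":" comments) are ignored.
```

An event that is still incomplete when the input ends is dropped, which matches how SSE treats a stream cut off mid-event.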

curl https://gateway.unyform.ai/gw/<gateway_id>/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Refactor the circuit breaker."}
    ]
  }'

HTTP/1.1 200 OK
content-type: application/json
x-unyform-blueprints: 0b9c1f2a-...,7d4e8a01-...
x-unyform-blueprint-tokens: 1514
request-id: req_011C...

{
  "id": "msg_01XF...",
  "type": "message",
  "role": "assistant",
  "model": "claude-sonnet-4-6",
  "content": [{ "type": "text", "text": "Here's the refactor..." }],
  "usage": { "input_tokens": 1620, "output_tokens": 240 }
}

When blueprints are injected, the gateway adds two response headers: x-unyform-blueprints (comma-separated blueprint IDs) and x-unyform-blueprint-tokens (estimated tokens of injected context). They are absent when nothing was injected.
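Reading the two headers on the client side might look like the Python sketch below. `injected_context` is a hypothetical helper name, and the code treats missing headers as "nothing injected", as described above.

```python
def injected_context(headers: dict):
    """Return (blueprint_ids, estimated_tokens); ([], 0) if nothing was injected."""
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {k.lower(): v for k, v in headers.items()}
    raw_ids = lowered.get("x-unyform-blueprints", "")
    ids = [part.strip() for part in raw_ids.split(",") if part.strip()]
    tokens = int(lowered.get("x-unyform-blueprint-tokens", "0"))
    return ids, tokens
```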

Warning

Requesting a Claude Opus model with a Claude Max OAuth token (sk-ant-oat01-…) is short-circuited with a 400 (invalid_request_error). Anthropic TLS-fingerprint-restricts this combo to the official Claude Code client, so it can never succeed through a proxy; forwarding it only deepens the throttle on your token. Use a Console API key (sk-ant-api03-…) for Opus; Max OAuth still works for Sonnet and Haiku.
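A client can catch this combination before sending. The Python sketch below is a local preflight, not the gateway's check: it assumes Opus models can be recognized by "opus" in the model name, and `is_blocked_combo` is a hypothetical helper.

```python
MAX_OAUTH_PREFIX = "sk-ant-oat01-"


def is_blocked_combo(model: str, api_key: str) -> bool:
    """True when the request pairs an Opus model with a Max OAuth token."""
    # Assumption: Opus model identifiers contain the substring "opus".
    return "opus" in model.lower() and api_key.startswith(MAX_OAUTH_PREFIX)
```

Callers that get True can switch to a Console API key or pick a Sonnet or Haiku model instead of spending a request on a guaranteed 400.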

POST /gw/{gateway_id}/v1/messages/count_tokens

Anthropic's token-count preflight: Claude Code calls this before every send to size the prompt against the model's context window. Forwards to Anthropic's /v1/messages/count_tokens. Blueprint and vector context injection run on the same payload as /v1/messages, so the returned count reflects what the real message will actually send. Never streamed (Anthropic returns a single JSON body).

Auth: identical to /v1/messages (BYOK caller key, forwarded upstream).

curl https://gateway.unyform.ai/gw/<gateway_id>/v1/messages/count_tokens \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model":"claude-sonnet-4-6","messages":[{"role":"user","content":"hi"}]}'
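The same request can be assembled from Python's standard library. This is a sketch mirroring the curl call above; `build_count_tokens_request` is a hypothetical helper, and the function only builds the request without sending it.

```python
import json
import urllib.request

BASE_URL = "https://gateway.unyform.ai"


def build_count_tokens_request(gateway_id, api_key, model, messages):
    """Build (but do not send) a count_tokens request through the gateway."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/gw/{gateway_id}/v1/messages/count_tokens",
        data=body,
        method="POST",
        headers={
            "x-api-key": api_key,  # BYOK key, forwarded upstream
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen` sends it; the response is a single JSON body, so it can be read whole and parsed with `json.loads`.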

POST /gw/{gateway_id}/v1/chat/completions

OpenAI-compatible Chat Completions passthrough, for tools that speak the OpenAI protocol. Unlike the Anthropic path, the request body is parsed into the gateway's internal request type and the full governance pipeline (provider selection, context injection, managed-tool loop, output policy) runs over it. Supports streaming via "stream": true.

Auth: BYOK caller key (Authorization: Bearer or x-api-key), forwarded to the OpenAI provider. Provider: the gateway must have an openai provider that is enabled with passthrough on.

curl https://gateway.unyform.ai/gw/<gateway_id>/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Refactor the circuit breaker."}]
  }'

Successful responses carry the same x-unyform-blueprints / x-unyform-blueprint-tokens headers when context was injected.


Single-tenant chat

POST /v1/chat/completions

OpenAI-compatible chat completion for a gateway configured in single-tenant (DB-backed) mode. Here the gateway uses its own stored, decrypted provider credentials; you do not bring a key. The pipeline is the same as passthrough (blueprint injection → input policies → provider call → output policies → audit) but keyed to the API key's gateway.

Auth: gateway key. Send Authorization: Bearer uny_gw_…. The token is SHA-256 hashed and looked up in gateway_api_keys; an expired key is rejected. A missing or non-bearer Authorization header returns 401 (authentication_error), as does an unknown or expired key.

curl https://gateway.unyform.ai/v1/chat/completions \
  -H "Authorization: Bearer uny_gw_..." \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Refactor the circuit breaker."}]
  }'
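The gateway-key lookup described for this endpoint (hash the bearer token, find the row, check expiry) can be sketched in Python. Everything beyond those three steps is an assumption: a dict stands in for the gateway_api_keys table, the digest is assumed to be stored as hex, and the `expires_at` and `gateway_id` field names are illustrative.

```python
import hashlib
from datetime import datetime, timezone
from typing import Optional


def hash_key(token: str) -> str:
    """SHA-256 of the raw token, hex-encoded (the encoding is an assumption)."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def resolve_gateway(auth_header, key_table: dict, now=None) -> Optional[str]:
    """Return the gateway_id for a valid key, or None where a 401 would apply."""
    if not auth_header or not auth_header.startswith("Bearer "):
        return None  # missing or non-bearer header
    row = key_table.get(hash_key(auth_header[len("Bearer "):]))
    if row is None:
        return None  # unknown key
    now = now or datetime.now(timezone.utc)
    expires = row.get("expires_at")
    if expires is not None and expires <= now:
        return None  # expired key
    return row["gateway_id"]
```

Storing only the hash means a lookup never needs the plaintext key at rest, and each key maps to exactly one gateway.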

Note

Same path suffix, different mode, different auth: /v1/chat/completions (no /gw/{id} prefix) is the single-tenant endpoint with a uny_gw_… key, while /gw/{gateway_id}/v1/chat/completions is BYOK passthrough with your own provider key. The URL prefix is what selects the mode.
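The prefix rule can be expressed as a small path classifier. The Python sketch below is illustrative only: the mode labels and the `route` function are hypothetical, and it covers just the chat and CC paths named on this page.

```python
import re

# /gw/{gateway_id}/v1/... selects BYOK passthrough.
_PASSTHROUGH = re.compile(r"^/gw/([^/]+)/v1/.+$")


def route(path: str):
    """Return (mode, gateway_id_from_path) for a request path."""
    match = _PASSTHROUGH.match(path)
    if match:
        return "byok_passthrough", match.group(1)
    if path == "/v1/chat/completions" or path.startswith("/v1/cc/"):
        # Gateway-key mode: the gateway is resolved from the key, not the URL.
        return "gateway_key", None
    return "unknown", None
```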


Policies

POST /v1/orgs/{org}/policies/check

Evaluate text against an org's policies without making a model call; used to preflight content from your own tooling. {org} is the org slug. The blocking contract (which severities and actions block) and request/response shape are documented with the policy model itself.

Auth: gateway key (API-key authenticated, org-scoped).

See Policies for the rule model and Write your first policy for the check endpoint in action.


Claude Code local plugin

These back the mx cc-plugin hooks. Both authenticate with a uny_gw_… gateway key (Authorization: Bearer), which resolves to a single gateway. See Use the Claude Code plugin for setup.

POST /v1/cc/session

Resolves and trims the gateway's attached blueprints into a <system-reminder> block for the plugin to emit. Body fields (user_message, max_total_tokens, max_per_blueprint_tokens) are optional.

POST /v1/cc/audit

Records a finished plugin session as a gateway_usage row. Backs mx cc-plugin stop.


Health

GET /health

Liveness and configuration probe. No auth. /ready and /v1/health are aliases of the same handler. Always returns 200 with a JSON status body:

{
  "status": "ok",
  "version": "0.1.0",
  "mode": "db",
  "providers": { "claude": true, "openai": false },
  "gateway": { "slug": "acme-prod", "db_connected": true, "config_loaded": true }
}

mode is db when the gateway is DB-backed (the gateway object is present) or legacy for env-var configuration (the gateway object is omitted).
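Since the endpoint always answers 200, a readiness probe has to inspect the body rather than the status code. The Python sketch below shows one possible interpretation. Treating db_connected and config_loaded as readiness signals is an assumption of this example, and `is_ready` is a hypothetical helper.

```python
def is_ready(payload: dict) -> bool:
    """Decide readiness from the /health body, not from the HTTP status."""
    if payload.get("status") != "ok":
        return False
    if payload.get("mode") == "db":
        # DB-backed mode carries a gateway object with connection flags.
        gateway = payload.get("gateway") or {}
        return bool(gateway.get("db_connected") and gateway.get("config_loaded"))
    # Legacy mode omits the gateway object, so status is all there is to check.
    return True
```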
