Multi-provider gateway for governed AI access. Route across Claude, GPT, Gemini, Grok, and Sonar through one governed access layer — with OpenAI-compatible request shapes, provider-key isolation, policy controls, per-request audit, and cost tracking. PKCE-flow browser sessions mean no provider keys ever leak into client code.
Provider coverage
Use provider diversity without creating five separate key-management, policy, cost, and audit surfaces. Route requests across leading AI providers through one governed access layer — with policy controls, per-request audit, cost tracking, and provider-key isolation. As provider model catalogs change, policies and routing can be updated centrally instead of forcing every application team to chase model-name changes.
Anthropic: Claude access governed by policy, keys, cost, and audit.
OpenAI: GPT-family routing with scoped access and spend visibility.
Google: Gemini access through the same governed request plane.
xAI: Grok-family access with central policy and audit controls.
Perplexity: Sonar access for search-grounded and reasoning workflows.
For platform teams
Add or swap providers without changing application code. Drop-in OpenAI-compatible request shape; provider routing is handled by the gateway, not by each service.
For security teams
API keys never reach client code. Per-team and per-environment policies. Audit trail per request with model, verdict, cost, and correlation IDs. Redaction and retention are configurable.
For finance teams
Spend visibility before procurement asks. Caps that stop runaway costs. Chargeback-ready usage exports across providers, teams, and environments.
Every team using AI right now has the same problem: one project on Anthropic, another on OpenAI, a third experiment with Gemini, a Slack bot calling xAI. Five vendor relationships, five sets of API keys, five different cost dashboards, no unified audit trail. When the CISO asks "what's our AI exposure?", you can't answer without piecing together five spreadsheets.
LLM Gateway is OpenAI-compatible at the request shape, so existing code works with a single environment-variable change. What you gain on top of "switch base URL" is governance.
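For example, with the official OpenAI SDKs, which read their base URL and key from the environment, the cutover can look like the sketch below. One assumption to confirm per engagement: the documented auth header is x-api-key, so whether the gateway also accepts the SDK's standard Authorization: Bearer header should be verified.

# Before: application code points at OpenAI directly.
export OPENAI_BASE_URL="https://api.openai.com/v1"
export OPENAI_API_KEY="sk-..."

# After: same application code, routed through the gateway.
# Assumption: the gateway accepts the SDK's Authorization: Bearer
# header in addition to the documented x-api-key header.
export OPENAI_BASE_URL="https://llmgateway.threadsync.io/v1"
export OPENAI_API_KEY="tsg-..."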
Policy-based routing can select models based on cost, capability, and availability rules. Pin a specific model when a workflow requires it. Failover behavior is configured per engagement.
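As an illustrative sketch only (the endpoint path and every field name below are hypothetical, since policy schemas are set per engagement), a routing policy might pin a model and declare a failover order:

# Hypothetical admin call: /admin/policies and all fields shown
# are illustrative, not a documented schema.
curl -X POST https://llmgateway.threadsync.io/admin/policies \
  -H "x-api-key: tsg-admin-..." \
  -H "Content-Type: application/json" \
  -d '{
    "name": "contract-summaries",
    "pin_model": "claude-opus-4-7",
    "failover": ["gpt-4o"],
    "max_cost_usd_per_request": 0.05
  }'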
Issue scoped tsg-* keys per org or per team. Restrict which models each key can call. Rotate or revoke without touching upstream provider keys.
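A sketch of what provisioning could look like through the admin API (the /admin/keys path and payload fields are assumptions; actual provisioning is configured per engagement):

# Hypothetical admin call: path and fields are illustrative.
curl -X POST https://llmgateway.threadsync.io/admin/keys \
  -H "x-api-key: tsg-admin-..." \
  -H "Content-Type: application/json" \
  -d '{
    "org": "acme",
    "team": "legal-tools",
    "allowed_models": ["claude-opus-4-7", "gpt-4o"],
    "monthly_budget_usd": 500
  }'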
Every request returns flattened token usage and cost. Cap monthly spend per key, per team, or per org. Usage dashboards where configured; no need to wait for provider invoices to understand spend.
Records capture model, latency, cost, team, policy verdict, and correlation IDs. Prompt/response capture, retention, and SIEM export are configured per engagement. Hash-chained logs provide tamper-evident sequencing for audit records where configured.
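For a sense of shape only, an audit record might look like the sketch below. The fields mirror the list above; any name not in that list (such as request_id) is an assumption, not a documented schema.

{
  "request_id": "req_...",
  "model": "claude-opus-4-7",
  "provider": "anthropic",
  "team": "legal-tools",
  "latency_ms": 1840,
  "cost_usd": 0.0093,
  "policy_verdict": "allow",
  "correlation_id": "corr_..."
}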
Browser apps exchange short-lived PKCE tokens server-side. Provider API keys never reach client code. Consistent with how modern OAuth identity flows work.
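For reference, the PKCE mechanics this builds on are standard (RFC 7636). The sketch below shows only the generic S256 verifier/challenge derivation; the gateway's own session endpoints are provisioned per engagement and are not shown.

# Standard PKCE S256 derivation (RFC 7636). This is the generic
# mechanic only; gateway session endpoints are not shown.
code_verifier=$(openssl rand -base64 60 | tr '+/' '-_' | tr -d '=\n')
code_challenge=$(printf '%s' "$code_verifier" \
  | openssl dgst -sha256 -binary | openssl base64 -A \
  | tr '+/' '-_' | tr -d '=')
echo "verifier:  $code_verifier"
echo "challenge: $code_challenge"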
Optional idempotency keys deduplicate retries. Optional conversation memory persists context server-side so you don't pay to resend it on every turn.
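A sketch of a deduplicated retry, assuming the header is named Idempotency-Key; the header name is an assumption to confirm per engagement, only the mechanism is documented above.

# Assumption: the idempotency header name here is illustrative.
# Sending the same key twice should return the first result
# instead of billing a second completion.
curl -X POST https://llmgateway.threadsync.io/v1/chat/completions \
  -H "x-api-key: tsg-..." \
  -H "Idempotency-Key: summarize-contract-7731" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Summarize this contract..."}]
  }'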
Architecture at a glance
Two equivalent request shapes. Pick whichever matches your existing code. The response shape is the same.
If your code already speaks OpenAI's chat-completions shape, send a system role message — exactly like you would to api.openai.com. Only the base URL and auth header change.
curl -X POST https://llmgateway.threadsync.io/v1/chat/completions \
  -H "x-api-key: tsg-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "max_tokens": 1024,
    "messages": [
      {"role": "system", "content": "You are a careful assistant."},
      {"role": "user", "content": "Summarize this contract..."}
    ]
  }'
If you prefer the Anthropic-style shape (separate system field), the gateway accepts that too. Equivalent semantics; pick whichever your code already uses.
curl -X POST https://llmgateway.threadsync.io/v1/chat/completions \
  -H "x-api-key: tsg-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-7",
    "max_tokens": 1024,
    "system": "You are a careful assistant.",
    "messages": [
      {"role": "user", "content": "Summarize this contract..."}
    ]
  }'
Either shape returns the same flattened response:
{
  "content": "...",
  "model": "claude-opus-4-7",
  "provider": "anthropic",
  "usage": {"input_tokens": 412, "output_tokens": 184, "cost_usd": 0.0093}
}
Model names are examples. Allowed providers and models are configured per engagement and may vary by provider terms, region, and customer policy.
Both modes are valid for Claude, GPT, Gemini, Grok, and Sonar. The gateway flattens response payloads so your code reads data.content regardless of provider.
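Because the payload is flattened, a provider-agnostic reader stays trivial; for example, with jq:

# Same two reads regardless of which provider served the request.
curl -s -X POST https://llmgateway.threadsync.io/v1/chat/completions \
  -H "x-api-key: tsg-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "max_tokens": 128,
    "messages": [{"role": "user", "content": "One-line status check."}]
  }' | jq '{content: .content, cost_usd: .usage.cost_usd}'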
LLM Gateway is delivered as part of a ThreadSync platform engagement. Scope, capacity, and terms are set per engagement — talk to us about access.
Once partner access and provider credentials are provisioned, the developer flow can run same-day:
Create a tsg-* key and set policy. Use the admin API or the workspace UI to create the key, scope it to an org, and define which models and monthly budget it can use. Keys are hot-rotatable; revocation takes effect on the next request after propagation completes.
Anthropic / OpenAI / Google / xAI / Perplexity keys live server-side in the gateway, never in your client code. Per-provider quota and routing rules are configured once.
Point your code at llmgateway.threadsync.io. Existing OpenAI-compatible code works as-is: just swap the base URL and auth header. Use role-based system messages in OpenAI-compatible mode, or the top-level system field in ThreadSync normalized mode. The response is flattened to data.content.
Every request shows provider, model, tokens, cost, latency, and policy verdict. Filter by key, team, or org. SIEM webhook and scheduled S3 export are available where configured per engagement.
LLM Gateway is the developer-facing surface. Magic Runtime uses the gateway for AI calls inside its sandboxed execution layer. Lift exposes the gateway through a workspace UI for design-partner teams that prefer a UI to API code. All three share the same governance engine — engagement scope determines which surface a partner uses.
OpenAI-compatible. Provider-governed. Request-level audit records. Access is engagement-only.