OpenAI-Compatible Proxy for Multi-Provider AI Models
OpenSourceAIHub exposes a single POST /v1/chat/completions endpoint that is fully compatible with the OpenAI API specification. Point the official OpenAI SDK — or any OpenAI-compatible library — at the Hub and gain instant access to 100+ models across 9 providers, with built-in PII redaction, budget enforcement, and smart cost routing.
Migrate your existing OpenAI integration in two lines of code: change the baseURL and the apiKey. Everything else — your model names, message format, streaming, function calling — works exactly the same.
Why This Matters
Directly integrating with each AI provider creates fragile, expensive architectures that are painful to maintain and impossible to govern centrally.
- **Vendor lock-in** — Direct API integrations tie your codebase to a single provider. Switching from OpenAI to Anthropic means rewriting every API call, message format, and error handler.
- **SDK sprawl** — Each provider has its own SDK, authentication scheme, and response format. Your dependency tree grows, and so does the surface area for breaking changes.
- **No unified governance** — PII filtering, cost limits, and audit logging must be reimplemented for every provider integration. Miss one and you have a compliance gap.
- **Cost opacity** — Comparing prices across providers requires manual spreadsheet work. You can't programmatically route to the cheapest option without building your own routing layer.
The Problem with Multiple AI APIs
A typical production application may use Groq for fast inference, OpenAI for complex reasoning, Anthropic for long-context tasks, and Mistral for code generation. Each requires its own integration:
| Provider | API Format | Auth Method |
|---|---|---|
| OpenAI | REST + SDK | Bearer token |
| Anthropic | Custom Messages API | x-api-key header |
| Groq | OpenAI-compatible | Bearer token |
| Together.ai | OpenAI-compatible | Bearer token |
| Google Gemini | Vertex / Gemini API | OAuth / API key |
| xAI | OpenAI-compatible | Bearer token |
| Mistral AI | OpenAI-compatible | Bearer token |
| AWS Bedrock | Bedrock API | AWS SigV4 |
| DeepInfra | OpenAI-compatible | Bearer token |
That is 9 SDKs, 4 different authentication schemes, 4 distinct API formats (OpenAI-compatible, Anthropic Messages, Vertex/Gemini, and Bedrock), and 9 separate error-handling paths. Each provider upgrade is a potential breaking change across your entire stack.
What Is an OpenAI-Compatible Proxy?
An OpenAI-compatible proxy accepts requests in the exact format the OpenAI API expects, then translates and routes them to the correct downstream provider. Your application code uses the standard OpenAI SDK — it never needs to know which provider is actually serving the request.
Single endpoint: POST https://api.opensourceaihub.ai/v1/chat/completions
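Conceptually, the dispatch step works like the sketch below. The provider prefixes and upstream base URLs here are illustrative assumptions, not the Hub's actual routing table:

```python
# Minimal sketch of how an OpenAI-compatible proxy dispatches requests.
# Provider base URLs and the "provider/model" prefix convention are
# illustrative, not the Hub's real implementation.

PROVIDER_BASE_URLS = {
    "groq": "https://api.groq.com/openai/v1",
    "together": "https://api.together.xyz/v1",
    "openai": "https://api.openai.com/v1",
}

def resolve_upstream(model: str) -> tuple:
    """Split 'provider/model-id' and return (upstream_base_url, model_id)."""
    prefix, _, model_id = model.partition("/")  # split at the first slash only
    if prefix == "oah":
        # Virtual model: provider selection is handled by the Smart Router.
        raise NotImplementedError("virtual models are smart-routed")
    return PROVIDER_BASE_URLS[prefix], model_id

print(resolve_upstream("groq/llama-3.3-70b-versatile"))
# -> ('https://api.groq.com/openai/v1', 'llama-3.3-70b-versatile')
```

Your application never sees this step: it sends a normal OpenAI-format request and the proxy resolves the upstream target server-side.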
What the proxy handles for you
- **Protocol translation** — Converts the OpenAI message format to the Anthropic Messages API, Google Vertex, AWS Bedrock SigV4, etc.
- **Authentication** — One Hub API key replaces 9 provider credentials. In BYOK mode, your stored keys are decrypted and injected at request time.
- **PII redaction** — The AI Firewall scans every request for 28 entity types before forwarding.
- **Budget enforcement** — Pre-flight balance checks prevent overspending in Managed Mode.
- **Response normalization** — All provider responses are returned in the standard OpenAI response format, regardless of the upstream provider.
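To make the protocol-translation step concrete, here is a simplified sketch of how an OpenAI-style payload can be mapped to the Anthropic Messages shape. It is not the Hub's actual translator; real translation also handles tool calls, images, and streaming:

```python
def openai_to_anthropic(payload: dict) -> dict:
    """Translate an OpenAI chat payload into Anthropic Messages shape.

    Simplified: Anthropic takes the system prompt as a top-level "system"
    field rather than as a message, and requires max_tokens.
    """
    system_parts = [m["content"] for m in payload["messages"] if m["role"] == "system"]
    body = {
        "model": payload["model"],
        "max_tokens": payload.get("max_tokens", 1024),
        "messages": [m for m in payload["messages"] if m["role"] != "system"],
    }
    if system_parts:
        body["system"] = "\n".join(system_parts)
    return body
```

The reverse mapping normalizes the provider's response back into the OpenAI `chat.completion` shape, so the client only ever sees one format.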
Migrate in Two Lines of Code
If your application already uses the OpenAI SDK, migration is a two-line change: the apiKey and the baseURL. For comparison, here is a typical direct-to-OpenAI integration before migration:

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  // baseURL defaults to https://api.openai.com/v1
});

const response = await client.chat.completions.create({
  model: "gpt-4",
  messages: [{ role: "user", content: "Explain quantum computing" }],
  max_tokens: 512,
});

console.log(response.choices[0].message.content);
```
The same pattern works in every language. Here are the complete examples:
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "os_hub_your_key_here",
  baseURL: "https://api.opensourceaihub.ai/v1",
});

const response = await client.chat.completions.create({
  model: "oah/llama-3-70b", // virtual model → smart-routed
  messages: [
    { role: "user", content: "Explain quantum computing" }
  ],
  max_tokens: 512,
});

console.log(response.choices[0].message.content);

// Response headers:
// x-hub-scan-ms: 12 (DLP scan time)
// x-hub-correlation-id: req_xxxx (audit trail)
// x-hub-model: llama-3-70b
// x-hub-provider: groq
```

```python
from openai import OpenAI

client = OpenAI(
    api_key="os_hub_your_key_here",
    base_url="https://api.opensourceaihub.ai/v1",
)

response = client.chat.completions.create(
    model="oah/llama-3-70b",  # virtual model → smart-routed
    messages=[
        {"role": "user", "content": "Explain quantum computing"}
    ],
    max_tokens=512,
)

print(response.choices[0].message.content)
```

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="oah/gpt-4.1-mini",
    openai_api_key="os_hub_your_key_here",
    openai_api_base="https://api.opensourceaihub.ai/v1",
    max_tokens=512,
)

response = llm.invoke("Explain quantum computing")
print(response.content)

# All LangChain features work: chains, agents, tools, streaming.
# PII redaction and budget enforcement happen transparently.
```

```bash
curl -X POST https://api.opensourceaihub.ai/v1/chat/completions \
  -H "Authorization: Bearer os_hub_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "oah/llama-3-70b",
    "messages": [
      {"role": "user", "content": "Explain quantum computing"}
    ],
    "max_tokens": 512
  }'
```

Example Request and Response
Request:

```json
{
  "model": "oah/llama-3-70b",
  "messages": [
    {
      "role": "user",
      "content": "Explain quantum computing in simple terms"
    }
  ],
  "max_tokens": 256,
  "temperature": 0.7
}
```

Response:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1709234567,
  "model": "oah/llama-3-70b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing uses quantum bits (qubits) instead of classical bits..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 87,
    "total_tokens": 99
  }
}
```

Smart Provider Routing
When you use a virtual model name (prefixed with oah/), the Hub's Smart Router automatically selects the best provider for that request:
Cost optimization — The router indexes pricing across all providers that serve the requested model and selects the cheapest available option on a best-effort basis.
Availability-aware — If your primary provider is down or rate-limited, the router can fall back to an alternative that serves the same model (for open-source models available on multiple providers).
BYOK passthrough — If you have BYOK keys stored for a provider, the router will use your credentials (zero Hub cost). If not, it falls back to Managed Mode and deducts from your wallet.
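A minimal sketch of this selection logic is shown below. The price table, availability flags, and BYOK preference rule are invented for illustration; the Hub's real router uses live pricing and health data:

```python
# Illustrative price table (USD per 1M output tokens); real prices vary.
OFFERS = {
    "llama-3-70b": [
        {"provider": "groq", "price": 0.79, "available": True},
        {"provider": "together", "price": 0.88, "available": True},
        {"provider": "deepinfra", "price": 0.69, "available": False},  # rate-limited
    ],
}

def route(model: str, byok_providers: frozenset = frozenset()) -> dict:
    """Pick the cheapest available offer, preferring BYOK providers."""
    candidates = [o for o in OFFERS[model] if o["available"]]
    if not candidates:
        raise RuntimeError(f"no provider currently serves {model}")
    byok = [o for o in candidates if o["provider"] in byok_providers]
    pool = byok or candidates  # BYOK wins when you hold a key (zero Hub cost)
    return min(pool, key=lambda o: o["price"])

print(route("llama-3-70b")["provider"])
# -> groq (deepinfra is cheaper but currently unavailable)
```

Note that availability filtering runs before price comparison: the cheapest listed provider is skipped if it is down or rate-limited.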
Sovereign models
Open-weight models like Llama 4, DeepSeek, Qwen 3, and Mixtral are hosted by multiple providers. The Smart Router compares prices across Groq, Together.ai, DeepInfra, and others — then picks the best rate.
External models
Closed-source models like GPT-4.1 (OpenAI), Claude (Anthropic), Gemini (Google), and Grok (xAI) are only available from their creator. These are routed directly to the single provider — no routing decision needed.
```typescript
// If you need a specific provider, use their native model ID:
const response = await client.chat.completions.create({
  model: "groq/llama-3.3-70b-versatile", // forces Groq
  messages: [{ role: "user", content: "Hello" }],
});

// Or force Together.ai:
// model: "together/meta-llama/Llama-3.3-70B-Instruct"

// The Hub still applies PII redaction and budget checks,
// but skips the Smart Router's provider selection.
```

Supported OpenAI Features
The proxy supports the full /v1/chat/completions specification, so everything you use with the OpenAI SDK works through the Hub, including streaming and function calling.
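For example, a function-calling request uses the standard OpenAI tools shape unchanged; the weather tool below is a made-up example:

```python
# A tool-calling payload in the standard OpenAI chat-completions shape.
# The get_weather tool is a hypothetical example, not a Hub feature.
payload = {
    "model": "oah/llama-3-70b",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}
```

Because the Hub normalizes responses to the OpenAI format, any `tool_calls` the model emits come back in the same structure the OpenAI SDK already parses.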
Benefits of a Unified AI Proxy
Zero vendor lock-in
Switch between providers by changing a model name — not rewriting your integration. Move from GPT-4.1 to Claude Sonnet 4.6 or Llama 4 Maverick without touching your SDK code.
Single dependency
One SDK (openai), one endpoint, one API key. Remove the Anthropic, Groq, Google, and Mistral SDKs from your dependency tree.
Automatic cost optimization
The Smart Router finds the cheapest provider for each open-source model. Combined with wallet enforcement, your spending is always visible and controlled.
Centralized governance
PII redaction, prompt injection detection, and DLP policies apply to every provider through one control plane — no per-provider reimplementation.
Built-in audit trail
Every request is tagged with a correlation ID, scan timing, model used, and provider selected. Debugging and compliance reporting are built in.
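As an illustration, a hypothetical helper could fold the x-hub-* response headers shown earlier into a structured audit record for your own logs:

```python
# Hypothetical helper: the header names match the x-hub-* examples above,
# but this aggregation is your application's code, not a Hub API.
def audit_record(headers: dict) -> dict:
    return {
        "correlation_id": headers.get("x-hub-correlation-id"),
        "dlp_scan_ms": int(headers.get("x-hub-scan-ms", 0)),
        "model": headers.get("x-hub-model"),
        "provider": headers.get("x-hub-provider"),
    }

record = audit_record({
    "x-hub-scan-ms": "12",
    "x-hub-correlation-id": "req_abc123",
    "x-hub-model": "llama-3-70b",
    "x-hub-provider": "groq",
})
```

Logging the correlation ID alongside your own request IDs makes it possible to join your application logs with the Hub's audit trail.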
Future-proof
New providers and models are added to the Hub without any code changes on your side. Use them immediately via virtual model names.
Configuration Reference
For most applications, you only need two environment variables:
```bash
# Required: your Hub API key (os_hub_* or oah_* for project-scoped)
OPENAI_API_KEY=os_hub_your_key_here

# Required: Hub endpoint
OPENAI_BASE_URL=https://api.opensourceaihub.ai/v1

# Optional: default model for your application
DEFAULT_MODEL=oah/llama-3-70b
```

Many frameworks (LangChain, LlamaIndex, Vercel AI SDK) read OPENAI_API_KEY and OPENAI_BASE_URL automatically. Setting these environment variables may be all you need: zero code changes.
Try the Unified API
Create an account, get your API key, and point your OpenAI SDK at the Hub. Every request is automatically scanned for PII, budget-checked, and routed to the best available provider — from your very first API call.
Related Documentation
- AI Gateway with PII Redaction — How 28-entity detection protects every request
- LLM Budget Enforcement — Token quotas, threshold alerts & recursive loop protection
- OpenRouter Alternative — AI gateway with built-in governance
- Vercel AI Gateway Alternative — Active security vs passive logging
- Quickstart — Connect your first application in 2 minutes
- Billing & Wallet Docs — Credit system, top-ups, and deduction mechanics
- Model Catalog — Pricing across 100+ models and 9 providers
- Enterprise Security & Trust Center
- Product Roadmap — Phase 1.1 Budget Enforcement & beyond