Meta Llama 3 8B Instruct Reference
`oah/llama-3-8b-chat-hf` — Deploy Meta Llama 3 8B Instruct Reference with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
by Meta (Open Source)
Meta's open-weights Llama family is the most widely deployed open-source LLM series. Compare Llama API pricing across Groq, Together, and DeepInfra to find the cheapest Llama provider. Llama 4 introduced mixture-of-experts (Maverick) and a long-context variant (Scout), while Llama 3.3 remains a cost-efficient workhorse for production workloads.
Every Llama request is scanned for 28+ PII entity types — SSNs, credit cards, emails, API keys, and more — before it reaches any provider.
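To illustrate the kind of scanning involved — the Hub's actual detector and its full 28+ entity list are not published here — a minimal regex-based redaction pass over three common PII types might look like this sketch:

```python
import re

# Illustrative patterns for three common PII entity types.
# The Hub's real scanner covers 28+ types; these regexes are a sketch only.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact(text: str) -> str:
    """Replace each detected PII span with a typed placeholder."""
    for name, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{name.upper()}_REDACTED]", text)
    return text
```

With the Hub, this happens server-side before the prompt ever leaves the gateway, so no application-side changes are needed.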
Llama is available across 3 providers. Our Smart Router picks the cheapest one per-request. 25% managed markup / 0% on Pro BYOK.
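The routing decision is, at its core, a per-request cost comparison. A minimal sketch, assuming the Llama 4 Scout rates from the pricing table and the stated markup model (the actual Smart Router runs server-side and is not published here):

```python
# Sketch of per-request cheapest-provider selection, using the
# Llama 4 Scout rates from the pricing table (USD per 1M tokens).
# The selection logic is illustrative, not the Hub's implementation.
PROVIDERS = {
    "together": {"in": 0.18, "out": 0.59},
    "deepinfra": {"in": 0.15, "out": 0.45},
    "groq": {"in": 0.11, "out": 0.34},
}

def cheapest(in_tokens: int, out_tokens: int, markup: float = 0.25):
    """Return (provider, cost) with the lowest total cost, markup applied."""
    def cost(rates):
        raw = (in_tokens * rates["in"] + out_tokens * rates["out"]) / 1_000_000
        return raw * (1 + markup)
    name = min(PROVIDERS, key=lambda p: cost(PROVIDERS[p]))
    return name, round(cost(PROVIDERS[name]), 6)
```

On Pro BYOK, `markup=0.0` and you pay the raw provider rate.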
Change two lines in your OpenAI SDK — base_url and api_key — and every request flows through the Hub. Full backward compatibility.
Per-request logging of token counts, latency, DLP violations, and cost. Never wonder what your AI spend is again.
- `oah/llama-3-8b-chat-hf` — Deploy Meta Llama 3 8B Instruct Reference with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
- `oah/llama-3.1` — Deploy Meta Llama 3.1 405B Instruct with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
- `oah/llama-3.2` — Deploy Llama 3.2 1B with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
- `oah/llama-3.3` — Deploy Meta Llama 3.3 70B Instruct Turbo with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
- `oah/llama-4-maverick` — Deploy Llama 4 Maverick Instruct (17Bx128E) FP8 with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
- `oah/llama-4-scout` — Deploy Llama 4 Scout Instruct (17Bx16E) with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
- `oah/meta-llama-3-8b-instruct-lite` — Deploy Meta Llama 3 8B Instruct Lite with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
- `oah/meta-llama-3.1` — Deploy Meta Llama 3.1 8B Instruct Turbo with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
- `oah/nvidia/llama-3.3-nemotron-super-49b` — Deploy nim/nvidia/llama-3.3-nemotron-super-49b-v1 with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
- `oah/llama-3.1-8b-instant` — Deploy llama-3.1-8b-instant with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
- `oah/llama-3.3-70b-versatile` — Deploy llama-3.3-70b-versatile with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
- `oah/hermes-3-llama-3.1` — Deploy NousResearch/Hermes-3-Llama-3.1-405B with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
- `oah/deepseek-r1-distill-llama` — Deploy deepseek-ai/DeepSeek-R1-Distill-Llama-70B with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
- `oah/llama-3.2-11b-vision` — Deploy meta-llama/Llama-3.2-11B-Vision-Instruct with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
- `oah/llama-guard-4` — Deploy meta-llama/Llama-Guard-4-12B with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
- `oah/meta-llama-3` — Deploy meta-llama/Meta-Llama-3-8B-Instruct with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
- `oah/llama-3.1-nemotron` — Deploy nvidia/Llama-3.1-Nemotron-70B-Instruct with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
- `oah/llama-3.3-nemotron-super-49b` — Deploy nvidia/Llama-3.3-Nemotron-Super-49B-v1.5 with built-in PII redaction and Hub governance. Available on Managed Credits and BYOK.
Input / Output pricing by provider. Managed Mode adds a 25% managed markup. Pro BYOK = 0% markup.
| Model | Params | Context | Vision | Together.ai | DeepInfra | Groq |
|---|---|---|---|---|---|---|
| Meta Llama 3 8B Instruct Reference (`oah/llama-3-8b-chat-hf`) | — | 8K | No | $0.20/$0.20 | — | — |
| Meta Llama 3.1 405B Instruct (`oah/llama-3.1`) | — | 4K | No | $3.50/$3.50 | — | — |
| Llama 3.2 1B (`oah/llama-3.2`) | — | 131K | No | $0.06/$0.06 | — | — |
| Meta Llama 3.3 70B Instruct Turbo (`oah/llama-3.3`) | — | 131K | No | $0.88/$0.88 | $0.13/$0.39 | — |
| Llama 4 Maverick Instruct (17Bx128E) FP8 (`oah/llama-4-maverick`) | — | 1.0M | No | $0.27/$0.85 | $0.15/$0.60 | — |
| Llama 4 Scout Instruct (17Bx16E) (`oah/llama-4-scout`) | — | 1.0M | No | $0.18/$0.59 | $0.15/$0.45 | $0.11/$0.34 |
| Meta Llama 3 8B Instruct Lite (`oah/meta-llama-3-8b-instruct-lite`) | — | 8K | No | $0.10/$0.10 | — | — |
| Meta Llama 3.1 8B Instruct Turbo (`oah/meta-llama-3.1`) | — | 131K | No | $0.18/$0.18 | $0.06/$0.06 | — |
| nim/nvidia/llama-3.3-nemotron-super-49b-v1 (`oah/nvidia/llama-3.3-nemotron-super-49b`) | — | 16K | No | Free/Free | — | — |
| llama-3.1-8b-instant (`oah/llama-3.1-8b-instant`) | — | 131K | No | — | — | $0.05/$0.08 |
| llama-3.3-70b-versatile (`oah/llama-3.3-70b-versatile`) | — | 131K | No | — | — | $0.59/$0.79 |
| NousResearch/Hermes-3-Llama-3.1-405B (`oah/hermes-3-llama-3.1`) | — | — | No | — | $0.30/$0.30 | — |
| deepseek-ai/DeepSeek-R1-Distill-Llama-70B (`oah/deepseek-r1-distill-llama`) | — | — | No | — | $0.20/$0.60 | — |
| meta-llama/Llama-3.2-11B-Vision-Instruct (`oah/llama-3.2-11b-vision`) | — | — | No | — | $0.05/$0.05 | — |
| meta-llama/Llama-Guard-4-12B (`oah/llama-guard-4`) | — | — | No | — | $0.18/$0.18 | — |
| meta-llama/Meta-Llama-3-8B-Instruct (`oah/meta-llama-3`) | — | — | No | — | $0.03/$0.06 | — |
| nvidia/Llama-3.1-Nemotron-70B-Instruct (`oah/llama-3.1-nemotron`) | — | — | No | — | $0.60/$0.60 | — |
| nvidia/Llama-3.3-Nemotron-Super-49B-v1.5 (`oah/llama-3.3-nemotron-super-49b`) | — | — | No | — | $0.10/$0.40 | — |
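To read the table: rates are USD per 1M tokens, shown as input/output. For example, a single request to `oah/llama-3.1-8b-instant` on Groq with 2,000 input tokens and 500 output tokens:

```python
# Cost of one request to oah/llama-3.1-8b-instant on Groq
# (rates from the table: $0.05 input / $0.08 output per 1M tokens).
in_rate, out_rate = 0.05, 0.08
in_tokens, out_tokens = 2_000, 500
raw = in_tokens / 1e6 * in_rate + out_tokens / 1e6 * out_rate
managed = raw * 1.25  # Managed Mode adds the 25% markup; Pro BYOK pays `raw`
print(f"raw=${raw:.6f}  managed=${managed:.6f}")  # raw=$0.000140  managed=$0.000175
```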
What you get at each pricing tier. Hub adds security, governance, and multi-provider routing on top of raw API access.
| Mode | What You Pay | PII Redaction | Budget Caps | Routing | Audit Trail |
|---|---|---|---|---|---|
| Direct to Meta | Provider pricing only | None | None | Manual | None |
| Hub — Managed Mode | Provider + 25% markup | 28+ PII types | Per-key hard caps | Smart Router | Full compliance log |
| Hub — Pro BYOK ($29/mo) | Direct to provider (0% markup) | 28+ PII types | Per-key hard caps | Smart Router | Full compliance log |
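Choosing between the two Hub modes is simple arithmetic: Pro BYOK's flat $29/mo wins once the 25% markup on your monthly provider spend exceeds $29.

```python
# Break-even between Managed Mode (25% markup) and Pro BYOK ($29/mo flat).
markup, byok_fee = 0.25, 29.0
break_even = byok_fee / markup  # monthly provider spend where the two modes cost the same
print(break_even)  # 116.0
```

Below ~$116/mo of provider spend, Managed Mode is cheaper; above it, Pro BYOK is.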
- Privacy-sensitive deployments requiring model auditability
- Cost-optimized chatbots and customer support agents
- Long-document summarization and analysis (Scout's 1M context)
- Multi-provider redundancy with automatic failover
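The failover happens server-side in the Hub, but the idea can be sketched client-side. Everything here — the stub provider callables and their ordering — is hypothetical, shown only to illustrate the fall-through pattern:

```python
# Illustration of the failover idea: try providers in order, fall through
# on error. The Hub performs this server-side; the callables are stand-ins.
def with_failover(providers, prompt):
    last_err = None
    for call in providers:
        try:
            return call(prompt)
        except Exception as err:  # a real client would catch narrower errors
            last_err = err
    raise RuntimeError("all providers failed") from last_err
```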
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.opensourceaihub.ai/v1",
    api_key="your_hub_api_key",
)

# Use any virtual model name from the pricing table above
response = client.chat.completions.create(
    model="oah/llama-3-8b-chat-hf",
    messages=[{"role": "user", "content": "Hello!"}],
)
```

Use any virtual model name from the pricing table above (prefixed with `oah/`). Works with the standard OpenAI SDK. Every request is PII-scanned before it reaches the upstream provider.
Get started with 1,000,000 free credits. Every Llama request is PII-scanned, cost-optimized, and fully logged — zero configuration.
Model registry last updated: . Pricing shown is the lowest available rate across providers (per 1M tokens, USD). Actual pricing depends on provider and plan.