AI Gateway with PII Redaction for LLM Applications
Every time a developer sends a prompt to an LLM, there is a risk that the request contains personally identifiable information — email addresses, Social Security numbers, customer records, internal API keys, or medical data. Without an interception layer, that sensitive data flows directly to a third-party model provider and may be logged, cached, or used for training.
An AI Gateway with PII Redaction sits between your application and the LLM provider. It scans every prompt in real time, detects 28 entity types using a multi-layered AI Firewall, and either redacts or blocks the sensitive content — before the request ever leaves your infrastructure.
Why This Matters
Sending unfiltered prompts to LLMs creates regulatory, reputational, and security exposure that grows with every API call your organization makes.
- Regulatory violations — GDPR, HIPAA, CCPA, and PCI-DSS all impose fines for exposing protected data to unauthorized third-party processors.
- Training data leaks — Some providers may use API inputs for model fine-tuning, embedding your sensitive data permanently into their weights.
- Prompt logging — Provider-side request logs can persist for weeks. A single prompt containing an SSN or credit card number creates an indefinite liability window.
- Internal secret exposure — Developers routinely paste code snippets containing API keys, AWS credentials, and database connection strings into prompts.
Why PII Protection Matters for LLMs
Consider a common scenario in a healthcare application:
```json
{
  "model": "gpt-4.1",
  "messages": [
    {
      "role": "user",
      "content": "Summarize this patient record: John Smith, SSN 123-45-6789, DOB 03/15/1982, diagnosed with Type 2 diabetes on 01/10/2025. Prescribed Metformin 500mg."
    }
  ]
}
```

Without a gateway, this prompt — containing a real name, SSN, date of birth, and medical diagnosis — is sent unmodified to the model provider's servers.
Architecture: AI Gateway with PII Detection
OpenSourceAIHub implements a multi-stage pipeline that inspects every request before it reaches any downstream provider:
1. Prompt enters the gateway — Your application sends a standard OpenAI-compatible request to the Hub endpoint instead of directly to a provider.
2. PII entities detected — The AI Firewall scans every message field for 28 entity types using a combination of pattern matching, checksums, intelligent entity recognition, and context-aware heuristics.
3. Policies applied — Each detected entity is matched against the project's DLP policy. Per-entity rules determine whether to REDACT, BLOCK, or LOG the match.
4. Prompt redacted or blocked — Matched entities are replaced with type-safe tokens (e.g., [EMAIL_ADDRESS], [US_SSN]). If a BLOCK-level entity is found (like a prompt injection), the entire request is rejected with a 400 response.
5. Request forwarded — The cleaned prompt is routed to the selected LLM provider. The provider never sees the original sensitive data.
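The five stages above can be sketched end to end in a few lines. This is an illustrative toy, not the Hub's code: the detectors are placeholder regexes, and names like `scan_and_redact` and `POLICY` are hypothetical:

```python
import re

# Hypothetical per-entity policy: REDACT, BLOCK, or LOG (illustrative defaults).
POLICY = {"US_SSN": "REDACT", "EMAIL_ADDRESS": "REDACT", "PROMPT_INJECTION": "BLOCK"}

# Placeholder detectors; the real firewall layers pattern matching,
# checksums, entity recognition, and context heuristics.
DETECTORS = {
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL_ADDRESS": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PROMPT_INJECTION": re.compile(r"ignore (all )?previous instructions", re.I),
}

def scan_and_redact(prompt: str) -> str:
    """Redact policy-matched entities; reject the request on BLOCK matches."""
    for entity, pattern in DETECTORS.items():
        action = POLICY.get(entity, "LOG")
        if action == "BLOCK" and pattern.search(prompt):
            raise ValueError("400: security policy violation")
        if action == "REDACT":
            prompt = pattern.sub(f"[{entity}]", prompt)
    return prompt  # the cleaned prompt is then forwarded to the provider

print(scan_and_redact("Contact john.doe@email.com, SSN 123-45-6789"))
# → Contact [EMAIL_ADDRESS], SSN [US_SSN]
```

A real gateway would also record span offsets and actions for audit logging, as the violation response format later in this page suggests.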
The AI Firewall Detection Engine
Unlike simple regex-based filters, the OpenSourceAIHub firewall uses a multi-layered detection engine that combines four techniques to minimize false negatives without sacrificing latency:
Pattern Matching
High-precision patterns for structured formats like SSNs (XXX-XX-XXXX), credit card numbers (Luhn-validated), and API key prefixes (sk-*, ghp_*, AKIA*).
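Prefix-anchored secrets are a natural fit for this layer. A minimal sketch of what such patterns look like (illustrative patterns only, not the Hub's actual rule set):

```python
import re

# Prefix-anchored patterns for common secret formats (illustrative).
SECRET_PATTERNS = {
    "API_KEY": re.compile(r"\bsk-[A-Za-z0-9_-]{16,}"),       # OpenAI-style keys
    "GITHUB_TOKEN": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),  # GitHub PATs
    "AWS_ACCESS_KEY": re.compile(r"\bAKIA[A-Z0-9]{16}\b"),   # AWS access key IDs
}

def find_secrets(text: str) -> list[str]:
    """Return the entity names whose pattern matches the text."""
    return [name for name, pat in SECRET_PATTERNS.items() if pat.search(text)]

print(find_secrets("aws_key = 'AKIAIOSFODNN7EXAMPLE'"))  # ['AWS_ACCESS_KEY']
```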
Checksum Validation
Luhn algorithm for credit cards, mod-check for IBANs, and format-specific validation to eliminate false positives from random digit sequences.
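The Luhn check itself is a standard public algorithm and fits in a few lines (a sketch of the technique, not the Hub's internal code):

```python
def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right,
    subtract 9 from doubles above 9, and require sum % 10 == 0."""
    digits = [int(d) for d in number if d.isdigit()]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0

print(luhn_valid("4242-4242-4242-4242"))  # True: a well-known valid test card
print(luhn_valid("4242-4242-4242-4243"))  # False: fails the checksum
```

This is why random 16-digit sequences rarely trigger a CREDIT_CARD match: only about one in ten passes the checksum.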
Intelligent Entity Recognition
Identifies person names, locations, organizations, and dates that don't follow fixed patterns — catching 'John Smith' where pattern matching alone can't.
Context Heuristics
Surrounding text analysis to disambiguate. A 9-digit number near 'SSN' or 'social security' is scored higher than an isolated digit sequence.
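One way to implement such a heuristic is a proximity score over a context window. A minimal sketch, with hypothetical names and arbitrary score values:

```python
import re

CONTEXT_WORDS = ("ssn", "social security", "tax id")

def ssn_confidence(text: str, start: int, end: int, window: int = 30) -> int:
    """Score a formatted 9-digit candidate higher (0-100 scale)
    when SSN-related context words appear nearby."""
    context = text[max(0, start - window):min(len(text), end + window)].lower()
    score = 40  # base score for a bare formatted match
    if any(w in context for w in CONTEXT_WORDS):
        score += 50
    return score

text = "Patient SSN 123-45-6789 on file."
m = re.search(r"\b\d{3}-\d{2}-\d{4}\b", text)
print(ssn_confidence(text, m.start(), m.end()))  # 90: 'SSN' appears nearby
```

A gateway would then compare the score against a per-entity threshold before applying the REDACT or BLOCK action.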
This combined approach adds fewer than 50 milliseconds of latency per request for text and approximately 0.5–1 second for image payloads (vision security). Every response includes x-hub-scan-ms timing headers so you can verify performance in production.
Example: PII Redaction Flow
Before — Raw prompt

```text
Send this email to john.doe@email.com about invoice #99342 for customer James Wilson, card ending 4242-4242-4242-4242.
```

After — Redacted prompt

```text
Send this email to [EMAIL_ADDRESS] about invoice #99342 for customer [PERSON], card ending [CREDIT_CARD].
```
Supported Entity Types (28 total)
API_KEY, AWS_ACCESS_KEY, AWS_SECRET_KEY, PRIVATE_KEY, GITHUB_TOKEN, SLACK_WEBHOOK, CREDIT_CARD, IBAN_CODE, US_BANK_NUMBER, CRYPTO_ADDRESS, US_ITIN, EMAIL_ADDRESS, PHONE_NUMBER, US_SSN, US_PASSPORT, PERSON, STREET_ADDRESS, DATE_TIME, NRP, UK_NINO, UK_NHS_NUMBER, IP_ADDRESS, MAC_ADDRESS, LOCATION, URL, MEDICAL_LICENSE, US_DRIVER_LICENSE, PROMPT_INJECTION (blocked, not redacted)

Implementing PII Redaction in an AI Gateway
OpenSourceAIHub is a drop-in replacement for the OpenAI API. Point your existing SDK at the Hub endpoint and PII redaction happens automatically — no code changes beyond the base URL and API key:
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "os_hub_your_key_here",
  baseURL: "https://api.opensourceaihub.ai/v1",
});

const response = await client.chat.completions.create({
  model: "oah/gpt-4.1",
  messages: [
    {
      role: "user",
      content: "Summarize this patient record: John Smith, SSN 123-45-6789",
    },
  ],
});

// The Hub automatically:
// 1. Scans the prompt for PII entities
// 2. Redacts "John Smith" → [PERSON], "123-45-6789" → [US_SSN]
// 3. Forwards the cleaned prompt to the provider
// 4. Returns the response with the x-hub-scan-ms timing header
```

```python
from openai import OpenAI

client = OpenAI(
    api_key="os_hub_your_key_here",
    base_url="https://api.opensourceaihub.ai/v1",
)

response = client.chat.completions.create(
    model="oah/gpt-4.1",
    messages=[
        {
            "role": "user",
            "content": "Summarize this patient record: John Smith, SSN 123-45-6789",
        }
    ],
)

# PII is redacted before the request reaches OpenAI.
# Check response headers for scan timing and violation counts.
```

```shell
curl -X POST https://api.opensourceaihub.ai/v1/chat/completions \
  -H "Authorization: Bearer os_hub_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "oah/gpt-4.1",
    "messages": [
      {
        "role": "user",
        "content": "Summarize this patient record: John Smith, SSN 123-45-6789"
      }
    ]
  }'

# Response headers include:
# x-hub-scan-ms: 12
# x-hub-violations: PERSON,US_SSN
# x-hub-correlation-id: req_xxxx
```

Zero-config protection: Every request is scanned against the default “Maximum Protection” policy that covers all 28 entity types. For granular control, create custom DLP policies per project in the dashboard.
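If you want to consume those scan headers programmatically, a small parser is enough. A sketch assuming the header names shown above; `parse_scan_headers` is a hypothetical helper, not part of any SDK:

```python
def parse_scan_headers(headers: dict[str, str]) -> tuple[int, list[str]]:
    """Parse the Hub's scan headers into (scan_ms, violated_entities)."""
    scan_ms = int(headers.get("x-hub-scan-ms", "0"))
    raw = headers.get("x-hub-violations", "")
    violations = [v for v in raw.split(",") if v]
    return scan_ms, violations

ms, entities = parse_scan_headers(
    {"x-hub-scan-ms": "12", "x-hub-violations": "PERSON,US_SSN"}
)
print(ms, entities)  # 12 ['PERSON', 'US_SSN']
```

Feeding these values into your own metrics pipeline lets you alert on latency regressions or unexpectedly frequent violations.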
Violation Response Format
When the firewall detects a BLOCK-level entity (like a prompt injection attempt), it rejects the request immediately with a structured error response:
```json
{
  "error": {
    "message": "Security policy violation: request blocked.",
    "type": "security_violation",
    "code": 400,
    "violations": [
      {
        "entity": "PROMPT_INJECTION",
        "action": "BLOCK",
        "start": 0,
        "end": 47
      },
      {
        "entity": "US_SSN",
        "action": "REDACT",
        "start": 52,
        "end": 63
      }
    ],
    "correlation_id": "req_a1b2c3d4"
  }
}
```

Supported AI Providers
PII redaction works identically across all providers. Use a single gateway endpoint and virtual model names (oah/*) — the Hub handles provider routing automatically:
Benefits of an AI Security Gateway
Prevent data leaks
PII is redacted before it leaves your infrastructure. The model provider never sees raw sensitive data.
Central policy enforcement
Define DLP policies once and apply them to every model, every provider, every request — from a single dashboard.
Provider-agnostic governance
Switch between OpenAI, Groq, Anthropic, or any provider. The same security policies follow your traffic.
Audit logging
Every scan result is logged with entity types, actions taken, and correlation IDs — ready for compliance audits.
Compliance readiness
Demonstrate GDPR, HIPAA, and PCI-DSS controls with documented, automated PII handling across all AI integrations.
Sub-50ms latency
The firewall adds fewer than 50ms per text request. Verify with the x-hub-scan-ms response header.
Try It with OpenSourceAIHub
Get started in under five minutes. No credit card required for the free tier — every request is protected from your very first API call.
Related Documentation
- LLM Budget Enforcement — Token quotas, threshold alerts & recursive loop protection
- OpenAI-Compatible Proxy — Drop-in replacement for the OpenAI SDK
- OpenRouter Alternative — AI gateway with built-in governance
- Vercel AI Gateway Alternative — Active security vs passive logging
- AI Firewall (DLP) — Full entity reference and policy configuration
- Quickstart — Connect your first application in 2 minutes
- Billing & Wallet Docs — Credit system, top-ups, and deduction mechanics
- Model Catalog — Pricing across 100+ models and 9 providers
- Enterprise Security & Trust Center
- Product Roadmap — Phase 1.1 Budget Enforcement & beyond