How to Add Enterprise-Grade AI DLP to Your App in 60 Seconds
Your users are pasting Social Security numbers, API keys, and medical records into AI prompts right now. Every unprotected call to an LLM is a potential data breach — one that reaches a third-party server before anyone on your team can review it.
The solution is AI Data Loss Prevention (DLP): an inline firewall that scans every prompt for sensitive data and either redacts or blocks it before the request ever reaches the model provider.
With OpenSourceAIHub, adding enterprise-grade AI PII protection to any application takes a single code change — swap the base URL, and every request is automatically scanned for 28 entity types across 100+ models from 9 providers.
Not sure if your prompts are leaking data? Try our free AI Leak Checker — paste any prompt and see what a real DLP engine detects. No account required.
The 60-Second Integration
OpenSourceAIHub is an OpenAI-compatible proxy. That means any application using the OpenAI SDK — whether it's Python, Node.js, Go, or a cURL script — can add LLM data loss prevention by changing two lines of configuration.
Sign up and get your Hub key
Create a free account at opensourceaihub.ai. You receive 1,000,000 free credits — enough for thousands of protected API calls. Generate a Hub API key from the API Keys page.
Swap the base URL
Replace your existing provider's base URL with the Hub endpoint. Your existing model names continue to work — or use the oah/ prefix for smart routing across providers.
Deploy — you're protected
Every request is now scanned for 28 sensitive entity types. PII is redacted before it reaches the model provider. Check the response headers for scan timing and violation counts.
Before (direct to the provider):

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-openai-key",
    # base_url defaults to https://api.openai.com/v1
)

# ⚠️ This prompt reaches OpenAI's servers with raw PII:
response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Summarize: John Smith, SSN 123-45-6789"}],
)
```

After (through the Hub):

```python
from openai import OpenAI

client = OpenAI(
    api_key="os_hub_your_key_here",                # ← Hub key
    base_url="https://api.opensourceaihub.ai/v1",  # ← Hub endpoint
)

# ✅ PII is automatically redacted before reaching the provider:
# "John Smith" → [PERSON], "123-45-6789" → [US_SSN]
response = client.chat.completions.create(
    model="oah/gpt-4.1",
    messages=[{"role": "user", "content": "Summarize: John Smith, SSN 123-45-6789"}],
)

# Response headers include:
# x-hub-scan-ms: 12
# x-hub-violations: PERSON,US_SSN
```

That's it. Two lines changed. Every subsequent request — whether it's a customer support bot, a code assistant, or an internal analytics tool — is now scanned for PII before it leaves your infrastructure.
What Gets Detected
The OpenSourceAIHub AI Firewall detects 28 entity types using a multi-layered engine combining pattern matching, checksum validation, intelligent entity recognition, and context-aware heuristics. For a deep technical dive into how the engine works, see our AI Gateway with PII Redaction guide.
Personal Data
Names, emails, phone numbers, SSNs, passport numbers, dates of birth, street addresses
Financial Data
Credit card numbers (Luhn-validated), IBANs, bank account numbers, crypto wallet addresses
Developer Secrets
API keys (sk-*, ghp_*, AKIA*), AWS secret keys, private keys, Slack webhooks, GitHub tokens
Network & Medical
IP addresses, MAC addresses, medical license numbers, driver's licenses, UK NINO/NHS numbers
Plus prompt injection defense: the firewall detects and blocks jailbreak attempts before they reach the model, returning a structured 400 error with the specific violation details.
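In application code you will typically want to catch that rejection and turn it into a user-facing message. A hedged sketch, assuming the 400 body is JSON carrying the fields described above — the exact field names used here (`violations`, `entity_type`, `position`, `correlation_id`) are our assumption, not a documented schema, so check a real rejection in your own logs:

```python
import json

def describe_block(error_body: str) -> str:
    """Turn a firewall rejection body into a short user-facing message.

    The payload shape is an assumption based on the fields the firewall
    is described as returning (entity type, position, correlation ID).
    """
    payload = json.loads(error_body)
    entities = [v["entity_type"] for v in payload.get("violations", [])]
    ref = payload.get("correlation_id", "n/a")
    return f"Request blocked: detected {', '.join(entities)} (ref {ref})"


# Illustrative rejection body:
body = '{"violations": [{"entity_type": "CREDIT_CARD", "position": 41}], "correlation_id": "abc-123"}'
print(describe_block(body))  # Request blocked: detected CREDIT_CARD (ref abc-123)
```

Surfacing the correlation ID to the user makes it easy to match a support ticket against the audit log entry for that scan.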
Node.js Integration
The same pattern works with the OpenAI Node.js SDK. For a complete SDK integration guide including error handling, see OpenAI-Compatible Proxy.
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "os_hub_your_key_here",
  baseURL: "https://api.opensourceaihub.ai/v1",
});

const chat = await client.chat.completions.create({
  model: "oah/llama-3.3-70b", // Smart-routed across providers
  messages: [
    {
      role: "user",
      content: "Draft an email to jane.doe@acme.com about customer " +
        "James Wilson, card 4242-4242-4242-4242.",
    },
  ],
});

// The provider receives:
// "Draft an email to [EMAIL_ADDRESS] about customer
// [PERSON], card [CREDIT_CARD]."
console.log(chat.choices[0].message.content);
```

What Makes This Enterprise-Grade
Regex-only filters catch formatted identifiers but miss context-dependent PII such as names and addresses. The OpenSourceAIHub firewall is built for production workloads that demand accuracy, speed, and auditability.
Per-Project DLP Policies
Configure entity-level rules per project: REDACT emails but BLOCK credit cards. Apply different policies to staging vs production environments.
Sub-50ms Scan Latency
The firewall adds ~30–50ms per text request. Every response includes an x-hub-scan-ms header so you can verify in production.
Compliance-Ready Audit Logs
Every scan result is logged with entity types, actions taken, timestamps, and correlation IDs — ready for GDPR, HIPAA, and PCI-DSS audits.
Provider-Agnostic
The same DLP policies apply whether you route to OpenAI, Anthropic, Groq, Together.ai, Google Gemini, Mistral, xAI, or AWS Bedrock.
Smart Router Cost Savings
The Hub's smart router automatically selects the cheapest provider for each model family. Typical savings: 40–60% vs direct provider pricing.
Wallet & Budget Controls
Pre-loaded credit wallet with per-project budget limits. No surprise bills — requests are rejected when the budget is exhausted.
For a detailed breakdown of budget enforcement, token quotas, and threshold alerts, see LLM Budget Enforcement.
See It in Action
Before writing any code, you can test the DLP engine directly in your browser:
Free AI Leak Checker
Paste any prompt and see exactly what PII the engine detects — entity types, risk level, and a redacted version of your text. No signup required.
Try the Leak Checker →
Interactive Playground
After signing up, test the full pipeline in the Playground: send a prompt with PII, watch it get redacted in real time, and see the clean response from the model.
Sign Up & Try the Playground →
Real-World Use Cases for AI DLP
Healthcare AI assistants
Patient names, SSNs, diagnoses, and medication records are redacted before reaching the model. HIPAA compliance is maintained without slowing down the clinical workflow.
Customer support chatbots
Support agents paste customer records into AI tools daily. DLP ensures credit card numbers, email addresses, and account IDs never reach the model provider's logs.
Developer copilots & code assistants
Developers routinely paste code containing API keys, database connection strings, and AWS credentials. The firewall catches these before they're sent to the AI provider.
Legal document summarization
Contracts and case files contain names, addresses, SSNs, and financial data. DLP redacts all PII while preserving the document structure the model needs to generate useful summaries.
Financial analysis & reporting
Internal financial data, account numbers, and transaction details are protected when teams use AI for report generation and data analysis.
Frequently Asked Questions
Does this work with my existing OpenAI / Anthropic / Groq integration?
Yes. OpenSourceAIHub is a drop-in replacement. Change the base URL and API key — no other code changes needed. Existing model names work, or use the oah/ prefix for smart routing.
How much latency does the DLP scan add?
Typically 30–50ms for text requests. Every response includes the x-hub-scan-ms header so you can measure it in production. Image/vision requests add ~0.5–1 second.
Can I customize which entities are redacted vs blocked?
Absolutely. Create per-project DLP policies in the dashboard. Set each entity type to REDACT, BLOCK, or LOG independently. Different projects can have different policies.
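Conceptually, such a policy is just a map from entity type to action. A purely illustrative sketch (the real configuration lives in the dashboard; the entity names follow this article, but the data shape below is our own):

```python
# Illustrative only: a per-project DLP policy as entity → action.
# The three actions mirror the modes described above.
staging_policy = {
    "EMAIL_ADDRESS": "REDACT",  # mask the value, let the request through
    "CREDIT_CARD":   "BLOCK",   # reject the request outright
    "PERSON":        "LOG",     # allow, but record the detection
}

def action_for(policy: dict, entity: str) -> str:
    # Defaulting unknown entities to REDACT is our assumption,
    # not documented Hub behavior.
    return policy.get(entity, "REDACT")

print(action_for(staging_policy, "CREDIT_CARD"))  # BLOCK
print(action_for(staging_policy, "US_SSN"))       # REDACT
```

The same shape extends naturally to one policy per environment, e.g. a stricter mapping for production than for staging.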
Is my data stored?
Prompts and responses are never persisted. Only metadata (timestamps, entity types detected, latency, cost) is retained for your dashboard analytics. See our Security & Trust Center for details.
What happens when a BLOCK-level entity is detected?
The request is rejected immediately with a 400 response containing the violation details (entity type, position, correlation ID). The prompt is never forwarded to the model provider.
Start Protecting Your AI Calls Today
Every unprotected API call is a liability. Add enterprise-grade AI DLP in 60 seconds — free tier includes 1,000,000 credits, no credit card required.
Related Documentation
- AI Gateway with PII Redaction — Deep technical dive into the detection engine
- OpenAI-Compatible Proxy — Drop-in replacement for the OpenAI SDK
- LLM Budget Enforcement — Token quotas, threshold alerts & cost control
- OpenRouter Alternative — AI gateway with built-in governance
- Vercel AI Gateway Alternative — Active security vs passive logging
- Free AI Leak Checker — Test your prompts for PII leaks (no account required)
- Model Catalog — Pricing across 100+ models and 9 providers
- Enterprise Security & Trust Center
Join the Community