DLP & guardrails
Requires an Enterprise or higher licence key.
DLP (Data Loss Prevention) is the org-wide policy layer on top of the Privacy Shield. Where the Privacy Shield silently redacts, DLP can block, alert, or interactively prompt.
What you can do
- Block prompts containing certain content (e.g. specific project codenames, internal classification labels).
- Alert org admins on policy hits via mail / Talk / webhook.
- Export an audit log of every blocked or redacted message for compliance review.
- Per-group exceptions — Legal can mention contract numbers, others can't.
- Per-agent exceptions — Customer-support agent never sees customer names plaintext.
- External-vs-internal model classification — different rules when the prompt goes to Anthropic SaaS vs your own self-hosted Ollama.
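The external-vs-internal distinction above can be sketched as a small classifier. This is illustrative only — the real `classifyProvider()` internals are not documented here, and the hostname heuristics and the known-region domain are invented for the example:

```typescript
// Hypothetical sketch of provider classification for DLP rule selection.
// Only illustrates the external / internal / known-region distinction.
type ProviderClass = "external" | "internal" | "known_region";

function classifyProvider(baseUrl: string): ProviderClass {
  const host = new URL(baseUrl).hostname;
  // Self-hosted models (e.g. Ollama on the local network) count as internal.
  if (host === "localhost" || host.endsWith(".internal") || host.startsWith("10.")) {
    return "internal";
  }
  // SaaS endpoints with a known data-residency guarantee (assumed domain).
  if (host.endsWith(".eu.example-llm.com")) {
    return "known_region";
  }
  return "external"; // everything else gets the strictest rules
}
```

An internal Ollama endpoint would then be matched against the lenient rule set, while an external SaaS endpoint gets the strict one.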
Action types
DLP scans return one of four actions:
| Action | Effect |
|---|---|
| `allow` | Send the prompt as-is to the model. |
| `redact` | Tokenise findings (using the Privacy Shield mechanism), send the tokenised text, and restore the tokens in the response. |
| `block` | Hard-reject the turn. The user sees an error: "Your message contains sensitive content (category) and was blocked by org policy." |
| `ask` | Interactive gate — show the user the findings and wait for them to choose redact / allow / block. |
Modes
The org-level dlpMode field selects the default behaviour for findings:
| Mode | When you'd pick it |
|---|---|
| `ask` | Knowledge-worker orgs. Users choose per finding. |
| `auto_redact` | Customer support / call centre. Tokenise silently; never expose findings to the model. |
| `block` | Legal / Finance / R&D. Refuse the turn entirely if forbidden content is found. |
A `dlpFailureMode` setting controls what happens if the DLP engine itself fails (e.g. a network blip to Azure):
- `fail_open` — assume `allow` and log a warning. Default for `ask` mode.
- `fail_closed` — assume `block`. Default for `block` mode.
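Resolving the effective action on engine failure is a one-line decision; a sketch (the function itself is illustrative, not Bee Flow's actual code — only the config values mirror the fields above):

```typescript
// Sketch: map dlpFailureMode to the action taken when the DLP engine
// is unreachable. fail_open trades safety for availability; fail_closed
// refuses the turn rather than risk a leak.
type DlpAction = "allow" | "redact" | "block" | "ask";
type FailureMode = "fail_open" | "fail_closed";

function resolveOnEngineFailure(mode: FailureMode): DlpAction {
  if (mode === "fail_open") {
    console.warn("DLP engine unreachable; failing open (allow)");
    return "allow";
  }
  return "block"; // fail_closed
}
```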
Execution flow
- User submits a turn.
- `dlpRunner.scanOutbound()` runs the prompt through PII detection + the custom-term scanner.
- The runner classifies the target LLM provider (`classifyProvider()` — external vs internal vs known-region).
- Org config (`dlpEnabled`, `dlpMode`, `dlpFailureMode`) is consulted.
- A result `{action, findings, redactedText, tokenMap}` is returned.
- The chat handler implements the action:
  - `allow` → forward the original prompt.
  - `redact` → forward `redactedText`, stash `tokenMap`.
  - `block` → emit an error event to the SSE stream and stop.
  - `ask` → emit a `dlp_finding` event; wait for the user's `POST /api/chat/:msgId/dlp-decision`.
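The handler's dispatch over the four actions can be sketched as follows. The `ScanResult` shape follows the `{action, findings, redactedText, tokenMap}` result above; the `forward`/`emit` helpers are assumptions for the sketch, not Bee Flow's real API:

```typescript
// Illustrative dispatch over the DLP scan result, matching the flow above.
interface ScanResult {
  action: "allow" | "redact" | "block" | "ask";
  findings: string[];
  redactedText?: string;
  tokenMap?: Record<string, string>;
}

function handleTurn(
  prompt: string,
  result: ScanResult,
  forward: (text: string) => void,          // send text to the model
  emit: (event: string, payload: unknown) => void, // push an SSE event to the client
): void {
  switch (result.action) {
    case "allow":
      forward(prompt); // original text, untouched
      break;
    case "redact":
      forward(result.redactedText ?? prompt); // tokenMap is stashed to de-tokenise the reply
      break;
    case "block":
      emit("error", { message: "Blocked by org policy", findings: result.findings });
      break;
    case "ask":
      emit("dlp_finding", { findings: result.findings }); // then wait for the user's decision
      break;
  }
}
```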
Audit log
Every match is recorded in the guardrail_events table:
| Field | Notes |
|---|---|
| `id` | UUID. |
| `organization_id` | Org tenant. |
| `user_id` | Who sent the prompt. |
| `agent_id` | Which agent. |
| `conversation_id` | Which conversation. |
| `violation_type` | `pii` / `custom_term` / `moderation` / `unicode_smuggling`. |
| `violation_categories` | Comma-joined labels (e.g. `email,phone,bsn`). |
| `direction` | `input` (prompt) or `output` (model reply / tool result). |
| `action_taken` | `block` / `redact` / `alert` / `stripped`. |
| `source` | `chat` / `automation` / `unknown`. |
| `model` | Provider + model name. |
| `timestamp` | UTC. |
The plaintext sensitive content is never stored — only the categories and the action.
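For orientation, a single row from the table above might serialise like this — every value here is invented for illustration; only the field names come from the schema:

```json
{
  "id": "7f3d2a90-5b1c-4e8a-9c21-0d4f6a8b1e33",
  "organization_id": "acme-corp",
  "user_id": "jdoe",
  "agent_id": "support-agent",
  "conversation_id": "c-10482",
  "violation_type": "pii",
  "violation_categories": "email,phone",
  "direction": "input",
  "action_taken": "redact",
  "source": "chat",
  "model": "anthropic/claude-sonnet",
  "timestamp": "2026-04-12T09:31:02Z"
}
```

Note that there is no field holding the matched text itself, consistent with the no-plaintext rule above.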
Querying the audit log
GET /api/guardrails/events?org=<id>&from=2026-04-01&to=2026-05-01&type=pii
Authorization: Bearer <admin_jwt>
Results can be filtered by user, agent, type, action, and time window. CSV export is available at /api/guardrails/events.csv.
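Assembling the filter query programmatically might look like this — the `org`, `from`, `to` and `type` parameters come from the example above, while the exact names of the user/agent/action filter parameters are assumptions:

```typescript
// Sketch: build a guardrail-events query URL from a filter map.
// Only org/from/to/type appear in the documented example; other
// parameter names (user, agent, action) are assumed.
function guardrailEventsUrl(base: string, filters: Record<string, string>): string {
  const qs = new URLSearchParams(filters).toString();
  return `${base}/api/guardrails/events?${qs}`;
}

// e.g. guardrailEventsUrl("https://bee.example.com", { org: "42", type: "pii" })
```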
Webhook export to SIEM
Real-time push to your SIEM:
In Settings → Organisation → Audit → Webhooks, add a target URL. Each event is POSTed there as JSON with an HMAC signature computed from a shared secret; delivery is retried 3× with exponential backoff.
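On the receiving side you would verify the HMAC before trusting the payload. A sketch in Node.js — the SHA-256 algorithm, header transport and hex encoding are assumptions; check the admin UI for the exact scheme Bee Flow uses:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sketch: verify a webhook body against its HMAC signature using the
// shared secret configured in the admin UI. Assumes SHA-256 + hex encoding.
function verifySignature(payload: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(payload).digest();
  const given = Buffer.from(signatureHex, "hex");
  // Length check first: timingSafeEqual throws on unequal lengths.
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```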
Web-search guardrail
If webSearchGuardEnabled is on (default), web-search results are passed through the same PII detector before they're injected into the agent's context. Useful when an agent searches news / Wikipedia and the snippets contain personal data — that data shouldn't reach the model unredacted.
Unicode-smuggling protection
Some prompt-injection attacks use zero-width Unicode characters (U+200B–U+200F, U+2060) to hide instructions. Bee Flow strips these from incoming payloads and logs each occurrence as a unicode_smuggling event. No user-visible action; pure defence.
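The stripping step over the documented code-point range can be sketched as a single regex pass (the logging call itself is omitted; only the character range comes from the text above):

```typescript
// Sketch: strip zero-width characters (U+200B–U+200F, U+2060) from a
// payload and count removals, so each occurrence can be recorded as a
// unicode_smuggling event.
const ZERO_WIDTH = /[\u200B-\u200F\u2060]/g;

function stripZeroWidth(text: string): { clean: string; removed: number } {
  const matches = text.match(ZERO_WIDTH);
  return { clean: text.replace(ZERO_WIDTH, ""), removed: matches ? matches.length : 0 };
}
```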
Moderation backend
Optional output moderation runs every model reply through Azure Content Safety (AZURE_CONTENT_SAFETY_*). If the moderator returns a violation, the reply is redacted (or blocked, depending on policy), and an audit log row is written. Useful for customer-facing agents where the org has reputational exposure.
Policy editor
Policies are defined in Settings → Organisation → DLP:
- Built-in PII categories — toggle each on / off, set action per category.
- Custom regex categories — name, pattern, category label, action.
- Per-group overrides — exempt a group from specific categories.
- Per-agent overrides — pin a specific policy to an agent.
The policy applies to all chat + automation runs in the org.
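A custom regex category as described above could be modelled and evaluated like this — the category shape (name, pattern, action) follows the policy editor fields, but the codename pattern `PROJECT-\d{4}` and the matching helper are invented for the example:

```typescript
// Sketch: evaluate custom regex categories against a prompt.
interface CustomCategory {
  name: string;
  pattern: RegExp;
  action: "allow" | "redact" | "block" | "ask";
}

function matchCategories(text: string, categories: CustomCategory[]): CustomCategory[] {
  return categories.filter((c) => c.pattern.test(text));
}

// Hypothetical NDA-codename category (pattern invented for illustration).
const codename: CustomCategory = {
  name: "project_codename",
  pattern: /PROJECT-\d{4}/,
  action: "block",
};
```

A per-group exception would simply drop the category from the list evaluated for that group's users.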
Compliance use cases
- GDPR Article 32 — demonstrate organisational measures by showing the full guardrail audit log.
- HIPAA — Standard mode + custom regex for ICD codes; export quarterly.
- PCI-DSS — auto-block credit-card numbers in any prompt, even if the user typed them by accident.
- NDA-protected projects — custom regex for codenames; block.
Where to next
- Privacy shield — the underlying detection engine.
- Admin → Audit & compliance — UI + retention policy.
- Reference → Telemetry — shipping events to your SIEM.