Telemetry

What Bee Flow logs, how to ship it elsewhere, and what guarantees we make about PII in logs.

Implementation status

The sections below describe the target observability stack. Today's reality:

  • Logs: plain console.* JSON (no structured Pino logger wired yet). Captured by docker logs or the container runtime.
  • Metrics: only the ticket-assistant route exposes Prometheus-style metrics (server/routes/ticketAssistant.js). There is no global /metrics endpoint, no OTLP exporter, and no app-wide HTTP / DB instrumentation.
  • Audit log: real and complete — guardrail_events, admin_audit_events, automation_runs tables in Postgres. See Admin → Audit & compliance.

Sections explicitly marked Planned below are roadmap. Treat them as design intent, not current behaviour.

What's logged

Bee Flow's server emits structured JSON logs covering:

Category | Examples
HTTP requests | method, path, status, duration_ms, requestId
Auth | login attempts, JWT issuance, OAuth callbacks (no tokens logged)
Chat | conversationId, messageId, agentId, model, token usage
Tool calls | toolName, durationMs, success boolean (no payloads at default level)
Errors | stack trace, requestId
Migrations | which migration applied, duration
Background jobs | NC sync, KB ingestion, audit retention purge
License | tier, lastVerifiedAt, refresh ticks

At LOG_LEVEL=debug, the server also logs request bodies (PII-redacted by the same Privacy Shield used at runtime) and full tool call arguments and results.

What's NEVER logged

  • OAuth access tokens / refresh tokens
  • Session JWTs
  • API keys
  • Licence private key material (we don't have it in the open-source server)
  • Plaintext PII matched by the Privacy Shield (only the categories + counts)

This is true at every log level.

Log destinations

Mode | How
LOG_FORMAT=json (default) | Stdout. Pipe to your log collector.
LOG_FORMAT=pretty | Human-readable. For local dev only.
Sentry | Set SENTRY_DSN. Errors go to Sentry; the rest stays in stdout.

Shipping to a SIEM / log store

Three good patterns:

1. Loki (Grafana stack)

# Docker Compose snippet
services:
  beeflow-server:
    logging:
      driver: "loki"
      options:
        loki-url: http://loki:3100/loki/api/v1/push
        loki-batch-size: "100"

Then query in Grafana with {app="beeflow"}.

2. Vector → Elastic / Splunk

Run Vector as a sidecar, with the Docker JSON logs as its source and your destination as its sink. Vector handles batching, retries, and transformation.

3. CloudWatch (AWS)

services:
  beeflow-server:
    logging:
      driver: awslogs
      options:
        awslogs-group: /beeflow/server
        awslogs-region: eu-west-1

Audit log shipping (Enterprise+)

Audit events use a different mechanism from app logs: Bee Flow can push guardrail events in real time to a webhook.

Setting | Value
URL | Your SIEM ingest endpoint
Shared secret | HMAC-SHA256 secret
Severity filter | low / medium / high
Retry | 3× with exponential backoff

Configure the webhook in Admin → Audit & compliance → Webhooks. Each event is sent in this format:

{
  "id": "ev_abc",
  "organizationId": "org_123",
  "userId": "u_alice",
  "agentId": "asst_xyz",
  "violationType": "pii",
  "violationCategories": "email,phone",
  "direction": "input",
  "actionTaken": "redact",
  "model": "claude-opus-4-7",
  "timestamp": "2026-05-09T13:30:00Z"
}

Verify the X-Beeflow-Sig HMAC on your end before processing.
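
A minimal sketch of that check on the receiving side, in Python. This page does not specify how the signature is encoded or exactly which bytes are signed, so the sketch assumes X-Beeflow-Sig carries the hex digest of HMAC-SHA256 computed over the raw request body with the shared secret. Confirm both assumptions against real deliveries before relying on it. The function name verify_signature is illustrative, not part of Bee Flow.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, signature_header: str, secret: str) -> bool:
    """Return True if signature_header matches HMAC-SHA256(secret, raw_body).

    ASSUMPTION: the header value is the hex digest of the raw request body.
    The encoding is not documented here; adjust if Bee Flow uses base64
    or a prefixed format.
    """
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = signature_header.strip().lower()
    # compare_digest takes time independent of where the values differ,
    # which avoids leaking the expected signature through timing.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


# Demo with made-up values: a correctly signed body and a forged signature.
demo_secret = "example-shared-secret"
demo_body = b'{"id": "ev_abc", "violationType": "pii"}'
demo_sig = hmac.new(demo_secret.encode("utf-8"), demo_body, hashlib.sha256).hexdigest()
print(verify_signature(demo_body, demo_sig, demo_secret))
print(verify_signature(demo_body, "0" * 64, demo_secret))
```

Compute the HMAC over the bytes exactly as received, before any JSON parsing or re-serialisation, since re-encoding can change whitespace or key order and invalidate the signature.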

Metrics (Prometheus) — Planned

A global /metrics endpoint is not yet exposed by the server. The only Prometheus-formatted endpoint that ships today is the ticket-assistant's internal /metrics (mounted under /api/ticket-assistant), gated by the ticket_assistant license feature.

The target series listed below are the design for an upcoming server-wide /metrics endpoint. If you need observability today, parse the JSON log stream into your metrics backend (Loki + recording rules works well).

Planned series:

Metric | Labels | Type
beeflow_http_requests_total | method, route, status | counter
beeflow_http_request_duration_seconds | method, route, status | histogram
beeflow_chat_turns_total | tier, model | counter
beeflow_chat_turn_duration_seconds | tier, model | histogram
beeflow_chat_tokens_total | direction, model | counter
beeflow_tool_calls_total | tool_name, status | counter
beeflow_db_pool_in_use | (none) | gauge
beeflow_db_pool_max | (none) | gauge
beeflow_redis_connected | (none) | gauge (0/1)
beeflow_migrations_applied_total | (none) | counter
beeflow_guardrail_events_total | violation_type, action | counter
beeflow_license_active | tier | gauge (0/1 per tier)
beeflow_license_expires_at | (none) | gauge (unix seconds)
beeflow_users_active_total | (none) | gauge
beeflow_users_messages_month_total | (none) | counter

Tracing (OpenTelemetry) — Planned

The OTLP exporter and span instrumentation are not yet wired. The intended design is on the roadmap: spans for HTTP requests, chat turns (with model and tool sub-spans), DB queries, and integration tool calls, configured via OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_SERVICE_NAME.

Health checks

Endpoint | Purpose
GET /api/health | Liveness. Always returns 200 if the process is up.
GET /api/guard/health | Guard sidecar liveness. Returns not-configured when GUARD_SERVICE_URL is unset.

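Sub-endpoints for DB, Redis, and per-integration readiness are not currently exposed. For Kubernetes probes, point both livenessProbe and readinessProbe at /api/health. The server fails to start if Postgres is unreachable, so a 200 from /api/health confirms that the database was reachable at startup. Because the endpoint returns 200 whenever the process is up, it does not detect a database connection that is lost later.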

Privacy / GDPR notes

For self-hosters running with EU users:

  • Set LOG_LEVEL=info (not debug). Debug-level logs include redacted but more verbose payloads: still no plaintext PII, but more inferable structure.
  • Set EU_MODE_ENABLED=true (org config). This anonymises IPs, drops User-Agent headers from log lines, and shortens the default audit retention to 30 days.
  • Strip request IDs from outgoing logs if you don't want them correlatable across services.
  • Ship logs to an EU-region log store.
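
For the request ID point in the list above, one option is a small filter between the server's stdout and your log shipper. The Python sketch below is illustrative only. It assumes LOG_FORMAT=json and that the ID appears as a top-level requestId field, as in the "What's logged" table. Nested occurrences are not touched, and the function name is hypothetical.

```python
import json
from typing import Iterable, Iterator


def strip_request_ids(lines: Iterable[str], field: str = "requestId") -> Iterator[str]:
    """Yield each log line with the top-level request ID field removed.

    Lines that are not JSON objects are passed through unchanged
    (minus the trailing newline).
    """
    for line in lines:
        text = line.rstrip("\n")
        try:
            record = json.loads(text)
        except json.JSONDecodeError:
            yield text
            continue
        if isinstance(record, dict):
            record.pop(field, None)  # no error if the field is absent
            yield json.dumps(record, separators=(",", ":"), ensure_ascii=False)
        else:
            yield text
```

To use it as a pipe stage, iterate over sys.stdin and print each yielded line. Dropping the ID also removes your own ability to correlate log lines for a single request, so apply the filter only on the path that leaves your environment.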

Self-host opt-out for telemetry

Bee Flow does not phone home except for licence-server refresh checks. To disable that:

echo 'BEEFLOW_LICENSE_REFRESH_URL=' >> .env

The licence stays valid through exp regardless. This is the only outbound connection the server initiates without user action; every other outbound connection is made in response to a user prompt or tool call.