Audit Logging System

DiGAV §4(3) / DSGVO Art. 9 / Art. 5(2) design doc — audit_logs schema, RLS policies, retention, implementation priority. Zero audit infra exists today.

Status: Design doc. Zero audit infrastructure exists in production today.
Current as of: 2026-04-18.
Audience: DiGA reviewer, BfArM consultation, implementation engineer.
Primary driver: DiGA filing blocker. Audit logging is mandatory under DiGAV §4(3) and DSGVO accountability requirements.

1. Why this matters

Regulatory obligations

Regulation	Requirement
DiGAV §4(3)	All access to patient health data must be logged and attributable to an actor
DSGVO Art. 9	Health data is a special category — processing requires provable lawful basis + traceability
DSGVO Art. 5(2)	Controller must be able to demonstrate compliance (accountability principle)
DSGVO Art. 15	Data subject has right to know what data is processed about them — requires access log
DSGVO Art. 17	Right to erasure — erasure events themselves must be recorded
BSI TR-03161	Data security certificate mandatory for DiGA since Jan 2025 — requires auditability

Current state (verified against repo at 2026-04-18)

No audit_logs table. supabase/migrations/ contains 5 migrations (latest 20260407_community_post_limit_2.sql) — none create an audit schema.
No logging middleware. apps/web/lib/supabase/middleware.ts handles auth + security headers only.
Admin actions are silent. Ban / moderation / role-change routes have no audit trail.
Chat with document context is silent. /api/chat/route.ts injects master.md into LLM context (Plus/Premium gate via checkFeatureAccess) with no record of which document lines were read.
Document CRUD is silent. All 10 sub-routes under apps/web/app/api/documents/* operate through RLS but log nothing centrally.

This is the single largest DiGA filing gap.

2. Events that MUST be logged

Organized by regulatory priority:

P0 — PHI access (DiGAV §4(3))

Event	Source	Why
`document.upload`	`/api/documents/upload-url` + `confirm-upload`	Patient uploads PHI
`document.read`	`/api/documents/[documentId]` GET + `download-url`	PHI read
`document.extract_approved`	`/api/documents/approve-extraction`	Patient accepts Textract output into `master.md`
`document.extract_discarded`	`/api/documents/discard-extraction`	Patient rejects extracted text
`document.delete`	`/api/documents/[documentId]` DELETE	PHI erasure
`document.export`	`/api/documents/export`	DSGVO Art. 15 response
`document.share`	`/api/documents/share`	PHI leaves sole-patient scope
`document.simplify`	`/api/documents/simplify`	LLM processes PHI
`document.summarize`	`/api/documents/summarize`	LLM processes PHI
`chat.message_with_doc_context`	`/api/chat` when `master.md` injected	PHI fed to LLM

P1 — Admin + auth events

Event	Source	Why
`admin.user_banned`	admin moderation route	Access control action
`admin.content_moderated`	community moderation	Platform enforcement
`admin.role_changed`	any role mutation	Privilege escalation audit
`admin.moderation_analyze`	`/api/moderation/analyze`	OpenAI call on user content
`auth.login_success` / `auth.login_failure`	Supabase Auth webhook or middleware	Access attribution
`auth.password_reset`	Supabase	Account recovery trail

P2 — DSGVO subject rights

Event	Source	Why
`dsgvo.article_15_request`	account data-export endpoint	Right to access
`dsgvo.article_17_request`	account deletion endpoint	Right to erasure
`dsgvo.consent_changed`	privacy settings toggles	Lawful basis record

3. Proposed schema

Append-only table, RLS-enforced. Designed to be extended without schema changes via metadata JSONB.

-- supabase/migrations/2026XXXX_audit_logs.sql

CREATE TABLE audit_logs (
  id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  created_at    TIMESTAMPTZ NOT NULL DEFAULT now(),

  -- WHO
  actor_id      UUID REFERENCES auth.users(id) ON DELETE SET NULL,
  actor_role    TEXT NOT NULL CHECK (actor_role IN ('user','admin','service','anon','system')),

  -- WHAT
  action        TEXT NOT NULL,   -- e.g. 'document.read', 'admin.user_banned'
  resource_type TEXT,            -- e.g. 'document', 'user', 'chat_session'
  resource_id   TEXT,            -- UUID or composite ID as text

  -- WHERE / HOW
  ip_address    INET,
  user_agent    TEXT,

  -- EXTENSIBLE
  metadata      JSONB NOT NULL DEFAULT '{}'::jsonb
);

-- Index for right-to-access queries (Art. 15)
CREATE INDEX idx_audit_logs_actor_created
  ON audit_logs (actor_id, created_at DESC);

-- Index for resource-focused lookups (who touched this document)
CREATE INDEX idx_audit_logs_resource
  ON audit_logs (resource_type, resource_id, created_at DESC);

-- Index for action-type analytics
CREATE INDEX idx_audit_logs_action_created
  ON audit_logs (action, created_at DESC);

-- Append-only for client roles: no UPDATE, no DELETE. The retention job (§6)
-- runs as the function owner and only anonymizes; it does not delete rows.
REVOKE UPDATE, DELETE ON audit_logs FROM PUBLIC, authenticated, anon;

ALTER TABLE audit_logs ENABLE ROW LEVEL SECURITY;

-- Users see their own records (right to access)
CREATE POLICY "users_select_own"
  ON audit_logs FOR SELECT
  USING (actor_id = auth.uid());

-- Admins see everything
CREATE POLICY "admins_select_all"
  ON audit_logs FOR SELECT
  USING (
    EXISTS (
      SELECT 1 FROM user_roles
      WHERE user_id = auth.uid() AND role = 'admin'
    )
  );

-- Service-role inserts only (via server-side API), never from client
CREATE POLICY "service_insert_only"
  ON audit_logs FOR INSERT
  WITH CHECK (auth.role() = 'service_role');

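-- No row-level UPDATE or DELETE policy → no client role can modify history.
-- service_role and the table owner bypass RLS, so those credentials must stay server-side.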

Design notes:

actor_id is nullable with ON DELETE SET NULL so that deleting a user (Art. 17) does not cascade-delete their audit trail. The accountability obligation survives erasure of the data subject's account.
actor_role is denormalized so role changes don't retroactively alter history.
metadata JSONB absorbs future fields (e.g., {"document_line_count": 37, "llm_model": "openai/gpt-4o-mini"}) without migrations.
No FK from resource_id to specific tables — polymorphic. Reads join via resource_type + resource_id.
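
Because metadata is open-ended, it is also the easiest place for PHI to leak into the audit trail by accident. The following is a minimal sketch of a pre-insert guard, not part of the proposed schema: `checkMetadata`, the forbidden key names, and the size cap are all hypothetical and would need to be agreed on before use. It only inspects top-level keys.

```typescript
// Hypothetical guard for the metadata JSONB column. The key list and the
// size cap are illustrative assumptions, not part of the proposed schema.
type AuditMetadata = Record<string, unknown>;

const FORBIDDEN_KEYS = new Set(["message_body", "document_text", "cookie"]);
const MAX_METADATA_CHARS = 2048;

type MetadataCheck = { ok: true } | { ok: false; reason: string };

function checkMetadata(metadata: AuditMetadata): MetadataCheck {
  // Only top-level keys are checked; nested objects are not walked.
  for (const key of Object.keys(metadata)) {
    if (FORBIDDEN_KEYS.has(key)) {
      return { ok: false, reason: `forbidden key: ${key}` };
    }
  }
  // Serialized length is a cheap signal that document text was pasted in.
  const size = JSON.stringify(metadata).length;
  if (size > MAX_METADATA_CHARS) {
    return { ok: false, reason: `metadata too large: ${size} chars` };
  }
  return { ok: true };
}
```

A caller could run this before the insert and drop or truncate the metadata when the check fails, so the audit row itself is still written.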

4. RLS policy matrix

Role	SELECT	INSERT	UPDATE	DELETE
`anon`	—	—	—	—
`authenticated` (self)	✅ own rows	—	—	—
`authenticated` (admin)	✅ all rows	—	—	—
`service_role` (server)	✅ all	✅	—	—
Retention job	✅ all	—	✅ anonymize only (see §6)	—

The only path that can write is server-side via the service-role key. This prevents spoofing from a compromised client session.
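
One way to keep the matrix from drifting away from the migration is to encode the request-facing rows as data and assert against them in tests. This is a sketch only: `Principal`, `AUDIT_LOG_MATRIX`, and `isAllowed` are hypothetical names, and the retention job is left out because it runs inside the database rather than through a request.

```typescript
// The request-facing rows of the RLS matrix, encoded as data so a test can
// compare them with what the database actually allows.
type Principal = "anon" | "authenticated_self" | "authenticated_admin" | "service_role";
type Operation = "select" | "insert" | "update" | "delete";

const AUDIT_LOG_MATRIX: Record<Principal, readonly Operation[]> = {
  anon: [],
  authenticated_self: ["select"], // own rows only
  authenticated_admin: ["select"], // all rows
  service_role: ["select", "insert"],
};

function isAllowed(principal: Principal, op: Operation): boolean {
  return AUDIT_LOG_MATRIX[principal].includes(op);
}
```

An integration test could loop over every principal and operation, attempt the statement against a test database, and compare the outcome with `isAllowed`.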

5. Middleware + logger API

Proposed helper in apps/web/lib/audit/logger.ts:

import { createAdminClient } from "@/lib/supabase/admin";

export type AuditAction =
  | `document.${'upload'|'read'|'extract_approved'|'extract_discarded'|'delete'|'export'|'share'|'simplify'|'summarize'}`
  | `chat.message_with_doc_context`
  | `admin.${'user_banned'|'content_moderated'|'role_changed'|'moderation_analyze'}`
  | `auth.${'login_success'|'login_failure'|'password_reset'}`
  | `dsgvo.${'article_15_request'|'article_17_request'|'consent_changed'}`;

export async function recordAudit(event: {
  action: AuditAction;
  actor_id: string | null;
  actor_role: 'user'|'admin'|'service'|'anon'|'system';
  resource_type?: string;
  resource_id?: string;
  ip_address?: string;
  user_agent?: string;
  metadata?: Record<string, unknown>;
}): Promise<void> {
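  // Do NOT throw: an audit failure must never block the operation. It MUST
  // also emit a metric for incident review (not wired up in this sketch).
  try {
    const admin = createAdminClient();
    const { error } = await admin.from("audit_logs").insert({
      ...event,
      metadata: event.metadata ?? {},
    });
    if (error) console.error("[audit] insert failed", { action: event.action, error });
  } catch (error) {
    // Covers client construction and network errors that surface as exceptions.
    console.error("[audit] insert threw", { action: event.action, error });
  }
}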

Call sites (implementation order in §7):

Every P0 document route wraps its success path with await recordAudit(...)
/api/chat/route.ts calls it when documentContext is non-empty
Admin routes call it on state-changing branches
Auth events consumed from Supabase webhook (separate endpoint)
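
As an illustration of a call site, the chat route's decision can be isolated in a pure helper: build an event only when document context was injected, and record a revision reference rather than content. This is a sketch under assumptions. `chatAuditEvent`, `masterRevision`, `sessionId`, and the metadata key names are hypothetical and do not come from the existing route.

```typescript
// Sketch of the chat call site's decision: log only when document context
// was injected, and record a revision reference instead of any content.
// `masterRevision` and the metadata key names are assumptions.
type ActorRole = "user" | "admin" | "service" | "anon" | "system";

interface ChatAuditEvent {
  action: "chat.message_with_doc_context";
  actor_id: string | null;
  actor_role: ActorRole;
  resource_type: "chat_session";
  resource_id: string;
  metadata: { master_revision: string; document_line_count: number };
}

function chatAuditEvent(input: {
  actorId: string | null;
  actorRole: ActorRole;
  sessionId: string;
  documentContext: string;
  masterRevision: string;
}): ChatAuditEvent | null {
  // Empty context means no master.md content was injected: nothing to record.
  if (input.documentContext.trim() === "") return null;
  return {
    action: "chat.message_with_doc_context",
    actor_id: input.actorId,
    actor_role: input.actorRole,
    resource_type: "chat_session",
    resource_id: input.sessionId,
    metadata: {
      master_revision: input.masterRevision,
      // Line count only; the text itself is deliberately left out.
      document_line_count: input.documentContext.split("\n").length,
    },
  };
}
```

The route would pass a non-null result to `recordAudit` after the LLM call succeeds and skip the call otherwise.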

6. Retention

Minimum: 1 year (DiGAV interpretation — consult legal before filing). Proposed: 3 years then anonymize (not delete) via pg_cron job.

-- supabase/migrations/2026XXXX_audit_retention.sql
CREATE OR REPLACE FUNCTION anonymize_old_audit_logs() RETURNS void AS $$
BEGIN
  UPDATE audit_logs
     SET actor_id = NULL,
         ip_address = NULL,
         user_agent = NULL,
         metadata = '{"anonymized": true}'::jsonb
   WHERE created_at < NOW() - INTERVAL '3 years'
     AND metadata ->> 'anonymized' IS NULL;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER
   SET search_path = public;  -- pin the path; assumes audit_logs lives in the public schema

SELECT cron.schedule('audit-anonymize', '0 3 * * *', 'SELECT anonymize_old_audit_logs();');

Anonymization keeps the row (so aggregate compliance reporting still works) while removing personal identifiers. This is stronger than DiGAV's baseline but aligns with DSGVO data-minimization.
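
To make the anonymization rule testable without a database, the same logic can be mirrored in memory. The sketch below assumes the column names from the proposed schema; `AuditRow`, `retentionCutoff`, and `anonymizeIfExpired` are hypothetical helper names, and the SQL function remains the source of truth.

```typescript
// In-memory mirror of anonymize_old_audit_logs(), intended for unit tests.
// Column names follow the proposed schema; the helper names are hypothetical.
interface AuditRow {
  created_at: Date;
  action: string;
  actor_id: string | null;
  ip_address: string | null;
  user_agent: string | null;
  metadata: Record<string, unknown>;
}

// Three calendar years before `now`, in UTC. Leap-day inputs can land one
// day away from what Postgres interval arithmetic returns.
function retentionCutoff(now: Date): Date {
  const cutoff = new Date(now.getTime());
  cutoff.setUTCFullYear(cutoff.getUTCFullYear() - 3);
  return cutoff;
}

function anonymizeIfExpired(row: AuditRow, now: Date): AuditRow {
  const expired = row.created_at.getTime() < retentionCutoff(now).getTime();
  // Matches `metadata ->> 'anonymized' IS NULL`: a missing key or JSON null.
  const pending = row.metadata["anonymized"] == null;
  if (!expired || !pending) return row;
  return {
    ...row,
    actor_id: null,
    ip_address: null,
    user_agent: null,
    metadata: { anonymized: true },
  };
}
```

Note that `action`, `resource_type`, and `created_at` survive, which is what keeps aggregate reporting possible after the identifiers are gone.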

7. Implementation priority

Staged rollout, ordered so that partial coverage is never mistaken for full coverage.

P0 (week 0 — unblocks DiGA filing)

Migration: audit_logs table + indexes + RLS + policies
lib/audit/logger.ts with typed recordAudit() helper
Wire the 10 apps/web/app/api/documents/* sub-routes
Wire apps/web/app/api/chat/route.ts for chat.message_with_doc_context

P1 (week 1)

Wire admin moderation + ban routes
Wire /api/moderation/analyze
Supabase Auth webhook → auth.* events
Internal docs: list of actions, expected metadata shapes

P2 (week 2)

User-facing "my activity" view (DSGVO Art. 15 self-service)
Admin dashboard (recent events, actor search, resource history)

P3 (week 3)

Anomaly alerts (unusual document-read volume, repeated admin actions)
Daily retention / anonymization cron (03:00, see §6) verified running
Export pipeline for DSGVO Art. 15 requests includes audit log for that user
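
The read-volume alert in this stage could start as a simple windowed count per actor. The sketch below is only a starting point: `actorsOverReadThreshold` is a hypothetical name, and the window and threshold are placeholders to be tuned against real traffic, not values from this design.

```typescript
// Minimal sketch of a read-volume check over document.read events.
// The window and threshold are caller-supplied placeholders.
interface ReadEvent {
  actor_id: string;
  created_at: Date;
}

function actorsOverReadThreshold(
  events: ReadEvent[],
  now: Date,
  windowMs: number,
  threshold: number,
): string[] {
  const since = now.getTime() - windowMs;
  const counts = new Map<string, number>();
  for (const e of events) {
    const t = e.created_at.getTime();
    // Ignore events outside the [now - window, now] range.
    if (t < since || t > now.getTime()) continue;
    counts.set(e.actor_id, (counts.get(e.actor_id) ?? 0) + 1);
  }
  // Return actors strictly above the threshold, sorted for stable output.
  return Array.from(counts.entries())
    .filter(([, n]) => n > threshold)
    .map(([id]) => id)
    .sort();
}
```

In production the same count would more likely be a SQL query against the action index; the in-memory form is mainly useful for testing the alert rule itself.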

P4 (ongoing)

Expand AuditAction union as features ship (every PR that touches PHI adds a new action type)
Post-incident: audit log review becomes a checklist item in postmortems
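
To make expanding the union hard to forget, code that branches on actions can use an exhaustiveness check, so adding a new namespace fails compilation until every branch handles it. This is a sketch: `ActionNamespace`, `assertNever`, and `defaultResourceType` are hypothetical, and the namespace-to-resource mapping is an assumption for illustration.

```typescript
// Sketch: a switch over the action namespaces that fails to compile when a
// new namespace is added to the union without being handled here.
type ActionNamespace = "document" | "chat" | "admin" | "auth" | "dsgvo";

function assertNever(value: never): never {
  throw new Error(`unhandled audit namespace: ${String(value)}`);
}

// The mapping itself is illustrative; resource types come from the schema
// comments ('document', 'user', 'chat_session').
function defaultResourceType(ns: ActionNamespace): string {
  switch (ns) {
    case "document":
      return "document";
    case "chat":
      return "chat_session";
    case "admin":
    case "auth":
    case "dsgvo":
      return "user";
    default:
      // If a namespace is added to the union, `ns` is no longer `never`
      // here and the compiler rejects this call.
      return assertNever(ns);
  }
}
```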

8. Known design decisions to revisit

No separate hash chain / tamper-evidence. We rely on Postgres WAL + Supabase backups. If DiGA auditor requires a Merkle-style integrity proof, add a prev_hash column and daily hash anchor (would be a non-trivial retrofit — flag early).
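Client IP comes from x-forwarded-for. Vercel Edge forwards this correctly, but the header can be supplied by the client before the request reaches the edge, so the value cannot be treated as proof of origin. Combined with user_agent + actor_id it is still useful for correlation, but it is not forensic-grade evidence.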
LLM call metadata. For chat.message_with_doc_context, we store which master.md revision was used, not the full document content. DO NOT log the message body — that would double the PHI footprint.
Anon session logging. Anon chat path (/api/chat/anon) uses a session cookie, not a user. Record as actor_role='anon', actor_id=NULL, metadata.anon_sid=<hash>. Never the raw cookie value.
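
Two of the points above concern values derived from the request: the client IP and the anonymous session identifier. The sketch below shows one way to derive both. It makes several assumptions: that the left-most x-forwarded-for entry is the one to store, that a keyed hash (HMAC with a server-side secret, here called `pepper`) is an acceptable form of "hash", and the helper names, which are hypothetical. It uses only Node's built-in `node:net` and `node:crypto` modules.

```typescript
import { createHmac } from "node:crypto";
import { isIP } from "node:net";

// Take the left-most entry of x-forwarded-for. Returns null when the header
// is missing or the entry is not a valid IPv4/IPv6 address, so the INET
// column never receives a value Postgres would reject.
function clientIpFromForwardedFor(header: string | null): string | null {
  if (!header) return null;
  const [first = ""] = header.split(",");
  const candidate = first.trim();
  return isIP(candidate) !== 0 ? candidate : null;
}

// Derive metadata.anon_sid from the raw session cookie. A keyed hash means
// someone holding only the audit table cannot brute-force cookie values.
// `pepper` is an assumed server-side secret, not something the design names.
function anonSessionId(rawCookieValue: string, pepper: string): string {
  return createHmac("sha256", pepper).update(rawCookieValue).digest("hex");
}
```

The same cookie always maps to the same `anon_sid`, so events from one anonymous session can still be correlated without the raw value ever being stored.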

Related issues

#	Relevance
#579	Parent docs epic
#586	Pillar B parent
#588 / PR #590	architecture — §6 flags the zero-audit-infra state, this doc is the plan
#391	BfArM consultation — cite this doc as the audit-logging plan
#88	DSGVO cookie / consent gap — `dsgvo.consent_changed` event covers it once landed
#358	Chat Bedrock migration — includes `guardrail_events` logging scope, should align with this schema (either merge tables or keep separate with shared conventions)

Changelog

2026-04-18 — Initial design doc. State at commit 18f5daa: no audit_logs migration in supabase/migrations/. Documents API has 10 sub-routes (updated P0 event list accordingly from quilt §5 which said 6). Schema validated against existing migrations — no table-name collisions.
