
Audit Logging System

DiGAV §4(3) / DSGVO Art. 9 / Art. 5(2) design doc — audit_logs schema, RLS policies, retention, implementation priority. Zero audit infra exists today.

Status: Design doc. Zero audit infrastructure exists in production today.
Current as of: 2026-04-18.
Audience: DiGA reviewer, BfArM consultation, implementation engineer.
Primary driver: DiGA filing blocker. Audit logging is mandatory under DiGAV §4(3) and DSGVO accountability requirements.


1. Why this matters

Regulatory obligations

| Regulation | Requirement |
| --- | --- |
| DiGAV §4(3) | All access to patient health data must be logged and attributable to an actor |
| DSGVO Art. 9 | Health data is a special category; processing requires provable lawful basis + traceability |
| DSGVO Art. 5(2) | Controller must be able to demonstrate compliance (accountability principle) |
| DSGVO Art. 15 | Data subject has right to know what data is processed about them; requires access log |
| DSGVO Art. 17 | Right to erasure; erasure events themselves must be recorded |
| BSI TR-03161 | Data security certificate mandatory for DiGA since Jan 2025; requires auditability |

Current state (verified against repo at 2026-04-18)

  • No audit_logs table. supabase/migrations/ contains 5 migrations (latest 20260407_community_post_limit_2.sql) — none create an audit schema.
  • No logging middleware. apps/web/lib/supabase/middleware.ts handles auth + security headers only.
  • Admin actions are silent. Ban / moderation / role-change routes have no audit trail.
  • Chat with document context is silent. /api/chat/route.ts injects master.md into LLM context (Plus/Premium gate via checkFeatureAccess) with no record of which document lines were read.
  • Document CRUD is silent. All 10 sub-routes under apps/web/app/api/documents/* operate through RLS but log nothing centrally.

This is the single largest DiGA filing gap.


2. Events that MUST be logged

Organized by regulatory priority:

P0 — PHI access (DiGAV §4(3))

| Event | Source | Why |
| --- | --- | --- |
| document.upload | /api/documents/upload-url + confirm-upload | Patient uploads PHI |
| document.read | /api/documents/[documentId] GET + download-url | PHI read |
| document.extract_approved | /api/documents/approve-extraction | Patient accepts Textract output into master.md |
| document.extract_discarded | /api/documents/discard-extraction | Patient rejects extracted text |
| document.delete | /api/documents/[documentId] DELETE | PHI erasure |
| document.export | /api/documents/export | DSGVO Art. 15 response |
| document.share | /api/documents/share | PHI leaves sole-patient scope |
| document.simplify | /api/documents/simplify | LLM processes PHI |
| document.summarize | /api/documents/summarize | LLM processes PHI |
| chat.message_with_doc_context | /api/chat when master.md injected | PHI fed to LLM |
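
The document rows of the P0 table can be read as a lookup from request to audit action. The following is a minimal sketch of that lookup, not existing code: `documentActionFor` and `NAMED_ROUTES` are hypothetical names, and the sketch assumes the named sub-routes can be recognised by the last path segment while any other single segment is a `[documentId]`. Whether `upload-url` and `confirm-upload` should each emit an event, or only the confirmation, is left open here.

```typescript
// Hypothetical helper: maps a documents API request to its P0 audit action.
type DocumentAction =
  | "document.upload" | "document.read" | "document.extract_approved"
  | "document.extract_discarded" | "document.delete" | "document.export"
  | "document.share" | "document.simplify" | "document.summarize";

// Named sub-routes from the P0 table, keyed by the last path segment.
const NAMED_ROUTES: Record<string, DocumentAction> = {
  "upload-url": "document.upload",
  "confirm-upload": "document.upload",
  "download-url": "document.read",
  "approve-extraction": "document.extract_approved",
  "discard-extraction": "document.extract_discarded",
  "export": "document.export",
  "share": "document.share",
  "simplify": "document.simplify",
  "summarize": "document.summarize",
};

function documentActionFor(method: string, pathname: string): DocumentAction | null {
  const prefix = "/api/documents/";
  if (!pathname.startsWith(prefix)) return null;
  const segments = pathname.slice(prefix.length).split("/").filter((s) => s.length > 0);
  if (segments.length === 0) return null;
  const last = segments[segments.length - 1] ?? "";
  // Own-property check so an ID such as "constructor" is never treated as a route.
  if (Object.prototype.hasOwnProperty.call(NAMED_ROUTES, last)) {
    return NAMED_ROUTES[last] ?? null;
  }
  // Assumption: any other single segment is a [documentId].
  if (segments.length !== 1) return null;
  const verb = method.toUpperCase();
  if (verb === "GET") return "document.read";
  if (verb === "DELETE") return "document.delete";
  return null; // other verbs on a document are not in the P0 table
}
```

Returning `null` for unlisted combinations keeps the mapping honest: a new sub-route produces no event until someone adds it to the table deliberately.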

P1 — Admin + auth events

| Event | Source | Why |
| --- | --- | --- |
| admin.user_banned | admin moderation route | Access control action |
| admin.content_moderated | community moderation | Platform enforcement |
| admin.role_changed | any role mutation | Privilege escalation audit |
| admin.moderation_analyze | /api/moderation/analyze | OpenAI call on user content |
| auth.login_success / auth.login_failure | Supabase Auth webhook or middleware | Access attribution |
| auth.password_reset | Supabase | Account recovery trail |

P2 — DSGVO subject rights

| Event | Source | Why |
| --- | --- | --- |
| dsgvo.article_15_request | account data-export endpoint | Right to access |
| dsgvo.article_17_request | account deletion endpoint | Right to erasure |
| dsgvo.consent_changed | privacy settings toggles | Lawful basis record |

3. Proposed schema

Append-only table, RLS-enforced. Designed to be extended without schema changes via metadata JSONB.

-- supabase/migrations/2026XXXX_audit_logs.sql

CREATE TABLE audit_logs (
  id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  created_at    TIMESTAMPTZ NOT NULL DEFAULT now(),

  -- WHO
  actor_id      UUID REFERENCES auth.users(id) ON DELETE SET NULL,
  actor_role    TEXT NOT NULL CHECK (actor_role IN ('user','admin','service','anon','system')),

  -- WHAT
  action        TEXT NOT NULL,   -- e.g. 'document.read', 'admin.user_banned'
  resource_type TEXT,            -- e.g. 'document', 'user', 'chat_session'
  resource_id   TEXT,            -- UUID or composite ID as text

  -- WHERE / HOW
  ip_address    INET,
  user_agent    TEXT,

  -- EXTENSIBLE
  metadata      JSONB NOT NULL DEFAULT '{}'::jsonb
);

-- Index for right-to-access queries (Art. 15)
CREATE INDEX idx_audit_logs_actor_created
  ON audit_logs (actor_id, created_at DESC);

-- Index for resource-focused lookups (who touched this document)
CREATE INDEX idx_audit_logs_resource
  ON audit_logs (resource_type, resource_id, created_at DESC);

-- Index for action-type analytics
CREATE INDEX idx_audit_logs_action_created
  ON audit_logs (action, created_at DESC);

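-- Append-only: no UPDATE, DELETE, or TRUNCATE for any API role. service_role is
-- included because it bypasses RLS; the retention job runs as the function owner.
REVOKE UPDATE, DELETE, TRUNCATE ON audit_logs FROM PUBLIC, authenticated, anon, service_role;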

ALTER TABLE audit_logs ENABLE ROW LEVEL SECURITY;

-- Users see their own records (right to access)
CREATE POLICY "users_select_own"
  ON audit_logs FOR SELECT
  USING (actor_id = auth.uid());

-- Admins see everything
CREATE POLICY "admins_select_all"
  ON audit_logs FOR SELECT
  USING (
    EXISTS (
      SELECT 1 FROM user_roles
      WHERE user_id = auth.uid() AND role = 'admin'
    )
  );

-- Service-role inserts only (via server-side API), never from client
CREATE POLICY "service_insert_only"
  ON audit_logs FOR INSERT
  WITH CHECK (auth.role() = 'service_role');

-- No row-level UPDATE or DELETE policy → nobody can modify history

Design notes:

  • actor_id is nullable with ON DELETE SET NULL so that deleting a user (Art. 17) does not cascade-delete their audit trail. The accountability obligation survives erasure of the data subject's personal data.
  • actor_role is denormalized so role changes don't retroactively alter history.
  • metadata JSONB absorbs future fields (e.g., {"document_line_count": 37, "llm_model": "openai/gpt-4o-mini"}) without migrations.
  • No FK from resource_id to specific tables — polymorphic. Reads join via resource_type + resource_id.
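
To make the first, second, and last design notes concrete, here is a small in-memory model of the table. It is an illustrative sketch only: `AuditRow`, `applyUserErasure`, and `resourceHistory` are hypothetical names, and the real behaviour lives in Postgres (the foreign key and `idx_audit_logs_resource`), not in application code.

```typescript
type ActorRole = "user" | "admin" | "service" | "anon" | "system";

// Subset of the audit_logs columns needed for the illustration.
interface AuditRow {
  id: string;
  created_at: string; // ISO 8601 timestamp
  actor_id: string | null;
  actor_role: ActorRole;
  action: string;
  resource_type: string | null;
  resource_id: string | null;
}

// Mirrors ON DELETE SET NULL: rows survive, only the link to the user is cleared.
// actor_role is untouched because it is denormalized onto the row.
function applyUserErasure(rows: AuditRow[], userId: string): AuditRow[] {
  return rows.map((row) => (row.actor_id === userId ? { ...row, actor_id: null } : row));
}

// Polymorphic lookup: match on (resource_type, resource_id), newest first,
// which is the access pattern idx_audit_logs_resource is built for.
function resourceHistory(rows: AuditRow[], resourceType: string, resourceId: string): AuditRow[] {
  return rows
    .filter((r) => r.resource_type === resourceType && r.resource_id === resourceId)
    .sort((a, b) => Date.parse(b.created_at) - Date.parse(a.created_at));
}
```

After erasure, "who touched this document" still returns every event with its role at the time, but no longer identifies the person.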

4. RLS policy matrix

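| Role | SELECT | INSERT | UPDATE | DELETE |
| --- | --- | --- | --- | --- |
| anon | - | - | - | - |
| authenticated (self) | ✅ own rows | - | - | - |
| authenticated (admin) | ✅ all rows | - | - | - |
| service_role (server) | - | ✅ all | - | - |
| Retention job | ✅ all | - | ✅ (see §6) | - |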

The only path that can write is server-side via the service-role key. This prevents spoofing from a compromised client session.
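
The client-facing part of these rules can be mirrored in plain TypeScript, for example as an oracle in policy tests. This is a sketch under assumptions: `Viewer`, `canSelect`, and `canInsert` are hypothetical names, `isAdmin` stands in for the `user_roles` lookup, and the functions model only the two SELECT policies and the INSERT check from §3, not Postgres grants or RLS bypass.

```typescript
interface Viewer {
  userId: string | null; // what auth.uid() would return; null for anon
  isAdmin: boolean;      // a user_roles row with role = 'admin' exists
}

// Mirrors users_select_own OR admins_select_all (policies are combined with OR).
function canSelect(viewer: Viewer, rowActorId: string | null): boolean {
  if (viewer.userId === null) return false; // auth.uid() is NULL: neither policy matches
  if (viewer.isAdmin) return true;          // admins_select_all
  return rowActorId === viewer.userId;      // users_select_own
}

// Mirrors service_insert_only: WITH CHECK (auth.role() = 'service_role').
function canInsert(jwtRole: string): boolean {
  return jwtRole === "service_role";
}
```

Note that a row whose `actor_id` was set to NULL is visible only to admins, which matches SQL semantics where `NULL = auth.uid()` never evaluates to true.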


5. Middleware + logger API

Proposed helper in apps/web/lib/audit/logger.ts:

import { createAdminClient } from "@/lib/supabase/admin";

export type AuditAction =
  | `document.${'upload'|'read'|'extract_approved'|'extract_discarded'|'delete'|'export'|'share'|'simplify'|'summarize'}`
  | `chat.message_with_doc_context`
  | `admin.${'user_banned'|'content_moderated'|'role_changed'|'moderation_analyze'}`
  | `auth.${'login_success'|'login_failure'|'password_reset'}`
  | `dsgvo.${'article_15_request'|'article_17_request'|'consent_changed'}`;

export async function recordAudit(event: {
  action: AuditAction;
  actor_id: string | null;
  actor_role: 'user'|'admin'|'service'|'anon'|'system';
  resource_type?: string;
  resource_id?: string;
  ip_address?: string;
  user_agent?: string;
  metadata?: Record<string, unknown>;
}): Promise<void> {
  // Do NOT throw: an audit failure must never block the operation. The
  // try/catch also covers client construction and network errors.
  try {
    const admin = createAdminClient();
    const { error } = await admin.from("audit_logs").insert({
      ...event,
      metadata: event.metadata ?? {},
    });
    if (error) throw error;
  } catch (error) {
    // console.error is a placeholder; this MUST also emit a metric for incident review.
    console.error("[audit] insert failed", { action: event.action, error });
  }
}

Call sites (implementation order in §7):

  • Every P0 document route wraps its success path with await recordAudit(...)
  • /api/chat/route.ts calls it when documentContext is non-empty
  • Admin routes call it on state-changing branches
  • Auth events consumed from Supabase webhook (separate endpoint)
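
As an illustration of the chat call site, the event can be assembled by a pure function and then handed to `recordAudit`. This is a sketch, not the route's actual code: `ChatRequestInfo`, `buildChatAuditEvent`, `firstForwardedIp`, `chatSessionId`, and the `master_revision` metadata key are hypothetical, and the `document_line_count` / `llm_model` keys are borrowed from the metadata example in §3.

```typescript
interface ChatAuditEvent {
  action: "chat.message_with_doc_context";
  actor_id: string;
  actor_role: "user" | "admin";
  resource_type: string;
  resource_id: string;
  ip_address: string | null;
  user_agent: string | null;
  metadata: Record<string, unknown>;
}

interface ChatRequestInfo {
  userId: string;
  actorRole: "user" | "admin";
  chatSessionId: string;   // hypothetical identifier for the conversation
  documentContext: string; // master.md text injected into the prompt, "" if none
  masterRevision: string;  // hypothetical revision tag of master.md
  llmModel: string;
  headers: Record<string, string | undefined>; // lower-cased header names
}

// x-forwarded-for may hold a comma-separated chain; the first entry is the client.
function firstForwardedIp(xff: string | undefined): string | null {
  if (!xff) return null;
  const first = (xff.split(",")[0] ?? "").trim();
  return first.length > 0 ? first : null;
}

function buildChatAuditEvent(req: ChatRequestInfo): ChatAuditEvent | null {
  if (req.documentContext.length === 0) return null; // nothing injected, nothing to log
  return {
    action: "chat.message_with_doc_context",
    actor_id: req.userId,
    actor_role: req.actorRole,
    resource_type: "chat_session",
    resource_id: req.chatSessionId,
    ip_address: firstForwardedIp(req.headers["x-forwarded-for"]),
    user_agent: req.headers["user-agent"] ?? null,
    // Only a pointer to the document and its size; never the text itself.
    metadata: {
      master_revision: req.masterRevision,
      document_line_count: req.documentContext.split("\n").length,
      llm_model: req.llmModel,
    },
  };
}
```

Keeping the builder free of I/O means the "no PHI in the log row" property can be unit-tested without a database.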

6. Retention

Minimum: 1 year (DiGAV interpretation — consult legal before filing). Proposed: 3 years then anonymize (not delete) via pg_cron job.

-- supabase/migrations/2026XXXX_audit_retention.sql
CREATE OR REPLACE FUNCTION anonymize_old_audit_logs() RETURNS void AS $$
BEGIN
  UPDATE audit_logs
     SET actor_id = NULL,
         ip_address = NULL,
         user_agent = NULL,
         metadata = '{"anonymized": true}'::jsonb
   WHERE created_at < NOW() - INTERVAL '3 years'
     AND metadata ->> 'anonymized' IS NULL;
END;
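-- Pin search_path so the SECURITY DEFINER body always resolves public.audit_logs
$$ LANGUAGE plpgsql SECURITY DEFINER SET search_path = public;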

SELECT cron.schedule('audit-anonymize', '0 3 * * *', 'SELECT anonymize_old_audit_logs();');

Anonymization keeps the row (so aggregate compliance reporting still works) while removing personal identifiers. This is stronger than DiGAV's baseline but aligns with DSGVO data-minimization.
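
The anonymization rule can also be expressed as a pure function, which may be handy for unit-testing the retention logic outside the database. This is a sketch that mirrors the SQL above; `RetainedRow` and `anonymizeIfExpired` are hypothetical names and nothing here replaces the pg_cron job.

```typescript
interface RetainedRow {
  created_at: string; // ISO 8601 timestamp
  actor_id: string | null;
  actor_role: string;
  action: string;
  ip_address: string | null;
  user_agent: string | null;
  metadata: Record<string, unknown>;
}

const RETENTION_YEARS = 3;

function anonymizeIfExpired(row: RetainedRow, now: Date): RetainedRow {
  // now - 3 years; calendar edge cases such as 29 February may differ by a
  // day from Postgres interval arithmetic.
  const cutoff = new Date(now.getTime());
  cutoff.setUTCFullYear(cutoff.getUTCFullYear() - RETENTION_YEARS);

  const expired = Date.parse(row.created_at) < cutoff.getTime();
  // Same test as: metadata ->> 'anonymized' IS NULL (key absent or JSON null).
  const marker = row.metadata["anonymized"];
  const alreadyAnonymized = marker !== undefined && marker !== null;
  if (!expired || alreadyAnonymized) return row;

  // Keep created_at, actor_role, and action so aggregate reporting still works.
  return {
    ...row,
    actor_id: null,
    ip_address: null,
    user_agent: null,
    metadata: { anonymized: true },
  };
}
```

The function is idempotent: applying it twice to the same row gives the same result as applying it once, which matches the `IS NULL` guard in the SQL.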


7. Implementation priority

Staged rollout, ordered so that PHI access is covered first and partial coverage is never mistaken for full coverage.

P0 (week 0 — unblocks DiGA filing)

  • Migration: audit_logs table + indexes + RLS + policies
  • lib/audit/logger.ts with typed recordAudit() helper
  • Wire the 10 apps/web/app/api/documents/* sub-routes
  • Wire apps/web/app/api/chat/route.ts for chat.message_with_doc_context

P1 (week 1)

  • Wire admin moderation + ban routes
  • Wire /api/moderation/analyze
  • Supabase Auth webhook → auth.* events
  • Internal docs: list of actions, expected metadata shapes

P2 (week 2)

  • User-facing "my activity" view (DSGVO Art. 15 self-service)
  • Admin dashboard (recent events, actor search, resource history)

P3 (week 3)

  • Anomaly alerts (unusual document-read volume, repeated admin actions)
  • Retention / anonymization cron (scheduled daily in §6) verified running
  • Export pipeline for DSGVO Art. 15 requests includes audit log for that user
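
One possible shape for the "unusual document-read volume" alert is a simple per-actor count over a sliding window. The sketch below is only an illustration of that idea: `actorsOverReadThreshold`, the window length, and the threshold are all assumptions, and a production version would more likely be a SQL query over `idx_audit_logs_action_created`.

```typescript
interface ReadEvent {
  actor_id: string | null;
  action: string;
  created_at: string; // ISO 8601 timestamp
}

// Returns actor IDs with more than maxReads document.read events in the
// window (now - windowMs, now], sorted for stable output.
function actorsOverReadThreshold(
  events: ReadEvent[],
  now: Date,
  windowMs: number,
  maxReads: number,
): string[] {
  const until = now.getTime();
  const since = until - windowMs;
  const counts = new Map<string, number>();
  for (const e of events) {
    if (e.action !== "document.read" || e.actor_id === null) continue;
    const t = Date.parse(e.created_at);
    if (t <= since || t > until) continue; // outside the window
    counts.set(e.actor_id, (counts.get(e.actor_id) ?? 0) + 1);
  }
  const flagged: string[] = [];
  counts.forEach((count, actor) => {
    if (count > maxReads) flagged.push(actor);
  });
  return flagged.sort();
}
```

Rows with a NULL `actor_id` are skipped because they cannot be attributed; whether anonymous read volume needs its own alert is a separate decision.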

P4 (ongoing)

  • Expand AuditAction union as features ship (every PR that touches PHI adds a new action type)
  • Post-incident: audit log review becomes a checklist item in postmortems

8. Known design decisions to revisit

  • No separate hash chain / tamper-evidence. We rely on Postgres WAL + Supabase backups. If DiGA auditor requires a Merkle-style integrity proof, add a prev_hash column and daily hash anchor (would be a non-trivial retrofit — flag early).
  • Client IP comes from x-forwarded-for. Vercel Edge forwards this correctly, but the value is self-reportable at the TLS boundary. Combined with user_agent + actor_id it's still useful, not forensic.
  • LLM call metadata. For chat.message_with_doc_context, we store which master.md revision was used, not the full document content. DO NOT log the message body — that would double the PHI footprint.
  • Anon session logging. Anon chat path (/api/chat/anon) uses a session cookie, not a user. Record as actor_role='anon', actor_id=NULL, metadata.anon_sid=<hash>. Never the raw cookie value.
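
If the prev_hash retrofit from the first bullet is ever needed, the core mechanism is small. The following is a sketch under assumptions, not a proposed implementation: `ChainedRow`, `appendRow`, and `firstBrokenIndex` are hypothetical names, the set of hashed fields is illustrative, and the digest is injected so the example stays dependency-free (in Node it would typically be a SHA-256 hex digest, e.g. `createHash("sha256").update(input).digest("hex")` from `node:crypto`).

```typescript
// Any deterministic, collision-resistant string digest.
type Digest = (input: string) => string;

interface ChainedRow {
  id: string;
  created_at: string;
  action: string;
  prev_hash: string; // digest of the previous row, "" for the first row
}

const GENESIS_HASH = "";

function rowHash(row: ChainedRow, digest: Digest): string {
  // prev_hash is part of the input, so each hash covers the whole history.
  return digest([row.prev_hash, row.id, row.created_at, row.action].join("|"));
}

function appendRow(
  chain: ChainedRow[],
  entry: { id: string; created_at: string; action: string },
  digest: Digest,
): ChainedRow[] {
  const last = chain.length > 0 ? chain[chain.length - 1] : undefined;
  const prev_hash = last ? rowHash(last, digest) : GENESIS_HASH;
  return chain.concat([{ ...entry, prev_hash }]);
}

// Returns the index of the first row whose prev_hash does not match the row
// before it, or -1 if the chain is intact.
function firstBrokenIndex(chain: ChainedRow[], digest: Digest): number {
  let expected = GENESIS_HASH;
  for (let i = 0; i < chain.length; i++) {
    const row = chain[i];
    if (row === undefined || row.prev_hash !== expected) return i;
    expected = rowHash(row, digest);
  }
  return -1;
}
```

A change to any row is detected at the row that follows it, so the most recent row is only protected once its hash is published somewhere else. That is the role of the daily hash anchor mentioned above.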

| # | Relevance |
| --- | --- |
| #579 | Parent docs epic |
| #586 | Pillar B parent |
| #588 / PR #590 | architecture: §6 flags the zero-audit-infra state, this doc is the plan |
| #391 | BfArM consultation: cite this doc as the audit-logging plan |
| #88 | DSGVO cookie / consent gap: dsgvo.consent_changed event covers it once landed |
| #358 | Chat Bedrock migration: includes guardrail_events logging scope, should align with this schema (either merge tables or keep separate with shared conventions) |

Changelog

  • 2026-04-18 — Initial design doc. State at commit 18f5daa: no audit_logs migration in supabase/migrations/. Documents API has 10 sub-routes (updated P0 event list accordingly from quilt §5 which said 6). Schema validated against existing migrations — no table-name collisions.
