System Architecture
System topology, data flows, auth, voice pipeline, chat pipeline, integration points, PHI/DSGVO/EU residency, and fail-open risks — BfArM/DiGA reference.
Status: Current as of 2026-04-18. Verify against latest code + open epics before BfArM consultation. Audience: BfArM / DiGA reviewers, new contributors, solo-founder reference. Primary driver: #391 (BfArM consultation), regulatory filing.
1. System topology
fragJulia is a German-language breast-cancer patient companion app. All data processing for users is EU-resident.
| Component | Provider / host | Region | Purpose |
|---|---|---|---|
| Web app | Vercel | Frankfurt (fra1) | Next.js 16 App Router, main user surface |
| Mobile app | Expo / EAS | — | React Native client, shares @fragjulia/shared |
| Database + auth + storage | Supabase | EU (Frankfurt) | Postgres + RLS, Supabase Auth, object storage |
| Text-chat LLM (current) | OpenAI API | US | gpt-4o-mini — EU migration planned, see #358 |
| Voice stack | AWS EC2 g6.xlarge | eu-central-1 (Frankfurt) | 24 GB NVIDIA L4, single-box all-in-one |
| Voice LLM | AWS Bedrock — Mistral Large | eu-central-1 | Called from EC2 voice-agent |
| Moderation admin LLM | OpenAI gpt-4o-mini | US | /api/moderation/analyze (admin-only) |
| Payments | Stripe | — | Subscription (Plus 9.90 / Premium 19.90 EUR) |
| Rate-limit / cache | Upstash Redis | EU | Sliding-window rate limits |
| Transactional email | Resend | EU (if configured) | Newsletter + system mail |
| TLS termination (voice) | Caddy on EC2 | eu-central-1 | LE certs for LiveKit WebRTC |
EU residency caveat. The two OpenAI call sites (user-facing text chat + admin moderation) are the only non-EU hops. Migrating text chat to Bedrock Mistral eu-central-1 is tracked in #358; admin moderation is low-traffic and does not touch patient PHI.
2. Monorepo structure
apps/
web/ Next.js 16 (App Router, Tailwind v4)
mobile/ Expo RN client
packages/
shared/ Cross-platform types, chat client helpers
voice/ Python LiveKit agent + docker-compose (EC2 deployment)
supabase/ SQL migrations, config.toml
DesignGUIDE/ Design brief, audit, HTML references
  docs/        This folder (runbooks, postmortems, observability, dev/)

Package manager: pnpm 9.15.4 workspaces (root CLAUDE.md still says "npm workspaces" — stale, to fix alongside #580).
3. Data flows
3.1 Text chat (current)
User (browser)
→ Vercel Edge
→ POST /api/chat (apps/web/app/api/chat/route.ts, maxDuration=30s)
├── Rate limit (Upstash, 20 msg/60s per IP)
├── Supabase auth (createClient + getUser)
├── master.md inject (Plus/Premium only, checkFeatureAccess + getMasterMdTruncated(userId, 8000))
├── streamText (ai-sdk v6, model="openai/gpt-4o-mini")
└── SSE stream back

Key properties:
- System prompt: `JULIA_SYSTEM_PROMPT` + document context
- Document context is patient-confirmed only (`master.md`), never raw Textract output
- `wrapPatientDocument()` wraps injected text so the model treats it as content, not instructions
- `onFinish` handler is server-side stubbed — persistence is client-side via `useChat.onFinish` → `persistNewMessages()`. Tab-close during stream = messages lost.
- Anon chat has a parallel route: `/api/chat/anon` with `fj_anon_chat_sid` cookie (24h), 10 msg/session, 3 sessions/IP/day.
Planned (epic #358): Swap provider to Bedrock Mistral (eu-central-1) + add Llama Guard 4 input classifier + Prompt Guard 2 + Bedrock Guardrails output + Turnstile + guardrail_events logging. See §7 regulatory context.
3.2 Voice pipeline (BSE — Brustselbstuntersuchung)
User (WebRTC)
→ Caddy TLS (EC2 :443)
→ LiveKit Server (EC2 host, :7880/:7881)
→ voice-agent container
├── faster-whisper large-v3 (INT8, German medical keyterms) [STT]
├── Bedrock Mistral Large (eu-central-1, temp 0.2) [LLM]
├── vLLM / Llama Guard 3 1B (EC2 :8000, output guardrail) [Safety]
└── vLLM / Voxtral TTS 4B (EC2 :8001, cloned Julia voice) [TTS]
→ audio back → LiveKit → User

All five services share the one NVIDIA L4 (24 GB VRAM) on g6.xlarge:
| Service | VRAM (~) | Notes |
|---|---|---|
| faster-whisper large-v3 INT8 | 4–5 GB | Loaded in voice-agent |
| Llama Guard 3 1B (float16) | ~2 GB | vLLM, gpu-memory-utilization=0.15 |
| Voxtral TTS 4B (bfloat16) | ~11 GB | vLLM, gpu-memory-utilization=0.45 |
| faster-whisper extras + slack | ~6 GB | Leaves headroom for burst |
Topology defined in voice/docker-compose.yml. LiveKit + Caddy + voice-agent use host networking for UDP / WebRTC simplicity. Model weights bind-mounted from /models/ on the EC2 host.
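Assuming the vLLM `gpu-memory-utilization` caps translate directly into VRAM ceilings on the 24 GB card (an approximation — vLLM reserves up to the cap, actual use varies), the budget in the table can be sanity-checked:

```typescript
// Back-of-envelope VRAM budget for the shared L4. Figures mirror the
// table above; this is an illustration, not a measurement.
const TOTAL_GB = 24;

const allocations = {
  whisperInt8: 5,               // faster-whisper large-v3, loaded in voice-agent
  llamaGuard: 0.15 * TOTAL_GB,  // vLLM cap → ~3.6 GB ceiling
  voxtralTts: 0.45 * TOTAL_GB,  // vLLM cap → ~10.8 GB ceiling
};

const committed = Object.values(allocations).reduce((a, b) => a + b, 0);
const headroom = TOTAL_GB - committed; // slack for STT extras + burst
```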
BSE protocol: 7-phase guided self-exam flow. Full spec in docs/voice-bse-test-app-spec.md.
Guardrails (voice): dual layer — static regex set (17 patterns) + Llama Guard 3 1B for M1/M2/M3 category classification on LLM output. Fails open today (known DiGA filing flag — see §6).
3.3 Documents (patient upload → master.md → chat context)
User uploads PDF/image
→ POST /api/documents/upload-url (presigned S3 URL)
→ browser uploads to S3
→ POST /api/documents/confirm-upload (Textract OCR)
→ UI review → POST /api/documents/approve-extraction
→ master.md append (patient-confirmed text only)
→ later injected into chat context via getMasterMdTruncated()

Sub-routes under apps/web/app/api/documents/ (10 today):
[documentId]/, approve-extraction/, confirm-upload/, discard-extraction/, download-url/, export/, share/, simplify/, summarize/, upload-url/, plus root route.ts (GET list).
Critical safety property: master.md only ever stores text the patient has approved. Raw Textract output is discarded on rejection. This is the boundary the audit-log scheme in audit-system protects.
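That boundary can be expressed as one decision point. A minimal sketch, assuming a hypothetical `reviewExtraction` helper (the real logic lives across `approve-extraction/` and `discard-extraction/`):

```typescript
// Sketch of the approval boundary: only patient-approved text is appended
// to master.md; rejected Textract output is discarded without persisting.
// reviewExtraction is illustrative, not the real route handler.

type Extraction = { documentId: string; ocrText: string };

function reviewExtraction(
  masterMd: string,
  extraction: Extraction,
  patientApproved: boolean
): string {
  if (!patientApproved) {
    // Rejection path: raw OCR text is dropped, master.md stays untouched.
    return masterMd;
  }
  // Approval path: only now does the text enter the chat-context store.
  return masterMd + "\n\n" + extraction.ocrText;
}
```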
3.4 Auth
- Provider: Supabase Auth (email + password)
- Middleware: `apps/web/lib/supabase/middleware.ts` (JWT verification + security headers)
- Callback: `/auth/callback` handles OAuth-style redirect + code exchange
- Admin: separate server-side paths, service-role key from Vercel env, never exposed to browser
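The middleware's two jobs — gate protected paths on a valid session and attach security headers — can be sketched in isolation. The path prefixes and header set here are illustrative assumptions, not the real `middleware.ts` configuration:

```typescript
// Illustrative sketch of middleware responsibilities; gate() and the
// "/app" prefix are assumptions, not the real route matcher.

type Session = { userId: string } | null;

function applySecurityHeaders(headers: Map<string, string>): Map<string, string> {
  headers.set("X-Frame-Options", "DENY");
  headers.set("X-Content-Type-Options", "nosniff");
  return headers;
}

function gate(session: Session, path: string): { allowed: boolean; redirect?: string } {
  const isProtected = path.startsWith("/app") || path.startsWith("/api/documents");
  if (isProtected && !session) return { allowed: false, redirect: "/login" };
  return { allowed: true };
}
```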
4. Subscription tiers + feature gating
| Tier | Price | Cumulative features |
|---|---|---|
| free | — | Base chat, lexikon, limited community |
| plus | 9.90 EUR/mo | + document storage, master.md, extended chat history, PDF export |
| premium | 19.90 EUR/mo | + advanced features (see apps/web/lib/subscription/types.ts) |
- Client-side gate: `<FeatureGate feature="...">` → `useSubscription().canAccess()`
- Server-side enforcement: `checkFeatureAccess(supabase, userId, feature)` in `apps/web/lib/subscription/server.ts`
- Stripe price IDs come from `STRIPE_PRICE_PLUS_MONTHLY` / `STRIPE_PRICE_PREMIUM_MONTHLY` env vars
- Schema: `supabase/migrations/20260402_monetization_schema.sql`
Full detail in subscription-auth-system.
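The cumulative-tier model above can be sketched as an ordered lookup. The feature-to-tier mapping here is an illustrative subset; the authoritative list lives in `apps/web/lib/subscription/types.ts`:

```typescript
// Sketch of cumulative tier gating: each tier inherits everything below it.
// FEATURE_MIN_TIER is an assumed mapping for illustration only.

type Tier = "free" | "plus" | "premium";

const TIER_ORDER: Tier[] = ["free", "plus", "premium"];

const FEATURE_MIN_TIER: Record<string, Tier> = {
  base_chat: "free",
  document_storage: "plus",
  master_md: "plus",
  pdf_export: "plus",
};

function canAccess(userTier: Tier, feature: string): boolean {
  const required = FEATURE_MIN_TIER[feature];
  if (!required) return false; // unknown feature → deny by default
  return TIER_ORDER.indexOf(userTier) >= TIER_ORDER.indexOf(required);
}
```

Deny-by-default on unknown features mirrors the server-side posture: the client `<FeatureGate>` is UX only, `checkFeatureAccess` is the enforcement point.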
5. Integration points + failure modes
| Dependency | Failure mode | Current behavior |
|---|---|---|
| Upstash Redis (rate-limiter) | Upstash outage | Fails open — requests proceed without rate limiting |
| Llama Guard 3 1B (voice output guardrail) | Model OOM / crash | Fails open — unsafe LLM output can reach user. DiGA-relevant. |
| Bedrock Mistral Large | Bedrock eu-central-1 outage | Voice session errors out; no graceful TTS fallback yet |
| faster-whisper STT | CUDA OOM | Auto-recovery script restarts container; user sees German error TTS |
| Voxtral TTS | Crash | No text fallback yet (planned) |
| OpenAI gpt-4o-mini (chat) | API outage | 5xx to client, no fallback model |
| Supabase | EU-region outage | Hard failure, no read replica |
| Stripe webhooks | Delivery failure | Stripe retries; manual reconciliation required |
Chat route convergence risk: 4 open epics modify apps/web/app/api/chat/route.ts (#358 Bedrock migration, document upload pipeline, audit logging, memory/context changes). Merge sequencing matters — any two of these landing simultaneously can conflict on the same file. Epic #579 (docs) is a passive reader only, safe to merge alongside.
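The fail-open pattern flagged twice in the table above comes down to one `catch` branch. A minimal sketch of the difference BfArM may care about, with `callGuard` as a hypothetical stand-in for the Llama Guard HTTP call:

```typescript
// Fail-open vs fail-closed moderation. callGuard is illustrative;
// the real guard is an HTTP call to the vllm-guard container.

type GuardVerdict = "safe" | "unsafe";

function moderate(
  text: string,
  callGuard: (t: string) => GuardVerdict,
  failClosed: boolean
): GuardVerdict {
  try {
    return callGuard(text);
  } catch {
    // Guard service down. Today the voice pipeline fails open ("safe");
    // fail-closed would block the output instead.
    return failClosed ? "unsafe" : "safe";
  }
}
```

Flipping `failClosed` per category (crisis / M1 fail-closed, the rest fail-open) would be one possible middle ground if a blanket fail-closed proves too disruptive to voice sessions.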
6. PHI / DSGVO / regulatory posture
What is PHI here
Every user message, document upload, and voice session contains DSGVO Art. 9 health data (breast-cancer medical context).
Residency
- User chats (text): in OpenAI (US) today, EU migration in #358
- User chats (voice audio): never leaves EC2 — STT runs locally, only text reaches Bedrock Mistral (EU)
- Documents: S3 bucket in eu-central-1, Textract in eu-central-1
- Database: Supabase EU
- Bedrock inference: eu-central-1
- Resend, Stripe: data-processor agreements in place (verify DPAs before filing)
Known compliance gaps (surface to BfArM consultation)
- Chat provider US-hosted — #358 migration open; timeline depends on Bedrock Mistral EU access
- Audit infrastructure is zero — no logs of PHI reads, admin actions, or access decisions. See audit-system for full plan. DiGA filing blocker.
- Llama Guard fails open — voice guardrail bypass when guard-service down. BfArM may require fail-closed for crisis / M1 categories.
- DSGVO cookie / consent gaps — tracked in #88
- No data-export / deletion pipeline for chat messages — DSGVO Art. 15 / 17 partially covered for documents, not for chat history
- Anon→auth chat migration missing — anon session messages are not migrated into user account on signup
EU AI Act
Crisis-detection component (when #358 lands) is potentially high-risk under Annex III §5(d) (access to essential services, mental health). Early guardrail work reduces compliance burden ahead of Aug 2027 deadline.
Certificates
- BSI TR-03161 (data security for DiGA) — mandatory since Jan 2025. Verify as part of consultation prep.
7. Environment + configuration
Detailed in configuration-system. Summary:
- Vercel env vars: 30+ grouped by service (Supabase, Stripe, Resend, LiveKit, Upstash, Turnstile, CRON_SECRET, etc.)
- EC2 env: `/root/knotencheck-voice.env` + IAM instance role for Bedrock access
- Mobile env: `EXPO_PUBLIC_*` subset, injected at build time
- Cron: `vercel.json` defines `/api/chat/cleanup` at 03:00 daily, guarded by `CRON_SECRET`
.env.example exists for apps/web/ (48 lines) and voice/ (present as of 2026-04-18). Mobile .env.example still missing.
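The `CRON_SECRET` guard is a single header comparison. Vercel cron invocations carry `Authorization: Bearer <CRON_SECRET>` when the env var is set; the helper name below is illustrative:

```typescript
// Sketch of the guard on /api/chat/cleanup. isAuthorizedCron is a
// hypothetical helper, not the real route code.

function isAuthorizedCron(authHeader: string | null, cronSecret: string): boolean {
  // Vercel sends "Authorization: Bearer <CRON_SECRET>" on cron invocations.
  return authHeader === `Bearer ${cronSecret}`;
}
```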
8. Related epics / issues
| # | Topic | Relevance to architecture |
|---|---|---|
| #358 | Text chat migration to Bedrock Mistral + guardrails | Changes §3.1 end-to-end |
| #337 | Voice AWS migration | Current §3.2 state came from this |
| #88 | DSGVO cookie / consent compliance | §6 gap |
| #391 | BfArM consultation application | This doc's primary audience |
| #502 | (blocked upstream) | — |
| #579 | Documentation epic | Parent of this doc |
| #588 | This doc | — |
| #589 | audit-system.md | Companion doc, §6 gap #2 |
9. References
- `apps/web/app/api/chat/route.ts` — current chat pipeline (OpenAI)
- `apps/web/lib/chat/julia-prompt.ts` — system prompt
- `apps/web/lib/documents/master-md.ts` — patient-confirmed context assembly
- `apps/web/lib/documents/security-prompt.ts` — instruction-injection defense wrapper
- `apps/web/lib/subscription/server.ts` — server-side feature gate
- `apps/web/lib/supabase/middleware.ts` — auth middleware
- `voice/docker-compose.yml` — single-box EC2 topology (all 5 services)
- `voice/DEPLOY-AWS.md` — deployment runbook
- `docs/voice-bse-test-app-spec.md` — BSE protocol spec (7 phases)
- `supabase/migrations/` — schema history (5 migrations as of 2026-04-07)
- `DesignGUIDE/fragjulia-design-brief-v3.md` — brand / UX source of truth
Changelog
- 2026-04-18 — Initial version. Verified against repo state at commit `18f5daa`. Divergences from prior research (`bubbly-bubbling-quilt.md` §3, 2 days old): Llama Guard 3 1B is deployed as `vllm-guard` (not "planned"); Voxtral TTS 4B runs locally on EC2 via vLLM (not external); `voice/docker-compose.yml` exists; the documents API has 10 sub-routes (not 6); chat provider unchanged (`openai/gpt-4o-mini`). Zero audit infrastructure confirmed.