fragJulia
Dev

System Architecture

System topology, data flows, auth, voice pipeline, chat pipeline, integration points, PHI/DSGVO/EU residency, and fail-open risks — BfArM/DiGA reference.

Status: Current as of 2026-04-18. Verify against latest code + open epics before BfArM consultation. Audience: BfArM / DiGA reviewers, new contributors, solo-founder reference. Primary driver: #391 (BfArM consultation), regulatory filing.


1. System topology

fragJulia is a German-language breast-cancer patient companion app. All data processing for users is EU-resident.

ComponentProvider / hostRegionPurpose
Web appVercelFrankfurt (fra1)Next.js 16 App Router, main user surface
Mobile appExpo / EASReact Native client, shares @fragjulia/shared
Database + auth + storageSupabaseEU (Frankfurt)Postgres + RLS, Supabase Auth, object storage
Text-chat LLM (current)OpenAI APIUSgpt-4o-miniEU migration planned, see #358
Voice stackAWS EC2 g6.xlargeeu-central-1 (Frankfurt)24 GB NVIDIA L4, single-box all-in-one
Voice LLMAWS Bedrock — Mistral Largeeu-central-1Called from EC2 voice-agent
Moderation admin LLMOpenAI gpt-4o-miniUS/api/moderation/analyze (admin-only)
PaymentsStripeSubscription (Plus 9.90 / Premium 19.90 EUR)
Rate-limit / cacheUpstash RedisEUSliding-window rate limits
Transactional emailResendEU (if configured)Newsletter + system mail
TLS termination (voice)Caddy on EC2eu-central-1LE certs for LiveKit WebRTC

EU residency caveat. The two OpenAI call sites (user-facing text chat + admin moderation) are the only non-EU hops. Migrating text chat to Bedrock Mistral eu-central-1 is tracked in #358; admin moderation is low-traffic and does not touch patient PHI.


2. Monorepo structure

apps/
  web/           Next.js 16 (App Router, Tailwind v4)
  mobile/        Expo RN client
packages/
  shared/        Cross-platform types, chat client helpers
voice/           Python LiveKit agent + docker-compose (EC2 deployment)
supabase/        SQL migrations, config.toml
DesignGUIDE/     Design brief, audit, HTML references
docs/            This folder (runbooks, postmortems, observability, dev/)

Package manager: pnpm 9.15.4 workspaces (root CLAUDE.md still says "npm workspaces" — stale, to fix alongside #580).


3. Data flows

3.1 Text chat (current)

User (browser)
  → Vercel Edge
  → POST /api/chat        (apps/web/app/api/chat/route.ts, maxDuration=30s)
    ├── Rate limit        (Upstash, 20 msg/60s per IP)
    ├── Supabase auth     (createClient + getUser)
    ├── master.md inject  (Plus/Premium only, checkFeatureAccess + getMasterMdTruncated(userId, 8000))
    ├── streamText        (ai-sdk v6, model="openai/gpt-4o-mini")
    └── SSE stream back

Key properties:

  • System prompt: JULIA_SYSTEM_PROMPT + document context
  • Document context is patient-confirmed only (master.md), never raw Textract output
  • wrapPatientDocument() wraps injected text so the model treats it as content, not instructions
  • onFinish handler is server-side stubbed — persistence is client-side via useChat.onFinishpersistNewMessages(). Tab-close during stream = messages lost.
  • Anon chat has a parallel route: /api/chat/anon with fj_anon_chat_sid cookie (24h), 10 msg/session, 3 sessions/IP/day.

Planned (epic #358): Swap provider to Bedrock Mistral (eu-central-1) + add Llama Guard 4 input classifier + Prompt Guard 2 + Bedrock Guardrails output + Turnstile + guardrail_events logging. See §7 regulatory context.

3.2 Voice pipeline (BSE — Brustselbstuntersuchung)

User (WebRTC)
  → Caddy TLS (EC2 :443)
  → LiveKit Server (EC2 host, :7880/:7881)
  → voice-agent container
    ├── faster-whisper large-v3 (INT8, German medical keyterms)   [STT]
    ├── Bedrock Mistral Large    (eu-central-1, temp 0.2)          [LLM]
    ├── vLLM / Llama Guard 3 1B  (EC2 :8000, output guardrail)     [Safety]
    └── vLLM / Voxtral TTS 4B    (EC2 :8001, cloned Julia voice)   [TTS]
  → audio back → LiveKit → User

All five services share the one NVIDIA L4 (24 GB VRAM) on g6.xlarge:

ServiceVRAM (~)Notes
faster-whisper large-v3 INT84–5 GBLoaded in voice-agent
Llama Guard 3 1B (float16)~2 GBvLLM, gpu-memory-utilization=0.15
Voxtral TTS 4B (bfloat16)~11 GBvLLM, gpu-memory-utilization=0.45
faster-whisper extras + slack~6 GBLeaves headroom for burst

Topology defined in voice/docker-compose.yml. LiveKit + Caddy + voice-agent use host networking for UDP / WebRTC simplicity. Model weights bind-mounted from /models/ on the EC2 host.

BSE protocol: 7-phase guided self-exam flow. Full spec in docs/voice-bse-test-app-spec.md.

Guardrails (voice): dual layer — static regex set (17 patterns) + Llama Guard 3 1B for M1/M2/M3 category classification on LLM output. Fails open today (known DiGA filing flag — see §6).

3.3 Documents (patient upload → master.md → chat context)

User uploads PDF/image
  → POST /api/documents/upload-url         (presigned S3 URL)
  → browser uploads to S3
  → POST /api/documents/confirm-upload     (Textract OCR)
  → UI review → POST /api/documents/approve-extraction
  → master.md append  (patient-confirmed text only)
  → later injected into chat context via getMasterMdTruncated()

Sub-routes under apps/web/app/api/documents/ (10 today): [documentId]/, approve-extraction/, confirm-upload/, discard-extraction/, download-url/, export/, share/, simplify/, summarize/, upload-url/, plus root route.ts (GET list).

Critical safety property: master.md only ever stores text the patient has approved. Raw Textract output is discarded on rejection. This is the boundary the audit-log scheme in audit-system protects.

3.4 Auth

  • Provider: Supabase Auth (email + password)
  • Middleware: apps/web/lib/supabase/middleware.ts (JWT verification + security headers)
  • Callback: /auth/callback handles OAuth-style redirect + code exchange
  • Admin: separate server-side paths, service-role key from Vercel env, never exposed to browser

4. Subscription tiers + feature gating

TierPriceCumulative features
freeBase chat, lexikon, limited community
plus9.90 EUR/mo+ document storage, master.md, extended chat history, PDF export
premium19.90 EUR/mo+ advanced features (see apps/web/lib/subscription/types.ts)
  • Client-side gate: <FeatureGate feature="...">useSubscription().canAccess()
  • Server-side enforcement: checkFeatureAccess(supabase, userId, feature) in apps/web/lib/subscription/server.ts
  • Stripe price IDs come from STRIPE_PRICE_PLUS_MONTHLY / STRIPE_PRICE_PREMIUM_MONTHLY env vars
  • Schema: supabase/migrations/20260402_monetization_schema.sql

Full detail in subscription-auth-system.


5. Integration points + failure modes

DependencyFailure modeCurrent behavior
Upstash Redis (rate-limiter)Upstash outageFails open — requests proceed without rate limiting
Llama Guard 3 1B (voice output guardrail)Model OOM / crashFails open — unsafe LLM output can reach user. DiGA-relevant.
Bedrock Mistral LargeBedrock eu-central-1 outageVoice session errors out; no graceful TTS fallback yet
faster-whisper STTCUDA OOMAuto-recovery script restarts container; user sees German error TTS
Voxtral TTSCrashNo text fallback yet (planned)
OpenAI gpt-4o-mini (chat)API outage5xx to client, no fallback model
SupabaseEU-region outageHard failure, no read replica
Stripe webhooksDelivery failureStripe retries; manual reconciliation required

Chat route convergence risk: 4 open epics modify apps/web/app/api/chat/route.ts (#358 Bedrock migration, document upload pipeline, audit logging, memory/context changes). Merge sequencing matters — any two of these landing simultaneously can conflict on the same file. Epic #579 (docs) is a passive reader only, safe to merge alongside.


6. PHI / DSGVO / regulatory posture

What is PHI here

Every user message, document upload, and voice session contains DSGVO Art. 9 health data (breast-cancer medical context).

Residency

  • User chats (text): in OpenAI (US) today, EU migration in #358
  • User chats (voice audio): never leaves EC2 — STT runs locally, only text reaches Bedrock Mistral (EU)
  • Documents: S3 bucket in eu-central-1, Textract in eu-central-1
  • Database: Supabase EU
  • Bedrock inference: eu-central-1
  • Resend, Stripe: data-processor agreements in place (verify DPAs before filing)

Known compliance gaps (surface to BfArM consultation)

  1. Chat provider US-hosted — #358 migration open; timeline depends on Bedrock Mistral EU access
  2. Audit infrastructure is zero — no logs of PHI reads, admin actions, or access decisions. See audit-system for full plan. DiGA filing blocker.
  3. Llama Guard fails open — voice guardrail bypass when guard-service down. BfArM may require fail-closed for crisis / M1 categories.
  4. DSGVO cookie / consent gaps — tracked in #88
  5. No data-export / deletion pipeline for chat messages — DSGVO Art. 15 / 17 partially covered for documents, not for chat history
  6. Anon→auth chat migration missing — anon session messages are not migrated into user account on signup

EU AI Act

Crisis-detection component (when #358 lands) is potentially high-risk under Annex III §5(d) (access to essential services, mental health). Early guardrail work reduces compliance burden ahead of Aug 2027 deadline.

Certificates

  • BSI TR-03161 (data security for DiGA) — mandatory since Jan 2025. Verify as part of consultation prep.

7. Environment + configuration

Detailed in configuration-system. Summary:

  • Vercel env vars: 30+ grouped by service (Supabase, Stripe, Resend, LiveKit, Upstash, Turnstile, CRON_SECRET, etc.)
  • EC2 env: /root/knotencheck-voice.env + IAM instance role for Bedrock access
  • Mobile env: EXPO_PUBLIC_* subset, injected at build time
  • Cron: vercel.json defines /api/chat/cleanup at 03:00 daily, guarded by CRON_SECRET

.env.example exists for apps/web/ (48 lines) and voice/ (present as of 2026-04-18). Mobile .env.example still missing.


#TopicRelevance to architecture
#358Text chat migration to Bedrock Mistral + guardrailsChanges §3.1 end-to-end
#337Voice AWS migrationCurrent §3.2 state came from this
#88DSGVO cookie / consent compliance§6 gap
#391BfArM consultation applicationThis doc's primary audience
#502(blocked upstream)
#579Documentation epicParent of this doc
#588This doc
#589audit-system.mdCompanion doc, §6 gap #2

9. References

  • apps/web/app/api/chat/route.ts — current chat pipeline (OpenAI)
  • apps/web/lib/chat/julia-prompt.ts — system prompt
  • apps/web/lib/documents/master-md.ts — patient-confirmed context assembly
  • apps/web/lib/documents/security-prompt.ts — instruction-injection defense wrapper
  • apps/web/lib/subscription/server.ts — server-side feature gate
  • apps/web/lib/supabase/middleware.ts — auth middleware
  • voice/docker-compose.yml — single-box EC2 topology (all 5 services)
  • voice/DEPLOY-AWS.md — deployment runbook
  • docs/voice-bse-test-app-spec.md — BSE protocol spec (7 phases)
  • supabase/migrations/ — schema history (5 migrations as of 2026-04-07)
  • DesignGUIDE/fragjulia-design-brief-v3.md — brand / UX source of truth

Changelog

  • 2026-04-18 — Initial version. Verified against repo state at commit 18f5daa. Divergences from prior research (bubbly-bubbling-quilt.md §3, 2-day old): Llama Guard 3 1B is deployed as vllm-guard (not "planned"); Voxtral TTS 4B runs locally on EC2 via vLLM (not external); voice/docker-compose.yml exists; documents API has 10 sub-routes (not 6); chat provider unchanged (openai/gpt-4o-mini). Zero audit infrastructure confirmed.

On this page