Handover 2026-04-22 — fragJulia voice deploy (v1)
Session handoff from the self-hosted LiveKit/Voxtral bring-up on AWS eu-central-1 (pre-correction). Superseded by v2 on Voxtral weights + runtime; retained verbatim as session ground truth.
Provenance
- Source: `~/OneDrive/Dokumente/Claude/Projects/fragJUlia/HANDOFF-2026-04-22-fragjulia-voice.md` (9624 bytes, mtime 2026-04-22 12:46 local)
- Ingested: 2026-04-22 as R-0.5 prerequisite for the Voice Deploy Repair epic, before #644 (SSOT-4) OneDrive teardown.
- Status: Superseded by v2 on Voxtral weights & runtime sections. Retained as pre-correction session record: the `Voxtral-Mini-3B-2507` claim is corrected in v2 to `Voxtral-4B-TTS-2603`.
- Body is fenced as raw markdown so MDX does not reinterpret `<secret>`, `{ ... }`, or nested-backtick patterns from the source.
# fragJulia Voice — Handoff 2026-04-22
Epic #513 / Sub-epic #507 · Self-hosted AWS, eu-central-1
Continuation of bring-up runbook `HANDOVER-2026-04-23.md`.
---
## Executive Summary
- Success #1 (`curl https://livekit.fragjulia.de/` → HTTP/2 200, valid Let's Encrypt cert) — **GREEN**, verified externally during this session.
- Success #2 (`POST /api/voice/test-token` → LiveKit JWT) — **WIRED, not end-to-end probed.** Vercel env vars (`LIVEKIT_URL`, `LIVEKIT_API_KEY`, `LIVEKIT_API_SECRET`) and EC2 `livekit.yaml` carry matching credentials; final live probe blocked from this session's sandbox egress.
- Success #3 (`/knotencheck` → Start → Julia greets) — **RED, blocked.** vLLM (Voxtral 4B, Llama Guard 3 1B) and faster-whisper-large-v3 services are defined in compose but cannot start: `/opt/models/hf-cache` is empty and the gated weights cannot be pulled without Hugging Face access (Meta + Mistral). Voxtral CC-BY-NC license request is still in flight per prior handover.
- LiveKit + Caddy stack on EC2 is stable; two non-trivial config bugs were found and fixed in this session (UDP/443 conflict, h2c upstream).
- Nora Tschirner audio (`Nora Tschirner_ Lebt Vielfalt _ #VOXStimme.mp4`) parked under task #17 for the Voxtral voice-clone step once the model is on host.
---
## Verified state on EC2 (i-0aeb3778c5078baa1, 3.64.25.163)
| Layer | State | Evidence |
|---|---|---|
| Host (Ubuntu 24.04 DLAMI, NVIDIA L4) | Up | `nvidia-smi` inside `nvidia/cuda:12.4.1-base-ubuntu22.04` returned device info |
| Docker + NVIDIA Container Toolkit | Installed, runtime configured | `docker run --gpus all` succeeded |
| `/opt/models/hf-cache` | Created, owned `ubuntu:ubuntu` | empty — see Blocker #1 |
| Repo `~/fragjulia` | Present | SCPed from local Windows clone of `neid404/fragjulia` |
| `voice/.env` | Present, mode 600 | contains `LIVEKIT_API_KEY`, `LIVEKIT_API_SECRET`; `MISTRAL_API_KEY` empty (local Voxtral path), `DEEPGRAM_API_KEY` empty |
| `voice/config/livekit.yaml` | Rewritten | port 7880, rtc 50000–60000/UDP, tcp 7881, TURN udp 443, region eu-central-1, single keys entry `APIfragjuliaVoice01: <secret>` |
| `voice/config/Caddyfile` | Rewritten (twice) | global `protocols h1 h2`, reverse_proxy localhost:7880, CORS allow-origin `https://fragjulia.de` |
| `voice-livekit-server-1` (livekit/livekit-server:1.11.0, network_mode host) | **Up, healthy** | listening 7880/7881/443UDP/50000-60000UDP |
| `voice-caddy-1` (caddy:2-alpine) | **Up** (badge "unhealthy" cosmetic) | LE cert issued (CN=livekit.fragjulia.de, issuer Let's Encrypt E8); badge fails because internal wget probes `https://localhost:443/healthz` and trips TLS-name verification |
| `voice-vllm-guard-1` | **Down** | Blocker #1 |
| `voice-vllm-voxtral-1` | **Down** | Blocker #1 |
| `voice-voice-agent-1` | **Down** | Blocker #1 (depends on vLLM endpoints + faster-whisper weights) |
---
## External verification (success #1)
```
curl -I https://livekit.fragjulia.de/
HTTP/2 200
server: Caddy
curl https://livekit.fragjulia.de/healthz
OK
```
TLS chain: leaf CN=`livekit.fragjulia.de`, issuer Let's Encrypt E8, `verify return: 0`.
---
## Vercel ↔ LiveKit credential parity (success #2)
The three credentials are identical in three places:
```
LIVEKIT_URL wss://livekit.fragjulia.de
LIVEKIT_API_KEY APIfragjuliaVoice01
LIVEKIT_API_SECRET <32-byte hex; matches handover doc>
```
Locations confirmed during prior sessions:
- Vercel project `fragjulia-web` (`prj_A7vJr0mJg0yUgTEsnxlv4qdtgQDv`), all environments, redeploy `7n48RuvBU`.
- EC2 `~/fragjulia/voice/.env` (mode 600).
- EC2 `~/fragjulia/voice/config/livekit.yaml` (`keys:` map).
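The two EC2-side copies can be compared without printing the secret. The following is a minimal sketch, not part of the original session: the helper names `env_value` and `check_parity` are hypothetical, and it assumes `.env` uses unquoted `NAME=value` lines and that the `keys:` map entry is written as `<key>: <secret>` with a single space after the colon.

```shell
#!/bin/sh
# Sketch: confirm the key/secret pair in voice/.env also appears in the
# keys map of livekit.yaml. File layouts are assumptions (see lead-in).

# Print the value of the first NAME=value line in an env file.
env_value() {
  local file="$1" name="$2"
  grep -E "^${name}=" "$file" | head -n 1 | cut -d= -f2-
}

# Return 0 if livekit.yaml contains "<key>: <secret>" as read from .env,
# 1 if it does not, 2 if either value is missing. Never echoes the secret.
check_parity() {
  local env_file="$1" yaml_file="$2" key secret
  key="$(env_value "$env_file" LIVEKIT_API_KEY)"
  secret="$(env_value "$env_file" LIVEKIT_API_SECRET)"
  [ -n "$key" ] && [ -n "$secret" ] || return 2
  grep -qF -- "${key}: ${secret}" "$yaml_file"
}
```

Possible usage on the host: `check_parity ~/fragjulia/voice/.env ~/fragjulia/voice/config/livekit.yaml && echo parity-ok`. The Vercel copy still has to be compared separately, since it is not readable from EC2.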
`/api/voice/test-token` route contract (read this session from `apps/web/app/api/voice/test-token/route.ts`):
- Method: `POST`
- Body: `{ secret: string, sessionId?: string }`
- Returns `{ url, token, roomName, sessionId }` on 200.
- Returns 401 `Invalid test secret` on bad secret with `KNOTENCHECK_TEST_SECRET` set.
- Returns 403 `Test mode not enabled` if `KNOTENCHECK_TEST_SECRET` unset.
- Returns 503 `Voice service not configured` if `LIVEKIT_URL`/`LIVEKIT_API_KEY`/`LIVEKIT_API_SECRET` missing on Vercel.
End-to-end probe not executed from this session — sandbox egress blocked the `fragjulia.de` host. Two equivalent paths to close success #2 in the next session:
1. From a browser: open `/knotencheck`, click Start, observe the network call to `/api/voice/test-token` returning 200 with a JWT.
2. From EC2 (independent of secret value):
```
curl -sS -o /dev/null -w '%{http_code}\n' \
-X POST -H 'Content-Type: application/json' \
-d '{"secret":"wrong"}' \
https://fragjulia.de/api/voice/test-token
```
- `401` → wiring fully correct, only the real `KNOTENCHECK_TEST_SECRET` is needed for a JWT.
- `403` → regression: secret env var lost from Vercel.
- `503` → regression: LiveKit credentials lost from Vercel.
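The status-code reading above can be wrapped so the probe prints its own diagnosis. This is a sketch only: the function names `interpret_probe_status` and `probe_test_token` are hypothetical, and any status outside the documented contract is simply reported as unexpected.

```shell
#!/bin/sh
# Map the HTTP status of the wrong-secret probe to the diagnosis listed above.
interpret_probe_status() {
  case "$1" in
    401) echo "wiring OK: only the real KNOTENCHECK_TEST_SECRET is needed for a JWT" ;;
    403) echo "regression: KNOTENCHECK_TEST_SECRET missing on Vercel" ;;
    503) echo "regression: LiveKit credentials missing on Vercel" ;;
    *)   echo "unexpected status $1: outside the documented contract" ;;
  esac
}

# Run the probe with a deliberately wrong secret and print the diagnosis.
# A transport failure (DNS, TLS, blocked egress) is reported as status 000.
probe_test_token() {
  local url="${1:-https://fragjulia.de/api/voice/test-token}" code
  code="$(curl -sS -o /dev/null -w '%{http_code}' \
    -X POST -H 'Content-Type: application/json' \
    -d '{"secret":"wrong"}' "$url")" || code="000"
  interpret_probe_status "$code"
}
```

Calling `probe_test_token` with no argument targets the production route; passing a URL allows the same check against a preview deployment.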
---
## Fixes landed this session (carry-forward notes)
1. **UDP 443 collision.** Caddy initially refused to start because LiveKit TURN already bound `udp/443`. Fix: global config block forcing HTTP/1.1 + HTTP/2 only, disabling HTTP/3 (which would otherwise grab UDP 443):
```
{
servers {
protocols h1 h2
}
}
```
Do not re-enable HTTP/3 on Caddy unless TURN is moved off 443.
2. **502 to LiveKit upstream.** A `transport http { versions h2c 1.1 }` block on the reverse_proxy made Caddy attempt h2c against an HTTP/1.1-only upstream. Fix: removed the transport block so Caddy uses default HTTP/1.1 to `localhost:7880`.
3. **livekit.yaml rewritten from scratch.** A previous regex-based patch had eaten the `Generate keys:` comment and risked corrupting the keys map. Authoritative file is now the freshly written one on EC2 — do not regex-patch it; edit-in-place if anything changes.
4. **Caddy "unhealthy" badge is cosmetic.** Internal healthcheck hits `https://localhost:443/healthz` and fails TLS name verification (cert is for the public hostname, not localhost). External `/healthz` is 200. Ignore the docker badge until/unless we add a hostname-aware healthcheck.
5. **PEM/SSH plumbing on Windows.** Git-for-Windows `ssh.exe`/`scp.exe` via `.bat` launchers is the working path; Windows OpenSSH eats stdio under the MCP. Reuse `C:\Users\dapar\AppData\Local\Temp\sshtry.bat` and `scptry.bat`. Always LF-convert any shell scripts before SCP — bare CRLF triggers `set: -\r: invalid option` on the EC2 side.
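The LF conversion in item 5 can be done with standard tools wherever a POSIX shell is available (Git Bash locally, or on the host after copying). A minimal sketch with hypothetical helper names `to_lf` and `has_cr`; note that it removes every carriage return in the file, not only those at line ends, which is assumed to be acceptable for shell scripts.

```shell
#!/bin/sh
# Strip carriage returns so a script edited on Windows runs on the EC2 host.
# Converts into a temp file first so a failed conversion never truncates
# the original; writing back with cat keeps the file's mode bits.
to_lf() {
  local file="$1" tmp
  tmp="$(mktemp)" || return 1
  tr -d '\r' < "$file" > "$tmp" && cat "$tmp" > "$file"
  local rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Return 0 if the file still contains at least one CR byte.
has_cr() {
  grep -q "$(printf '\r')" "$1"
}
```

A pre-copy guard could then be `has_cr deploy.sh && to_lf deploy.sh`, where `deploy.sh` stands in for whichever script is about to be transferred.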
---
## Open blockers
### Blocker #1 — model weights for the GPU services (success #3 gate)
`/opt/models/hf-cache` is empty. The three GPU services pull on first launch into this directory and need:
- `mistralai/Voxtral-Mini-3B-2507` (TTS) — CC-BY-NC; the license request flagged in the prior handover is the gating event. Until granted, this service cannot start.
- `meta-llama/Llama-Guard-3-1B` (safety filter) — Meta-gated on Hugging Face; needs HF account with accepted license.
- `Systran/faster-whisper-large-v3` (STT) — public; will pull once an HF token is configured for the cache.
Required to unblock:
- Hugging Face token with: Meta gated-model access accepted for Llama Guard 3, Mistral gated-model access accepted for Voxtral.
- Token written to `~/.cache/huggingface/token` (or `HF_TOKEN` env in compose) on EC2 before `docker compose up -d` for the GPU services.
- First boot will pull ~10–20 GB into `/opt/models/hf-cache`. EBS volume free space should be checked beforehand.
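The free-space check can be scripted ahead of the first pull. A minimal sketch: `has_free_gb` is a hypothetical helper, and the threshold passed to it is the caller's choice (the 20 in the usage line is just the upper end of the estimate above, with no headroom added).

```shell
#!/bin/sh
# Return 0 if the filesystem holding PATH has at least MIN_GB GiB available.
# df -Pk prints POSIX-format output in 1 KiB blocks; column 4 is "Available".
has_free_gb() {
  local path="$1" min_gb="$2" avail_kb
  avail_kb="$(df -Pk "$path" | awk 'NR==2 {print $4}')"
  [ -n "$avail_kb" ] && [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}
```

Example: `has_free_gb /opt/models/hf-cache 20 || echo "not enough space on the model volume"`.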
Action when license/access lands:
```
ssh ubuntu@3.64.25.163
huggingface-cli login # paste HF_TOKEN
cd ~/fragjulia/voice
docker compose pull vllm-guard vllm-voxtral voice-agent
docker compose up -d vllm-guard vllm-voxtral voice-agent
docker compose logs -f voice-agent # wait for "ready"
```
### Blocker #2 — `KNOTENCHECK_TEST_SECRET` value not carried in handover
Per prior handover: env var **is** set on Vercel (set 2026-04-15, redeploy 7n48RuvBU). The value itself is not in any handover doc; only Vercel holds it. Either retrieve from Vercel UI for an authenticated probe, or rely on the `/knotencheck` browser path which uses the public secret embedded as `NEXT_PUBLIC_KNOTENCHECK_TEST_SECRET`.
---
## Outstanding work
| ID | State | Owner | Note |
|---|---|---|---|
| #16 Provision /models | pending | — | Blocker #1 |
| #17 Process Nora Tschirner audio for Voxtral voice clone | pending | — | Parked until Voxtral model on host. Source file: `Nora Tschirner_ Lebt Vielfalt _ #VOXStimme.mp4`. Workflow not yet defined; presumed: extract clean speech segments, transcode to 16/24 kHz mono WAV, register as `julia_knotencheck` voice prompt against Voxtral's voice-conditioning interface. |
---
## Resume path for the next session
1. Confirm Voxtral CC-BY-NC license status; confirm Meta Llama Guard 3 access on the HF account being used.
2. SSH to EC2, write HF token, `docker compose up -d vllm-guard vllm-voxtral voice-agent`, watch logs.
3. From EC2: probe `/api/voice/test-token` (see curl above) to close success #2.
4. From browser: `/knotencheck` → Start → confirm Julia's greeting (success #3).
5. Once #3 is GREEN: process Nora Tschirner audio per task #17.
Rollback path is unchanged from prior handover (revert Vercel env to pre-self-host LiveKit Cloud values; redeploy previous production deployment; DNS A record can stay).
---
Adherence: ScopeCard=Yes | Mode=Deep | Browse=No (sandbox egress blocked, infra state read from prior session record + on-host state) | Uncertainty=Stated (success #2 not end-to-end probed; success #3 blocked on external license/access) | Eckpfeiler=Not triggered