Changelog

2026-04-25 — R-9 voice-agent Dockerfile fix + turn-detector image-bake

Fixes the voice-agent multi-stage build (deadsnakes PPA, ensurepip bootstrap, posix_local COPY path) and bakes the LiveKit turn-detector ONNX model into the image, so docker compose up --force-recreate no longer discards a model that was downloaded at runtime. Closes R-9 #669, #527, RT-2 #674.

What changed

voice/agent/Dockerfile — three corrections in both builder and runtime stages, all required to build cleanly on a fresh host:

  1. deadsnakes PPA: python3.12 is not in Ubuntu 22.04's default repos, but the CUDA base images are pinned to ubuntu22.04 (no nvidia/cuda:12.4.1-*-ubuntu24.04 variant exists on Docker Hub). The PPA is added before every apt-get install python3.12* call.
  2. python3.12 -m ensurepip --upgrade: Python 3.12 removed distutils, so the system pip shipped via python3-pip (intended for 3.10) fails with ModuleNotFoundError: No module named 'distutils' when called as python -m pip install --prefix=/install. ensurepip --upgrade bootstraps a Python-3.12-native pip that doesn't depend on distutils.
  3. COPY --from=builder /install/local /usr/local (was /install /usr/local): pip install --prefix=/install on Debian-derived Ubuntu uses the posix_local install scheme, which writes to /install/local/lib/python3.12/dist-packages/, not /install/lib/.... Copying the whole /install tree put packages at /usr/local/local/lib/python3.12/dist-packages/, which is outside Python's default sys.path. The agent then failed at runtime with ModuleNotFoundError for every dependency, and the workaround was a compose-level PYTHONPATH=/usr/local/local/lib/python3.12/dist-packages override. Copying /install/local directly into /usr/local lands packages at the canonical /usr/local/lib/python3.12/dist-packages/ path that sys.path already includes, so the workaround is no longer needed; the override itself is removed in PR-D.
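
The path arithmetic behind correction 3 can be checked with a short sketch. It assumes the posix_local layout described above (packages under <prefix>/local/lib/python3.12/dist-packages) and models COPY as re-rooting the source tree at the destination. The helper names are hypothetical and not part of the repo.

```python
from pathlib import PurePosixPath

# Suffix pip uses under the posix_local scheme, as described in this entry.
SITE_SUFFIX = PurePosixPath("lib/python3.12/dist-packages")


def installed_dir(prefix: str) -> PurePosixPath:
    """Directory `pip install --prefix=<prefix>` writes to under posix_local."""
    return PurePosixPath(prefix) / "local" / SITE_SUFFIX


def after_copy(src: str, dest: str, prefix: str = "/install") -> PurePosixPath:
    """Package directory in the runtime stage after `COPY --from=builder <src> <dest>`.

    COPY places the contents of <src> at <dest>, so the path relative to
    <src> is preserved underneath <dest>.
    """
    relative = installed_dir(prefix).relative_to(src)
    return PurePosixPath(dest) / relative


old_layout = after_copy("/install", "/usr/local")
new_layout = after_copy("/install/local", "/usr/local")

print(old_layout)  # /usr/local/local/lib/python3.12/dist-packages
print(new_layout)  # /usr/local/lib/python3.12/dist-packages
```

Only the second result is a directory that the entry above says is already on sys.path, which is why the PYTHONPATH override becomes unnecessary.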

Turn-detector image-bake (RT-2 #674) — added RUN python main.py download-files after COPY . . in the runtime stage. python main.py download-files is the LiveKit Agents CLI hook that fetches every plugin's runtime asset; for this stack that's livekit-plugins-turn-detector's model_q8.onnx (~50 MB). Baking at build time means a docker compose up -d --force-recreate voice-agent does NOT trigger a fresh download into the container-ephemeral HF cache — first-boot reliability without the per-recreate flake.
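
A build-time or smoke-test guard for the bake could look like the sketch below. It only searches a directory tree for a file named model_q8.onnx; the function name is hypothetical, and the root to search is an assumption the caller must supply (the test plan in this entry uses the turn_detector package directory).

```python
from pathlib import Path

MODEL_NAME = "model_q8.onnx"  # asset name taken from this changelog entry


def find_baked_models(root: str) -> list[Path]:
    """Return every regular file named MODEL_NAME found under root.

    Returns an empty list when root does not exist or is not a directory,
    so callers can treat "nothing found" uniformly.
    """
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(p for p in base.rglob(MODEL_NAME) if p.is_file())
```

Calling find_baked_models on the image's package directory and failing when the result is empty would catch a build where the download step silently fetched nothing.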

EXPOSE 8081 (was 8080) — the livekit-agents framework binds its worker health server on :8081 by default. The previous EXPOSE 8080 was documentation-incorrect; the matching compose-side healthcheck port fix lands in PR-D as part of R-8 #668.
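
To confirm which port the worker actually listens on, a plain TCP probe is enough. The sketch below is an illustration only: the function name is hypothetical, and it checks that something accepts connections on the port without assuming any particular health endpoint path, since this entry does not state one.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
```

Run against a started container, port_is_open("127.0.0.1", 8081) should be True and the same call with 8080 should be False if the default described above is in effect.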

Why

R-9 (#669) of the voice deploy repair epic (#660) called for un-baking model weights from the agent image. Empirically, faster-whisper-large-v3 was already not baked — it's mounted via /models:/models:ro. The original #527 concern was already resolved at image-level, so R-9's actual remaining scope shifted to the related lifecycle question raised by RT-2 (#674): turn-detector downloads its model into a container-ephemeral HF cache at runtime, and every --force-recreate loses it.

The decision was framed in #672 §5 as "image-bake vs host bind-mount vs per-plugin lifecycle hook." Image-bake wins for this stack because (a) the model is small (~50 MB), (b) it's a single asset, not a churning set, (c) bind-mounting host paths into the container adds an EC2-side state dependency that violates SSOT, and (d) baking removes the runtime HF download path entirely, which means once R-3 #663 / PR-B's HF_TOKEN plumbing lands, the running agent doesn't need HF_TOKEN at runtime at all (provisioning-only).

The Dockerfile build fixes were tactical hand-edits on EC2 during the 2026-04-24 bring-up (.bak through .bak9 trail on the host) but never made it into the repo. Without them, docker compose build voice-agent from a clean checkout fails. This PR makes the canonical Dockerfile correct again.

Scope

  • voice/agent/Dockerfile only. No changes to compose, requirements.txt, or agent code.
  • Changelog entry + meta.json.

Test plan

  • CI: docker build voice/agent/ succeeds against nvidia/cuda:12.4.1-devel-ubuntu22.04 + :runtime-ubuntu22.04 bases.
  • Verify docker run --rm voice-voice-agent ls /usr/local/lib/python3.12/dist-packages/livekit/plugins/turn_detector shows model_q8.onnx. No /usr/local/local paths anywhere in find / -name livekit -type d.
  • docker run --rm voice-voice-agent python -c "from livekit.plugins.turn_detector.multilingual import MultilingualModel; MultilingualModel(); print('ok')" exits 0 without network.
  • docker history voice-voice-agent | grep -iE 'hf_|HF_TOKEN' returns empty (no token baked).
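
The token check in the last bullet can also be expressed as a small filter over captured docker history output. This is a sketch under the assumption that the history text has already been collected as a string; the function name is hypothetical, and the pattern mirrors the grep -iE expression above.

```python
import re

# Same alternation as the grep in the test plan, matched case-insensitively.
TOKEN_PATTERN = re.compile(r"hf_|HF_TOKEN", re.IGNORECASE)


def lines_mentioning_token(history_text: str) -> list[str]:
    """Return the lines of history_text that match TOKEN_PATTERN."""
    return [line for line in history_text.splitlines() if TOKEN_PATTERN.search(line)]
```

An empty result corresponds to the grep returning nothing. Note that docker history truncates the CREATED BY column by default, so capturing the output with --no-trunc gives the filter the full commands to inspect.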

Follow-ups

  • PR-D (R-5 + R-6 + R-8 compose canonicalization) — drops the PYTHONPATH=/usr/local/local/... workaround from voice/docker-compose.yml once this lands, and matches the healthcheck port to :8081.
  • Closes RT-2 #674 with the decision recorded above. RT-1 #673 (97% VRAM ceiling) remains deferred per operator.
