Changelog

2026-04-25 — R-4 weights provisioning script + Voxtral CC BY-NC policy note

Adds voice/scripts/provision-weights.sh, an idempotent downloader for the three model weights the voice stack needs (Voxtral-4B-TTS-2603, Llama-Guard-3-1B, faster-whisper-large-v3). Closes R-4 (#664) and partially closes #521 with the local-deploy CC BY-NC policy note.

What changed

  • voice/scripts/provision-weights.sh — new, ~100-line bash. Downloads three models to /models/<subdir> on the host using the canonical hf CLI (huggingface-cli is deprecated and partially disabled in current huggingface_hub). Idempotent: each model has a sentinel file (params.json for Voxtral, config.json for the other two) — present and non-empty → skip; otherwise hf download <repo> --local-dir <target>. Fails loudly if the sentinel doesn't reappear post-download.
  • voice/scripts/README.md — new. Operator-facing usage doc: how to source voice/.env, how to invoke (subset by key), Voxtral CC BY-NC policy paragraph, secret-hygiene notes (HF_HUB_DISABLE_IMPLICIT_TOKEN=1 + explicit --token, on-disk-token caveat), prerequisite note that PR-B #681 lands the HF_TOKEN= template line in voice/.env.example (existing operators with the token in their .env are unaffected).
  • Voxtral CC BY-NC stance (#521 partial) — local inference via vllm-omni, output served to end-users of the in-product voice agent. Not a public TTS API, not a commercial-API offering. The policy paragraph in the README is the project's stated position.
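
The sentinel-based idempotency in the first bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of provision-weights.sh: the function names (is_provisioned, provision_model), the MODELS_ROOT variable, and the error wording are hypothetical. Only the hf download <repo> --local-dir <target> invocation, the sentinel file names, and the skip log line come from this entry.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical variable: root directory for all model weights on the host.
MODELS_ROOT="${MODELS_ROOT:-/models}"

# A model counts as provisioned when its sentinel file exists and is non-empty.
is_provisioned() {
  local target="$1" sentinel="$2"
  [ -s "${target}/${sentinel}" ]
}

# Download one repo straight into its target directory; skip if already present.
provision_model() {
  local repo="$1" subdir="$2" sentinel="$3"
  local target="${MODELS_ROOT}/${subdir}"

  if is_provisioned "$target" "$sentinel"; then
    echo "${subdir}: already provisioned — skipping"
    return 0
  fi

  mkdir -p "$target"
  hf download "$repo" --local-dir "$target"

  # Fail loudly if the download did not produce the sentinel.
  if ! is_provisioned "$target" "$sentinel"; then
    echo "error: ${subdir}: sentinel ${sentinel} missing after download" >&2
    return 1
  fi
}
```

A call would look like provision_model <repo> voxtral-4b-tts params.json. The repo IDs are not listed in this entry, so they stay as placeholders here.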

Why

R-4 (#664) is the original "swap Voxtral weights" issue. The 2026-04-24 bring-up resolved it as a one-shot manual download via hf download on EC2, then cp -rL from the HF cache to /models/voxtral-4b-tts/ (the cp dance was a workaround for a CloudShell paste-wrap bug, not a fundamental requirement). That worked but left no repo-tracked artifact: the next operator on the next box would have to reverse-engineer the procedure from the bring-up handover.

A script captures the procedure as code, idempotent enough to re-run without harm, with a known-good list of model repos. It also factors in the two ergonomic lessons from 2026-04-24:

  1. hf not huggingface-cli — the latter prints a deprecation banner and refuses some commands.
  2. --local-dir directly, no cache-then-copy — avoids both the disk-doubling and the CloudShell paste-wrap class of bug.
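
Lesson 1 suggests checking for the hf binary up front rather than discovering a deprecated CLI mid-run. A possible preflight guard is sketched below; require_hf is a hypothetical name, and whether the real script performs such a check is not stated in this entry.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical preflight: abort early when the hf CLI is not on PATH.
require_hf() {
  if ! command -v hf >/dev/null 2>&1; then
    echo "error: 'hf' CLI not found; install a recent huggingface_hub (huggingface-cli is deprecated)" >&2
    return 1
  fi
}
```

Calling require_hf before the first download keeps a missing CLI from surfacing as a confusing failure partway through provisioning.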

Scope

  • voice/scripts/provision-weights.sh (new).
  • voice/scripts/README.md (new).
  • Changelog entry + meta.json.
  • Does NOT touch compose, Dockerfile, or .env* (separate PRs in the bring-up batch).
  • Does NOT delete legacy doc references (voice/DEPLOY-AWS.md, voice/CREDENTIALS-CHECKLIST.md) — that's PR-F (R-12 final closure).

Test plan

  • Static lint: bash -n voice/scripts/provision-weights.sh exits 0.
  • First run on a fresh box (no /models/ populated) with HF_TOKEN set in the environment — expect three downloads, sentinel files present, exit 0.
  • Re-run on the same box — expect three "already provisioned — skipping" log lines, no network calls, exit 0.
  • Subset run: bash voice/scripts/provision-weights.sh whisper (no token) — exit 0.
  • Subset run: bash voice/scripts/provision-weights.sh voxtral without token — exit 1 with clear "gated repo" error.
  • After provisioning, vllm-voxtral and vllm-guard start cleanly with the new compose (PR-D).
  • docker history voice-voice-agent | grep -iE 'hf_' prints nothing (no token leaked into the image; unrelated to this PR but worth confirming on the same EC2 redeploy).
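
The two subset cases above imply a per-model token check. One way such a gate might look is sketched here. The helper names (needs_token, check_token) and the set of gated keys are assumptions, and the real script's error text may differ; only the whisper and voxtral keys and their expected exit codes come from the test plan.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: does this model key need an HF token?
# Assumption: only voxtral is listed as gated, because the test plan covers
# just that case; other repos (for example Llama Guard) may be gated too.
needs_token() {
  case "$1" in
    voxtral) return 0 ;;
    *)       return 1 ;;
  esac
}

# Return 1 with a clear message when a gated model is requested without a token.
check_token() {
  local key="$1"
  if needs_token "$key" && [ -z "${HF_TOKEN:-}" ]; then
    echo "error: ${key} is a gated repo; set HF_TOKEN before provisioning" >&2
    return 1
  fi
}
```

With no token set, check_token whisper returns 0, while check_token voxtral returns 1 and prints the gated-repo error to stderr.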

Rollout / reversibility

Reversible: reverting the PR removes the script. No infra change. The script does not modify any existing files; it only writes to /models/, which is on the host filesystem, not in the repo.

Follow-ups

  • EC2 retro-application: once this PR + PR-B + PR-C + PR-D have all merged, the EC2 redeploy step in the bring-up plan replaces the prior 2026-04-24 manual hf download shell history with bash voice/scripts/provision-weights.sh. Same outcome, repo-tracked.
  • #521 final closure: the policy paragraph here closes the legal question. PR-F (R-12) cross-references this in voice-deploy.mdx.
