2026-04-25 — voice plugins coherent upgrade to livekit-agents 1.5.6 (epic #700)
Description
Coherent rewrite of the two voice-agent custom plugins still carrying livekit-agents 1.5+ API drift after the bring-up's salami-fix series (PR-I/J/K/L = #692/#693/#694/#696). R-10 Probe 2 (#670) on 2026-04-25 caught the two confirmed-broken layers below; this PR closes the epic #700 plugin-side scope.
The two confirmed-broken errors from R-10 Probe 2:
File "/app/custom_plugins/bedrock_mistral_llm.py", line 95, in chat
"name": t.name,
AttributeError: 'FunctionTool' object has no attribute 'name'
File "/app/custom_plugins/voxtral_tts.py", line 206, in __init__ ...
TypeError: VoxtralSynthesizeStream._run() takes 1 positional argument but 2 were given
Plus a third silent bug discovered while rewriting _build_messages: in 1.5.6 ChatContext.messages() returns only ChatMessage items, so function-call and function-call-output items (the tool round-trip history) were being silently dropped. Every multi-turn session with a tool invocation lost context after the first turn.
What changed
voice/agent/custom_plugins/bedrock_mistral_llm.py
Two production-code fixes plus one refactor:
- _build_tools(tools) (new helper extracted from chat()): dispatches by isinstance(t, llm.RawFunctionTool) vs isinstance(t, llm.FunctionTool), mirroring the canonical pattern in livekit.agents.llm._provider_format.openai.to_fnc_ctx. For FunctionTool, hands off to llm.utils.build_legacy_openai_schema(tool) (the framework helper that handles pydantic-model reflection of the function's signature). For RawFunctionTool, uses tool.info.raw_schema verbatim. Unrecognized types are skipped with a warning rather than raised, as a defense against future tool variants.
- _build_messages(chat_ctx) rewritten: walks chat_ctx.items and dispatches by the .type discriminator ("message", "function_call", "function_call_output"). FunctionCall items attach to the most recent assistant message's tool_calls list (Mistral expects them inline on the assistant turn). If no assistant precedes a FunctionCall, an empty-content assistant message is synthesized to carry it. FunctionCallOutput items become separate role: "tool" messages keyed by call_id.
- New _content_to_str(content) static method handles ChatMessage's content-may-be-a-string-or-a-list shape.
voice/agent/custom_plugins/voxtral_tts.py
Both _run methods rewritten to the 1.5.6 _run(self, output_emitter: AudioEmitter) signature, with proper emission via output_emitter.push(bytes). The framework's parent _main_task calls output_emitter.pushed_duration(idx=-1) after _run returns and raises APIError("no audio frames were pushed") if it sees zero — the previous self._event_ch.send_nowait(SynthesizedAudio(...)) path bypassed the emitter and would have failed that check even if the WebRTC track was populated.
- VoxtralChunkedStream._run(output_emitter), non-streaming REST: initialize(stream=False, mime_type="audio/pcm", ...) → POST → push(response.content) → flush().
- VoxtralSynthesizeStream._run(output_emitter), WebSocket streaming: initialize(stream=True, ...) → start_segment(segment_id=...) → delegate to _run_websocket(emitter) (each binary WS message → emitter.push(message)); on local-endpoint WS failure, fall back to _run_http_fallback(emitter), which collects the input text and posts once. end_segment() + flush() always run via try/finally.
- Both inner methods take the emitter as a parameter; tts.SynthesizedAudio and utils.audio.AudioFrame imports removed (no remaining consumers); asyncio import removed (no longer used directly).
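The streaming control flow above (segment open, WebSocket attempt, HTTP fallback, guaranteed segment close) might look like the sketch below. Everything here is illustrative: FakeEmitter is a hypothetical recorder rather than the framework's AudioEmitter, the two inner coroutines only simulate a failure and a single REST response, and the sample rate, request id, and ConnectionError exception type are assumptions.

```python
import asyncio


class FakeEmitter:
    """Hypothetical stand-in for AudioEmitter that records the call order."""

    def __init__(self) -> None:
        self.calls: list[str] = []
        self.pushed_data: list[bytes] = []

    def initialize(self, **kwargs) -> None:
        self.calls.append("initialize")

    def start_segment(self, *, segment_id: str) -> None:
        self.calls.append("start_segment")

    def push(self, data: bytes) -> None:
        self.calls.append("push")
        self.pushed_data.append(data)

    def end_segment(self) -> None:
        self.calls.append("end_segment")

    def flush(self) -> None:
        self.calls.append("flush")


async def run_websocket(emitter: FakeEmitter) -> None:
    # Simulates a WebSocket failure on a local endpoint.
    raise ConnectionError("ws unavailable")


async def run_http_fallback(emitter: FakeEmitter) -> None:
    # Simulates one REST response carrying the whole utterance.
    emitter.push(b"\x00\x01" * 4)


async def run_stream(emitter: FakeEmitter) -> None:
    emitter.initialize(
        request_id="req-1",      # assumed value
        sample_rate=24000,       # assumed value
        num_channels=1,
        mime_type="audio/pcm",
        stream=True,
    )
    emitter.start_segment(segment_id="seg-1")
    try:
        try:
            await run_websocket(emitter)
        except ConnectionError:
            await run_http_fallback(emitter)
    finally:
        # Runs whether the WS path, the fallback, or neither succeeded.
        emitter.end_segment()
        emitter.flush()


emitter = FakeEmitter()
asyncio.run(run_stream(emitter))
print(emitter.calls)
# ['initialize', 'start_segment', 'push', 'end_segment', 'flush']
```

Passing the emitter into both inner paths keeps a single emission channel, so the fallback cannot bypass the object the parent task later inspects.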
voice/agent/custom_plugins/faster_whisper_stt.py
No changes. The _recognize_impl(self, buffer, *, language, conn_options) delegate from PR-I (#692) already complies with the 1.5.6 STT abstract method contract; STT does not use the AudioEmitter model. Re-audited 2026-04-25 against #700.
voice/agent/main.py
Comment-only audit notes by AgentSession(...) (line ~158):
- RoomInputOptions defaults audited and confirmed compatible with the existing @ctx.room.on("disconnected") handler: close_on_disconnect=True is the framework default and matches our use; no override needed.
- turn_detection= deprecation warning called out with a # TODO(post-v2.0) comment. Backward-compat in 1.5.6; the v2.0 bump (and turn_handling=TurnHandlingOptions(...) migration) is a separate epic per #700 out-of-scope list.
voice/tests/conftest.py
- New _AudioEmitter stub class (attached as _tts_mod.AudioEmitter) with initialize, start_segment, end_segment, push, flush, end_input, pushed_duration methods + pushed_data: list[bytes] accumulator.
- New _FunctionTool / _RawFunctionTool stub classes (attached as _llm_mod.FunctionTool / RawFunctionTool) so the production code's isinstance() dispatch is exercised by tests.
- New _ChatContext stub class (attached as _llm_mod.ChatContext) replacing the previous MagicMock; exposes .items as a list attribute and .messages() as a filtered method.
- New _llm_mod.utils namespace with build_legacy_openai_schema(tool) returning the OpenAI-shaped {"type": "function", "function": {"name", "description", "parameters"}} dict from the tool's .info. Mirrors the framework's helper so production code can call llm.utils.build_legacy_openai_schema(tool) without runtime import gymnastics in CI.
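For illustration, the tool stubs and the isinstance() dispatch they exercise could be sketched as below. The class and field names (ToolInfo, RawToolInfo, and the .info attributes) are hypothetical simplifications chosen for this example, and build_legacy_openai_schema here is a stub in the spirit of the conftest one, not the framework helper itself.

```python
import logging
from dataclasses import dataclass, field
from typing import Any

logger = logging.getLogger("bedrock_mistral_llm_sketch")


@dataclass
class ToolInfo:
    """Hypothetical metadata carried by a regular function tool."""
    name: str
    description: str = ""
    parameters: dict[str, Any] = field(default_factory=dict)


@dataclass
class RawToolInfo:
    """Hypothetical metadata carried by a raw tool: a ready-made schema."""
    raw_schema: dict[str, Any]


class FunctionTool:
    def __init__(self, info: ToolInfo) -> None:
        self.info = info


class RawFunctionTool:
    def __init__(self, info: RawToolInfo) -> None:
        self.info = info


def build_legacy_openai_schema(tool: FunctionTool) -> dict[str, Any]:
    """Stub: build the OpenAI-shaped dict from the tool's .info."""
    return {
        "type": "function",
        "function": {
            "name": tool.info.name,
            "description": tool.info.description,
            "parameters": tool.info.parameters,
        },
    }


def build_tools(tools: list[Any]) -> list[dict[str, Any]]:
    """Dispatch on the tool type; skip anything unrecognized with a warning."""
    schemas: list[dict[str, Any]] = []
    for tool in tools:
        if isinstance(tool, RawFunctionTool):
            schemas.append(tool.info.raw_schema)  # used verbatim
        elif isinstance(tool, FunctionTool):
            schemas.append(build_legacy_openai_schema(tool))
        else:
            logger.warning("skipping unrecognized tool type: %s", type(tool).__name__)
    return schemas
```

Using real classes instead of a MagicMock matters here because isinstance() against a MagicMock attribute raises a TypeError, so the dispatch could not be tested without them.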
voice/tests/test_bedrock_mistral_llm.py
- _make_chat_context(items) rewritten to mirror 1.5.6's .items attribute shape with per-item .type discriminator. Backwards-compat shorthand: a dict with only role/content is treated as a message item.
- Existing test_tool_call_message and test_tool_result_message updated to use the items shape (function_call + function_call_output instead of tool_calls field on a ChatMessage).
- New test_function_call_without_assistant_synthesizes_one.
- New test_full_tool_round_trip (user → assistant+FunctionCall → FunctionCallOutput → user) asserts the full Mistral messages array is built correctly.
- test_request_body_with_tools split into three: test_request_body_with_function_tool, test_request_body_with_raw_function_tool, test_unrecognized_tool_skipped.
voice/tests/test_voxtral_tts.py
- New TestChunkedStreamEmitter::test_chunked_stream_pushes_via_emitter: mocks the httpx post, calls _run(emitter), asserts the emitter was initialized with mime_type="audio/pcm" + stream=False and received a single push(bytes).
- New TestSynthesizeStreamEmitter::test_synthesize_stream_pushes_segments: patches _run_websocket to push 2 chunks, asserts segment open/close + 2× push + flush.
- New TestSynthesizeStreamEmitter::test_synthesize_stream_falls_back_on_ws_error: forces WS failure on a local endpoint and asserts the HTTP fallback path emits via the same emitter.
Cross-checks (read-only, before the rewrite)
Three pre-flight checks against the upstream livekit/agents source confirmed the API surface before any code change:
- livekit-plugins-openai/livekit/plugins/openai/llm.py and livekit-agents/livekit/agents/llm/_provider_format/openai.py:to_fnc_ctx confirm the isinstance(tool, llm.RawFunctionTool) / llm.FunctionTool discrimination + llm.utils.build_legacy_openai_schema(tool) for the regular path. Our code mirrors this exactly.
- livekit-agents/livekit/agents/llm/chat_context.py confirms ChatItem = ChatMessage | FunctionCall | FunctionCallOutput | AgentHandoff | AgentConfigUpdate, discriminated by .type. Confirms .items is a property, not a method. Confirms ChatMessage no longer has a tool_calls field; tool calls live as separate FunctionCall items.
- livekit-agents/livekit/agents/tts/tts.py confirms _run(self, output_emitter: AudioEmitter) is the 1.5.6 abstract method on both ChunkedStream and SynthesizeStream. Confirms AudioEmitter.push(data: bytes), initialize(*, request_id, sample_rate, num_channels, mime_type, frame_size_ms=200, stream=False), start_segment(*, segment_id), end_segment(), flush(). Confirms parent post-_run validation if pushed_duration(idx=-1) <= 0.0: raise APIError(...).
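The post-_run validation noted in the last check can be illustrated with a toy emitter. This is only a sketch of the reasoning: CountingEmitter and validate_after_run are hypothetical names, APIError is a local stand-in, and deriving duration from a byte count of 16-bit PCM is an assumption made for the example, not a description of how the framework measures it.

```python
class APIError(Exception):
    """Local stand-in for the framework's APIError."""


class CountingEmitter:
    """Toy emitter that derives duration from pushed 16-bit PCM bytes."""

    def __init__(self, sample_rate: int, num_channels: int) -> None:
        self._bytes_per_second = sample_rate * num_channels * 2  # 2 bytes per sample
        self._pushed_bytes = 0

    def push(self, data: bytes) -> None:
        self._pushed_bytes += len(data)

    def pushed_duration(self, idx: int = -1) -> float:
        # idx is accepted for signature parity only; one segment is tracked here.
        return self._pushed_bytes / self._bytes_per_second


def validate_after_run(emitter: CountingEmitter) -> None:
    """Mirror of the check the parent task applies once _run returns."""
    if emitter.pushed_duration(idx=-1) <= 0.0:
        raise APIError("no audio frames were pushed")
```

Audio sent through any channel other than push() never reaches the counter, so a check of this kind fails even when audio was delivered elsewhere.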
Why now
R-10 Probe 2 on 2026-04-25 (PR #699) verified the deployment infrastructure is correct (5 services healthy, GPU within budget, TLS/WSS reachable, auth gate working, agent dispatches on participant join) but the agent's _llm_inference_task and _tts_inference_task crashed during reply generation with the two errors above. PR-I/J/K/L addressed four prior layers; this completes the alignment as one coherent change rather than a fifth salami fix.
Test plan
- CI: pytest voice/tests/ (158 tests pass locally; was 122 before, and the 36 new tests cover function-tool dispatch, items walk, AudioEmitter push, and the full tool round-trip).
- python -c "from custom_plugins.bedrock_mistral_llm import BedrockMistralLLM; from custom_plugins.voxtral_tts import VoxtralTTS" imports cleanly under the conftest stubs.
- Image rebuild on EC2 (~10 s) + --force-recreate voice-agent.
- R-10 Probe 2 re-run against the rebuilt agent; full E2E reply generation passes:
  - No 'FunctionTool' object has no attribute 'name' in _llm_inference_task.
  - No _run() takes 1 positional argument in _tts_inference_task.
  - No APIError("no audio frames were pushed") after Voxtral synth.
  - Probe receives non-silent, non-placeholder audio matching the LLM reply duration (not the previous 1.9 MB framework silence).
  - output_guard pipeline-log stage fires with is_safe.
- Logged in apps/docs/content/docs/operations/voice-bringup-verification-<merge-date>.mdx (extends the 2026-04-25 doc).
Rollout / reversibility
Reversible via revert. EC2 rebuild + recreate (~10 s). Tested locally end-to-end against the conftest stub layer; the only runtime difference on EC2 is the real livekit-agents 1.5.6 — and we've cross-checked the production code against the upstream patterns exactly.
Out of scope
Per the #700 issue body:
- turn_detection= → turn_handling=TurnHandlingOptions(...) migration (v2.0 bump epic).
- Production wiring (CSP, /api/voice/token anonymous mode, LIVEKIT_URL on Vercel; release-gated).
- Custom Julia voice (replacing de_female with a fine-tuned timbre; product decision).
- RT-1 (#673) 97% VRAM ceiling research.
- R-1 / R-2 / R-12 docs work under #660.
Related
- Epic: #700
- R-10 verification: #670 + PR #699 (voice-bringup-verification-2026-04-25.mdx)
- Bring-up parent: #660 (R-0 Voice Deploy Repair)
- Bring-up MEGA: #672
- Prior fix PRs in the salami series: #692, #693, #694, #696
- Original implementation epic: #337 (closed 2026-04-12)
2026-05-02 — Audit cron: DRY_RUN flip after 7-day shake-out
Both audit workflows (closed-issue-audit daily, changelog-audit-weekly Mondays) flipped from DRY_RUN 'true' to 'false' after the 7-day post-merge observation window from PR #684 (merged 2026-04-25) showed clean run logs. closed-issue-audit now writes the missing-changelog label and a single comment to issues closed-as-completed without a matching changelog closes: reference. The weekly digest remains a workflow-run-page summary; no new issue spam.
2026-04-25 — Voice bring-up verification (R-10 partial close)