fragJulia
Changelog

2026-04-25 — VoxtralTTS.stream + BedrockMistralLLM.chat → sync (livekit-agents 1.5+)

Description

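Two custom plugins (voxtral_tts.py, bedrock_mistral_llm.py) declared their stream() and chat() methods as async def. livekit-agents 1.5+ calls them inside async with, which requires a sync method that returns an object implementing the async context manager protocol. An async def method returns a coroutine instead, and a coroutine has no __aenter__ / __aexit__. The framework also passes conn_options as a keyword argument, which the old stream() signature did not accept.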
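The failure mode can be reproduced without livekit at all. The sketch below uses hypothetical stand-in classes (FakeStream, AsyncDefPlugin, SyncDefPlugin are illustrative names, not part of either plugin) to show why an async def method cannot be used directly in async with, while a plain def returning the same object can.

```python
import asyncio
import warnings


class FakeStream:
    """Stand-in for a stream object that supports `async with`."""

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False


class AsyncDefPlugin:
    # Old shape: calling stream() yields a coroutine, not a FakeStream.
    async def stream(self):
        return FakeStream()


class SyncDefPlugin:
    # New shape: calling stream() yields the FakeStream directly.
    def stream(self):
        return FakeStream()


async def enter_stream(plugin):
    """Mimic the framework side: `async with plugin.stream()`."""
    try:
        async with plugin.stream():
            return "entered"
    except (TypeError, AttributeError) as exc:
        # TypeError on recent Python versions; older ones raise AttributeError.
        return type(exc).__name__


with warnings.catch_warnings():
    # The un-awaited coroutine triggers a RuntimeWarning; silence it here.
    warnings.simplefilter("ignore", RuntimeWarning)
    old_result = asyncio.run(enter_stream(AsyncDefPlugin()))
    new_result = asyncio.run(enter_stream(SyncDefPlugin()))

print(old_result, new_result)
```

Only the sync variant enters the context manager. The async def variant fails before __aenter__ is ever looked up on the stream, because the object handed to async with is the coroutine.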

Same class of API drift as PR-I (FasterWhisperSTT) and PR-J (event handlers). Surfaced during the R-10 #670 E2E re-probe immediately after PR-J deployed. The pipeline reached "Knotencheck agent ready — awaiting user interaction", the agent published a track, and the probe received audio frames, but two errors fired in the agent logs during reply generation:

ERROR:livekit.agents:Error in _llm_inference_task
  File "...livekit/agents/voice/agent.py", line 453, in llm_node
    async with activity_llm.chat(
TypeError: 'coroutine' object does not support the asynchronous context manager protocol

ERROR:livekit.agents:Error in _tts_inference_task
  File "...livekit/agents/voice/agent.py", line 480, in tts_node
    async with wrapped_tts.stream(conn_options=conn_options) as stream:
TypeError: VoxtralTTS.stream() got an unexpected keyword argument 'conn_options'

The audio the probe received was almost certainly framework-generated silence/placeholder, not Voxtral synthesis. With these two fixes, the actual reply generation path runs to completion.

What changed

voice/agent/custom_plugins/voxtral_tts.py

  • async def stream(self) -> tts.SynthesizeStream → def stream(self, *, conn_options: Any = None, **kwargs: Any) -> tts.SynthesizeStream
  • Body unchanged: still returns VoxtralSynthesizeStream(...). The returned instance inherits __aenter__ / __aexit__ from tts.SynthesizeStream, satisfying async with on the framework side.
  • Added Any to the existing from typing import AsyncIterator import.

voice/agent/custom_plugins/bedrock_mistral_llm.py

  • async def chat(self, *, chat_ctx, tools, **kwargs) → def chat(self, *, chat_ctx, tools=None, conn_options: Any = None, **kwargs)
  • Body unchanged: still builds the request body and returns BedrockMistralLLMStream(...). The returned instance inherits __aenter__ / __aexit__ from llm.LLMStream.
  • The conn_options argument is framework-managed (retries). It is accepted and ignored, matching the shape of the OpenAI plugin's chat().
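The keyword-argument half of the change can be sketched in isolation. The classes below are hypothetical stand-ins (OldTTS, NewTTS, and FakeStream are not the real plugin classes), and both are written as plain def so that only the signature difference is exercised: the old signature rejects the conn_options keyword the framework passes, the new one accepts and ignores it.

```python
from typing import Any


class FakeStream:
    """Hypothetical stand-in for the stream object the plugin returns."""


class OldTTS:
    # Old signature: no keyword parameters at all.
    def stream(self) -> FakeStream:
        return FakeStream()


class NewTTS:
    # New signature: conn_options is accepted and ignored; **kwargs absorbs
    # any further keywords the caller might pass.
    def stream(self, *, conn_options: Any = None, **kwargs: Any) -> FakeStream:
        return FakeStream()


def call_like_framework(plugin: Any) -> str:
    """Call stream() with the keyword seen in the traceback."""
    try:
        plugin.stream(conn_options=object())
        return "ok"
    except TypeError as exc:
        return str(exc)


old_outcome = call_like_framework(OldTTS())
new_outcome = call_like_framework(NewTTS())
print(old_outcome)
print(new_outcome)
```

Accepting **kwargs trades strictness for tolerance: a misspelled keyword would be swallowed silently, which seems acceptable here because the only caller is the framework.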

Cross-check (read-only)

Cross-referenced against livekit-plugins-openai/llm.py on livekit/agents@main. The reference chat() is:

def chat(
    self,
    *,
    chat_ctx: ChatContext,
    tools: list[llm.Tool] | None = None,
    conn_options: APIConnectOptions = DEFAULT_API_CONNECT_OPTIONS,
    parallel_tool_calls: NotGivenOr[bool] = NOT_GIVEN,
    tool_choice: NotGivenOr[ToolChoice] = NOT_GIVEN,
    response_format: ...,
    extra_kwargs: NotGivenOr[dict[str, Any]] = NOT_GIVEN,
) -> LLMStream:

This is a plain def, not async def. Our minimal port matches the sync-ness and the conn_options parameter; everything else (parallel_tool_calls, tool_choice, etc.) is OpenAI-specific and not relevant to Bedrock.

Test plan

  • Tests don't call chat() or stream() directly (only init paths) — verified via grep over voice/tests/.
  • Image rebuild on EC2 succeeds (~3 s; only the COPY layer regenerates).
  • After recreating voice-agent with --force-recreate and re-running /tmp/probe2_e2e.py, the agent logs contain neither the coroutine ... async context manager error nor the unexpected keyword argument 'conn_options' error.
  • _llm_inference_task runs to completion, calling Bedrock; _tts_inference_task runs to completion, synthesizing via Voxtral.
  • Probe receives non-silent audio response (length corresponds to TTS output, not framework placeholder).
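Since the existing tests do not call chat() or stream(), one possible cheap guard against this class of drift is a signature check that needs no network or framework runtime. The sketch below is an assumption about how such a check could look, not part of this PR; check_sync_with_conn_options, StandInLLM, and StaleLLM are hypothetical names, and in a real test the stand-ins would be replaced by the actual plugin methods.

```python
import inspect
from typing import Any


def check_sync_with_conn_options(method: Any) -> list[str]:
    """Return a list of problems; empty means the method has the expected shape."""
    problems = []
    if inspect.iscoroutinefunction(method):
        problems.append("declared async def")
    params = inspect.signature(method).parameters
    if "conn_options" not in params:
        problems.append("missing conn_options parameter")
    return problems


class StandInLLM:
    # Hypothetical class with the new signature shape.
    def chat(self, *, chat_ctx: Any, tools: Any = None,
             conn_options: Any = None, **kwargs: Any) -> object:
        return object()


class StaleLLM:
    # Hypothetical class with the old signature shape.
    async def chat(self, *, chat_ctx: Any, tools: Any, **kwargs: Any) -> object:
        return object()


good = check_sync_with_conn_options(StandInLLM.chat)
bad = check_sync_with_conn_options(StaleLLM.chat)
print(good, bad)
```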

Rollout / reversibility

Reversible via revert. EC2 image rebuild + recreate (~10 s combined).

Out of scope

  • turn_detection= deprecation warning on AgentSession(...) (still backward-compat, follow-up).
  • Production wiring (release-gated).
