feat(minimax): port MiniMax TTS plugin from Python agents by toubatbrian · Pull Request #1287 · livekit/agents-js

toubatbrian · 2026-04-22T06:29:02Z

Summary

Ports the MiniMax TTS plugin from livekit/agents (Python) into agents-js.

Triggered by an automated routine that observed the merged PR livekit/agents#5518 ("(minimax): add new TTS models"). The Python change simply added speech-2.8-hd and speech-2.8-turbo to the already-existing MiniMax plugin. Because the MiniMax plugin did not yet exist in agents-js, this PR creates the full plugin scaffold (including the new 2.8 models) rather than attempting to land the two model literals alone.

Closes the JS-side gap with livekit-plugins-minimax (Python).

What's included

New workspace package @livekit/agents-plugin-minimax at plugins/minimax/:

package.json, tsconfig.json, tsup.config.ts, api-extractor.json, README.md — matches the conventions of existing plugins (e.g. rime, cartesia, neuphonic).
src/index.ts — plugin registration.
src/models.ts — literal types (TTSModel, TTSVoice, TTSEmotion, TTSLanguageBoost, TTSSampleRate), defaults (DEFAULT_MODEL, DEFAULT_VOICE_ID, DEFAULT_BASE_URL). Includes the new speech-2.8-hd and speech-2.8-turbo model strings from livekit/agents#5518 ("(minimax): add new TTS models").
src/tts.ts:
- TTS class with the same capability surface as the Python version (streaming: true, alignedTranscript: false).
- ChunkedStream: one-shot synthesis via HTTP SSE (POST /v1/t2a_v2, stream: true, exclude_aggregated_audio: true), hex-decoding the audio chunks and pushing PCM frames into an AudioByteStream.
- SynthesizeStream: real-time WebSocket synthesis via /ws/v1/t2a_v2 with the task_start / task_continue / task_finish event protocol. Uses a sentence tokenizer to chunk incoming text.
- updateOptions() parity with the Python update_options.
- Input validation mirrors Python: speed ∈ [0.5, 2.0], intensity ∈ [-100, 100], timbre ∈ [-100, 100], and the fluent emotion is only accepted for speech-2.6-* models.
- Error surfacing via APIConnectionError / APIStatusError / APITimeoutError / APIError with MiniMax trace_id propagation on both HTTP and WS paths.
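
The validation rules listed above can be sketched roughly as follows. This is a minimal illustration, not the code in src/tts.ts: the `MiniMaxVoiceOptions` interface, the `assertInRange` helper, and the error types are assumptions made for the example; only the ranges and the fluent/speech-2.6-* rule come from the description.

```typescript
// Hypothetical option shape; field names follow the PR description, not the actual source.
interface MiniMaxVoiceOptions {
  model: string;
  speed?: number;
  intensity?: number;
  timbre?: number;
  emotion?: string;
}

// Throws when a provided value falls outside [min, max]; NaN is rejected as well.
function assertInRange(name: string, value: number | undefined, min: number, max: number): void {
  if (value !== undefined && !(value >= min && value <= max)) {
    throw new RangeError(`${name} must be between ${min} and ${max}, got ${value}`);
  }
}

// Sketch of a local validation pass mirroring the constraints named in the description.
function validateOptions(opts: MiniMaxVoiceOptions): void {
  assertInRange('speed', opts.speed, 0.5, 2.0);
  assertInRange('intensity', opts.intensity, -100, 100);
  assertInRange('timbre', opts.timbre, -100, 100);
  if (opts.emotion === 'fluent' && !opts.model.startsWith('speech-2.6-')) {
    throw new Error(`emotion "fluent" is only accepted for speech-2.6-* models, got ${opts.model}`);
  }
}
```

Calling `validateOptions({ model: 'speech-2.8-hd', emotion: 'fluent' })` throws under this sketch, while the same emotion with a speech-2.6-* model passes.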

Implementation notes where JS differs from Python

Code-level parity is mostly 1:1, except:

Audio format is restricted to PCM. The Python plugin exposes an audio_format option with pcm | mp3 | flac | wav. @livekit/agents's AudioByteStream is designed around raw PCM samples. Decoding MP3/FLAC/WAV on the fly would require pulling in an external decoder and wiring it through the TTS pipeline, which is out of scope for the initial port. format: "pcm" is always sent on the wire, and the incoming hex audio is fed straight into AudioByteStream. The public bitrate option is still accepted for API parity, but is effectively ignored by MiniMax when format=pcm.
No SentenceStreamPacer. Python's optional text_pacing (a sentence-level pacer that coordinates with the audio emitter) does not have a counterpart in @livekit/agents at the moment, so the option is omitted. Users can still pass a custom SentenceTokenizer via tokenizer.
HTTP client. Python uses aiohttp; the JS port uses fetch for the chunked HTTP path (matches elevenlabs) and ws for the streaming path (matches cartesia / neuphonic).
Request ID / trace ID. Python extracts Trace-Id / X-Trace-Id from response headers and trace_id from the body (both root.trace_id and base_resp.trace_id). JS does the same, preferring header, falling back to body.
py.typed marker is not applicable in the JS world; types ship via the generated dist/index.d.ts.
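
The header-first, body-second trace ID lookup described above might look like the sketch below. The `HeaderLookup` and `MiniMaxBody` types and the `extractTraceId` name are assumptions for illustration, and the relative order of Trace-Id before X-Trace-Id (and of root trace_id before base_resp.trace_id) is a guess; the description only states that headers are preferred over the body.

```typescript
// Minimal structural type so the sketch works with fetch's Headers or a test stub.
interface HeaderLookup {
  get(name: string): string | null;
}

// Hypothetical body shape based on the two fields named in the PR description.
interface MiniMaxBody {
  trace_id?: string;
  base_resp?: { trace_id?: string };
}

// Prefer a header value; fall back to the body fields when no header is present.
function extractTraceId(headers: HeaderLookup, body?: MiniMaxBody): string | undefined {
  return (
    headers.get('Trace-Id') ??
    headers.get('X-Trace-Id') ??
    body?.trace_id ??
    body?.base_resp?.trace_id ??
    undefined
  );
}
```

With a fetch `Response`, this would be called as `extractTraceId(response.headers, parsedJson)`.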

Test plan

pnpm install at repo root picks up the new workspace.
pnpm --filter @livekit/agents-plugin-minimax build succeeds.
pnpm --filter @livekit/agents-plugin-minimax lint is clean.
Manual smoke test: new TTS({ apiKey }).synthesize("hello world") returns PCM audio.
Manual smoke test: new TTS({ apiKey }).stream() with a pushed sentence emits framed PCM chunks end-to-end.
Verify that passing emotion: 'fluent' with a non-speech-2.6-* model throws, matching Python behavior.

This is an automated port from livekit/agents#5518 by the Claude Code automation routine (experimental).

cc @toubatbrian @livekit/agent-devs

changeset-bot · 2026-04-22T06:29:07Z

🦋 Changeset detected

Latest commit: c34b65d

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 27 packages

Name	Type
@livekit/agents-plugin-minimax	Major
@livekit/agents	Major
@livekit/agents-plugin-anam	Major
@livekit/agents-plugin-assemblyai	Major
@livekit/agents-plugin-baseten	Major
@livekit/agents-plugin-bey	Major
@livekit/agents-plugin-cartesia	Major
@livekit/agents-plugin-cerebras	Major
@livekit/agents-plugin-deepgram	Major
@livekit/agents-plugin-elevenlabs	Major
@livekit/agents-plugin-google	Major
@livekit/agents-plugin-hedra	Major
@livekit/agents-plugin-inworld	Major
@livekit/agents-plugin-lemonslice	Major
@livekit/agents-plugin-livekit	Major
@livekit/agents-plugin-mistral	Major
@livekit/agents-plugin-neuphonic	Major
@livekit/agents-plugin-openai	Major
@livekit/agents-plugin-phonic	Major
@livekit/agents-plugin-resemble	Major
@livekit/agents-plugin-rime	Major
@livekit/agents-plugin-runway	Major
@livekit/agents-plugin-sarvam	Major
@livekit/agents-plugin-silero	Major
@livekit/agents-plugin-trugen	Major
@livekit/agents-plugin-xai	Major
@livekit/agents-plugins-test	Major

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 05c9ccdba6

chatgpt-codex-connector · 2026-04-22T06:36:31Z

+          const audioHex = data?.data?.audio as string | undefined;
+          if (audioHex) {
+            const audio = hexToBuffer(audioHex);
+            for (const frame of bstream.write(audio.buffer)) {


Preserve byte offsets when writing decoded audio

Both synthesis paths decode hex with Buffer.from(...) and then pass audio.buffer to AudioByteStream. In Node this Buffer is typically a slice of a pooled ArrayBuffer (byteOffset is non-zero), so using .buffer feeds unrelated bytes before/after the actual chunk, which corrupts PCM output and frame boundaries. Pass the Buffer/view directly (or slice the backing buffer with byteOffset and byteLength) here and in the matching WebSocket path.
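
The failure mode can be reproduced without Node's Buffer pool by using a plain typed-array view, since a Node Buffer is a Uint8Array subclass. The sketch below simulates a pooled allocation with `subarray`; the function names are made up for the example and the byte values are arbitrary.

```typescript
// Simulate a pooled allocation: the 4-byte chunk lives inside a larger ArrayBuffer.
const pool = new Uint8Array([0xde, 0xad, 0x01, 0x02, 0x03, 0x04, 0xbe, 0xef]);
const chunk = pool.subarray(2, 6); // byteOffset = 2, byteLength = 4

// Buggy: `.buffer` is the whole backing store, so neighbouring bytes leak in.
function bytesFromBufferProperty(view: Uint8Array): Uint8Array {
  return new Uint8Array(view.buffer);
}

// Correct: restrict the window to the view's own offset and length.
function bytesFromView(view: Uint8Array): Uint8Array {
  return new Uint8Array(view.buffer, view.byteOffset, view.byteLength);
}

const leaked = bytesFromBufferProperty(chunk); // all 8 pool bytes
const exact = bytesFromView(chunk); // only the 4 chunk bytes: 01 02 03 04
```

Passing the view itself to a consumer that honours byteOffset and byteLength has the same effect as `bytesFromView`.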

chatgpt-codex-connector · 2026-04-22T06:36:31Z

@@ -0,0 +1,52 @@
+{
+  "name": "@livekit/agents-plugin-minimax",


Update lockfile when adding the new workspace package

This commit introduces plugins/minimax/package.json as a new workspace package, but pnpm-lock.yaml was not regenerated to include a plugins/minimax importer. Because CI runs pnpm install --frozen-lockfile in build/test workflows, installs will fail when the lockfile is out of sync with workspace manifests. Regenerating and committing the lockfile with this package is required to keep CI green.

- Regenerate pnpm-lock.yaml so frozen-lockfile install finds the new minimax workspace package (otherwise Build/Test/Formatting all fail at install time).
- Add MINIMAX_API_KEY and MINIMAX_BASE_URL to turbo.json globalEnv so eslint-config-turbo stops rejecting the env var references.
- Pass the Node.js Buffer directly to AudioByteStream.write instead of unwrapping to Buffer.buffer. Node pools small Buffers inside a larger ArrayBuffer, so .buffer exposed ~8KB of unrelated pool memory and corrupted PCM output (flagged by Devin and Codex review).
- Fix three TS2322 errors: APIStatusError.options.body must be an object or null, not a JSON string.
- Add tts.test.ts with the standard 'skip when API key missing' pattern used by rime/cartesia/neuphonic.
- Add changeset entry.

toubatbrian · 2026-04-22T06:45:30Z

Thanks for the reviews @devin-ai-integration @chatgpt-codex-connector — addressed in 105f587:

Buffer pool bug (both HTTP + WebSocket paths): switched bstream.write(audio.buffer) → bstream.write(audio) so the ArrayBufferView branch in AudioByteStream.write correctly uses byteOffset/byteLength and avoids feeding unrelated pooled memory.
Lockfile: regenerated pnpm-lock.yaml with the new plugins/minimax importer so the CI pnpm install --frozen-lockfile step succeeds.

Also fixed while in the area:

Added MINIMAX_API_KEY / MINIMAX_BASE_URL to turbo.json globalEnv (turbo lint rule).
Fixed three TS2322 errors: APIStatusError.options.body must be an object | null, not a JSON string.
Added plugins/minimax/src/tts.test.ts matching the skip-when-no-API-key pattern used by rime/cartesia/neuphonic.
Added a changeset.

Generated by Claude Code

chatgpt-codex-connector · 2026-04-22T06:47:57Z

Summary

Added a MiniMax regression test that validates buffer-view byte-offset handling by passing a sliced Buffer into AudioByteStream.write() and asserting only the intended PCM bytes are decoded; this protects against reintroducing the .buffer pooled-memory corruption issue noted in review. plugins/minimax/src/tts.test.tsL22-L34
Updated test imports to include expect and AudioByteStream needed for the new regression case. plugins/minimax/src/tts.test.tsL4-L6

Testing

⚠️ pnpm install --frozen-lockfile (failed in this environment: onnxruntime-node postinstall download hit ENETUNREACH, so install did not complete).
⚠️ pnpm test -- plugins/minimax/src/tts.test.ts (could not run because vitest was unavailable after the failed install).
✅ git commit -m "test(minimax): add regression for buffer view byte offsets"

CLAassistant · 2026-04-22T06:53:38Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ toubatbrian
❌ claude

- updateOptions now re-validates merged opts via an extracted validateOptions helper, so speed/intensity/timbre/emotion+model constraints surface locally rather than as a server-side error. (Python's update_options silently assigns; the JS port tightens this.)
- Add the CLAUDE.md-required // Ref: python comments to every ported symbol in src/tts.ts and src/models.ts so reviewers can cross-check each function/type against the Python source.

Promise.race between taskStarted.await and the setTimeout-based timeout promise leaked the timer. On successful task_start, the timer still fired and called reject() on an already-settled race, producing an unhandled promise rejection. Capture the handle and clearTimeout in a finally block. Matches the pattern used by waitForWebSocketOpen later in the same file and by cartesia.
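
A minimal sketch of the capture-and-clear pattern this commit describes is shown below. The `withTimeout` name and signature are assumptions for illustration and are not taken from the plugin source; the point is only that the timer handle is cleared in `finally`, so it does not stay pending after the raced promise settles.

```typescript
// Race a promise against a timer, always clearing the timer once the race settles.
async function withTimeout<T>(promise: Promise<T>, ms: number, message: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(message)), ms);
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    // Runs on both success and failure, so the timer never fires late.
    clearTimeout(timer);
  }
}
```

A caller would wrap the awaited task_start signal, for example `await withTimeout(taskStartedPromise, 10_000, 'task_start timed out')`, where `taskStartedPromise` and the 10 second budget are placeholders.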

#tokenStream is created once in the constructor; the TTS base class retries run() on retryable errors (timeout, 5xx, 429), so closing the stream here made every retry push into a closed stream and silently drop user input. Only close() closes it now, matching how cartesia handles its tokenizer stream.

…tryable
- All new files now use SPDX-FileCopyrightText: 2026 per CLAUDE.md.
- When the server returns a non-zero base_resp.status_code, pass retryable: false to APIStatusError. These are MiniMax app-level codes (e.g. 1002 invalid param, 1004 auth), not HTTP status codes, so the default retryability heuristic would incorrectly retry permanent errors. Applies to both the HTTP SSE and WebSocket paths.
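
The non-retryable handling of app-level codes can be illustrated with the sketch below. `AppStatusError` is a local stand-in, not the real APIStatusError from @livekit/agents (whose constructor may differ), and the `status_msg` field and `checkBaseResp` name are assumptions made for the example.

```typescript
// Stand-in error type for illustration only; the real class lives in @livekit/agents.
class AppStatusError extends Error {
  statusCode: number;
  retryable: boolean;
  body: object | null;

  constructor(message: string, statusCode: number, retryable: boolean, body: object | null) {
    super(message);
    this.name = 'AppStatusError';
    this.statusCode = statusCode;
    this.retryable = retryable;
    this.body = body;
  }
}

// Assumed payload shape: only base_resp.status_code is named in the commit message.
interface MiniMaxPayload {
  base_resp?: { status_code?: number; status_msg?: string };
}

// Treat any non-zero app-level code as a permanent failure (retryable: false).
function checkBaseResp(payload: MiniMaxPayload): void {
  const code = payload.base_resp?.status_code ?? 0;
  if (code !== 0) {
    const message = payload.base_resp?.status_msg ?? `MiniMax error ${code}`;
    // The body is passed as an object, not a JSON string.
    throw new AppStatusError(message, code, false, payload);
  }
}
```

Under this sketch a payload with `base_resp.status_code` of 1004 throws an error whose `retryable` flag is false, so a retry loop keyed on that flag would stop immediately.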

toubatbrian added 8 commits April 22, 2026 14:25

feat(minimax): add package.json

e5fd90b

feat(minimax): add tsconfig.json

fd6a9d9

feat(minimax): add tsup config

d40a2ac

feat(minimax): add api-extractor config

b696a86

feat(minimax): add README

f8b2bc3

feat(minimax): add plugin entry point

0b14ea5

feat(minimax): add models and literal types

1463303

feat(minimax): add TTS implementation (HTTP SSE + WebSocket streaming)

05c9ccd

toubatbrian mentioned this pull request Apr 22, 2026

(minimax): add new TTS models livekit/agents#5518

Merged

toubatbrian wants to merge 13 commits into main from claude/jolly-lovelace-blf3Z
