feat: add multilingual TTS support via ephone (eSpeak-NG WASM) by Dongshan-git · Pull Request #313 · hexgrad/kokoro

Dongshan-git · 2026-04-09T02:10:56Z

Summary

Replace the English-only phonemizer package with ephone v1.0.2, an eSpeak-NG WASM wrapper with built-in language packs for 9 languages (en-US, en-GB, es, fr, it, pt-BR, ja, zh, hi)
Fix a hang in KokoroTTS.stream() where splitter.close() was never called, leaving the async iterator blocked forever after all chunks were pushed
Uncomment all non-English voices in voices.js (Japanese, Chinese, Spanish, French, Hindi, Italian, Portuguese)
Redesign the browser demo with a grouped voice selector, waveform visualizer, and per-language example texts

Changes

src/phonemize.js

Rewrite around createEphone from the ephone package
Language packs (en_us, en_all, roa, jpx, sit) are static top-level imports; the large "all" pack that Hindi needs is loaded lazily, on first use
normalize_text now takes an english flag — number/currency/abbreviation normalization is skipped for non-English input
English post-processing (r → ɹ, kokoro pronunciation fix) is gated behind isEnglish so it doesn't corrupt Romance-language phonemes
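
The two ideas in this file, loading a large pack only once on first use and applying English-only fixes behind a language check, can be sketched roughly as below. This is an illustration under assumptions, not the code in this PR: `loadPackOnce`, `postProcess`, and the `ENGLISH` set are hypothetical names, and the loader passed in stands in for whatever dynamic import the real module performs.

```javascript
// Hypothetical sketch; none of these names come from the PR's source.
const ENGLISH = new Set(["en-us", "en-gb"]);

// Cache of pack loads keyed by pack name, so a large pack is fetched at most once.
const packCache = new Map();

function loadPackOnce(name, loader) {
  // Store the promise itself so concurrent callers share one in-flight load.
  if (!packCache.has(name)) {
    packCache.set(name, loader());
  }
  return packCache.get(name);
}

function postProcess(phonemes, language) {
  const isEnglish = ENGLISH.has(language.toLowerCase());
  // Non-English output is returned as-is apart from trimming.
  if (!isEnglish) return phonemes.trim();
  // English-only substitution: plain "r" becomes "ɹ".
  return phonemes.replace(/r/g, "ɹ").trim();
}
```

Caching the promise rather than the resolved value means a second request that arrives while the first is still loading does not start a second download. A production version would probably also evict the entry if the load rejects.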

src/kokoro.js

Add missing splitter.close() call after splitter.push(...chunks) — without this the TextSplitterStream async iterator never resolves
Await this.tokenizer(...) (was missing await, causing silent failures)
Extend _validate_voice language type annotation to cover all 9 language codes
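
To see why the missing close() call matters, here is a minimal push-based stream with the same general shape. It is a stand-in written for illustration: `PushStream` and `collect` are hypothetical and do not reproduce the real TextSplitterStream. The consumer loop can only end once the producer signals that no more items will arrive.

```javascript
// Hypothetical stand-in for a push-based text stream; NOT the real TextSplitterStream.
class PushStream {
  constructor() {
    this.queue = [];
    this.closed = false;
    this.wake = null; // resolver for a consumer that is waiting for data
  }

  push(...items) {
    this.queue.push(...items);
    this.notify();
  }

  close() {
    this.closed = true;
    this.notify();
  }

  notify() {
    if (this.wake) {
      this.wake();
      this.wake = null;
    }
  }

  async *[Symbol.asyncIterator]() {
    while (true) {
      // Drain everything that has been pushed so far.
      while (this.queue.length > 0) yield this.queue.shift();
      // Only a closed stream lets the iterator finish.
      if (this.closed) return;
      // Otherwise wait until push() or close() wakes us up.
      await new Promise((resolve) => (this.wake = resolve));
    }
  }
}

async function collect(chunks) {
  const stream = new PushStream();
  stream.push(...chunks);
  stream.close(); // omit this and the for-await loop below never completes
  const out = [];
  for await (const chunk of stream) out.push(chunk);
  return out;
}
```

With close() removed from `collect`, the iterator drains the queue and then waits on a promise that nothing will ever resolve, which matches the symptom described for KokoroTTS.stream().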

src/voices.js

Enable all previously-commented-out non-English voices

rollup.config.js

Switch web build from file: "kokoro.web.js" to dir + entryFileNames/chunkFileNames to support dynamic imports (ephone language packs are code-split chunks)
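
For readers unfamiliar with the Rollup options involved, the shape of such an output block might look like the sketch below. The input path, output directory, and chunk naming pattern are assumptions chosen for illustration; only the option names (dir, entryFileNames, chunkFileNames) come from the description above.

```javascript
// Sketch of a Rollup build entry; paths and name patterns are illustrative assumptions.
const webBuild = {
  input: "src/kokoro.js", // assumed entry point
  output: {
    // Rollup requires `dir` rather than `file` once a build emits multiple chunks,
    // which happens as soon as the code contains dynamic import() calls.
    dir: "dist",
    format: "esm",
    entryFileNames: "kokoro.web.js", // keeps the previous bundle name for the entry
    chunkFileNames: "[name]-[hash].js", // naming for the code-split chunks
  },
};

// In rollup.config.js this object would be exported, e.g. `export default webBuild;`.
```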

demo/

Grouped voice selector with language flags and quality grades
WaveformPlayer component with canvas waveform visualization
Per-language example texts
AnimatePresence loading/result transitions

Test plan

English (af_heart, bf_emma) — verify existing behaviour unchanged
French (ff_siwis) — was completely broken before; should now synthesize correctly
Japanese (jf_alpha) — hiragana/katakana input; kanji not supported by eSpeak-NG
Chinese (zf_xiaoxiao), Spanish (ef_dora), Hindi (hf_alpha), Italian (if_sara), Portuguese (pf_dora)
Run vitest — all phonemize tests should pass

🤖 Generated with Claude Code

Replace the English-only `phonemizer` package with `ephone` v1.0.2, an eSpeak-NG WASM wrapper that ships language packs for 9 languages. Fix a hang in `KokoroTTS.stream()` where `splitter.close()` was never called, leaving the async iterator blocked forever.

- phonemize.js: rewrite around `createEphone`; lazy-load the large Hindi 'all' pack on demand; skip English-specific r→ɹ for romance langs
- kokoro.js: call `splitter.close()` after pushing chunks; await tokenizer
- voices.js: uncomment all non-English voices (ja/zh/es/fr/hi/it/pt-br)
- rollup.config.js: switch web build to dir+chunkFileNames for dynamic imports
- demo: redesign UI with grouped voice selector, waveform player, per-language example texts, and AnimatePresence transitions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Dongshan-git · 2026-04-09T02:14:06Z

Also bumps @huggingface/transformers from ^3.5.1 to ^4.0.1 and dev dependencies (rollup, vitest, typescript, prettier) to their latest major versions.

Dongshan-git wants to merge 1 commit into hexgrad:main from Dongshan-git:feat/multilingual-ephone
