Releases: mrtozner/vox
v0.6.0
Full Changelog: v0.4.1...v0.6.0
v0.4.1
Full Changelog: v0.4.0...v0.4.1
v0.4.0
Full Changelog: v0.3.0...v0.4.0
v0.3.0 — Streaming STT, 55 Voices, Windows Support
What's New
Streaming Speech-to-Text
Live partial transcriptions while you speak. The WebSocket /v1/listen endpoint now sends partial events every ~1s during speech, so you see words appearing in real time before the final transcription.
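A client has to decide how partial events and the final transcription combine on screen. The sketch below shows one way to do that. The release notes do not specify the wire format, so the ListenEvent type, its variant names, and the assumption that each partial carries the whole utterance so far (rather than a delta) are all hypothetical.

```rust
/// Hypothetical event model. The release notes only say that partial events
/// arrive during speech and are followed by a final transcription; the
/// variant names and payloads here are assumptions, not the vox wire format.
#[derive(Debug, Clone, PartialEq)]
enum ListenEvent {
    /// Best guess so far for the utterance in progress (assumed to be the
    /// full text so far, not a delta).
    Partial(String),
    /// Finished transcription for the utterance.
    Final(String),
}

/// Display state for a live transcript.
#[derive(Debug, Default)]
struct Transcript {
    /// Utterances that have been finalized.
    committed: Vec<String>,
    /// Latest partial text for the utterance still being spoken, if any.
    pending: Option<String>,
}

impl Transcript {
    /// Applies one event: a partial overwrites the pending text, a final
    /// commits the utterance and clears the pending text.
    fn apply(&mut self, event: ListenEvent) {
        match event {
            ListenEvent::Partial(text) => self.pending = Some(text),
            ListenEvent::Final(text) => {
                self.committed.push(text);
                self.pending = None;
            }
        }
    }

    /// Renders committed utterances followed by the pending partial.
    fn render(&self) -> String {
        let mut parts: Vec<&str> = self.committed.iter().map(String::as_str).collect();
        if let Some(pending) = &self.pending {
            parts.push(pending.as_str());
        }
        parts.join(" ")
    }
}
```

A caller would decode each WebSocket message into a ListenEvent, pass it to apply, and redraw from render. Overwriting the pending text instead of appending to it means a later partial can correct an earlier guess.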
55 TTS Voices Across 9 Languages
Kokoro TTS expanded from 21 English voices to 55 voices across 9 languages:
- English (American + British)
- Japanese, Chinese, Spanish, French
- Hindi, Italian, Portuguese
New /v1/voices endpoint lists all available voices dynamically.
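With 55 voices, a client will probably want to group the list by language. The sketch below assumes the endpoint returns voice IDs that follow the upstream Kokoro naming convention, where the first letter encodes the language (for example af_heart or bf_emma). The release notes do not confirm that vox exposes IDs in this form, and the helpers language_for and group_by_language are hypothetical.

```rust
use std::collections::BTreeMap;

/// Maps the first character of a Kokoro-style voice ID to a language label.
/// Assumption: vox returns IDs that keep the upstream Kokoro prefixes.
fn language_for(voice_id: &str) -> &'static str {
    match voice_id.chars().next() {
        Some('a') => "American English",
        Some('b') => "British English",
        Some('j') => "Japanese",
        Some('z') => "Chinese",
        Some('e') => "Spanish",
        Some('f') => "French",
        Some('h') => "Hindi",
        Some('i') => "Italian",
        Some('p') => "Portuguese",
        _ => "Unknown",
    }
}

/// Groups voice IDs by language, preserving input order within each group.
/// BTreeMap keeps the language labels sorted for stable display.
fn group_by_language<'a>(voice_ids: &[&'a str]) -> BTreeMap<&'static str, Vec<&'a str>> {
    let mut groups: BTreeMap<&'static str, Vec<&'a str>> = BTreeMap::new();
    for id in voice_ids {
        groups.entry(language_for(id)).or_default().push(*id);
    }
    groups
}
```

Unrecognized prefixes go into an "Unknown" group instead of being dropped, so voices from a different TTS backend still show up in the listing.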
Windows CI
Full CI testing on Windows. Zero platform-specific code — all abstracted through cross-platform crates (cpal, dirs, tokio, ort).
Server Improvements
- Model details in /v1/models: see loaded model names and sizes
- Ollama connectivity status checking
- Dynamic voice listing: /v1/voices returns voices from whatever TTS backend is loaded
WebUI
- Live partial transcription display during speech
- Audio level meter
- Copy-to-clipboard for transcriptions
- Timestamps on transcriptions
Install
cargo install --git https://github.com/mrtozner/vox --features cli,kokoro