Releases: mrtozner/vox

v0.6.0

12 Apr 15:20

v0.4.1

16 Feb 22:41

Full Changelog: v0.4.0...v0.4.1

v0.4.0

16 Feb 16:38

Full Changelog: v0.3.0...v0.4.0

v0.3.0 — Streaming STT, 55 Voices, Windows Support

15 Feb 22:35

What's New

Streaming Speech-to-Text

Live partial transcriptions while you speak. The WebSocket /v1/listen endpoint now sends partial events roughly once per second during speech, so words appear in real time before the final transcription arrives.
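
As a rough illustration of how a client might consume this stream, the sketch below keeps a live transcript in which each partial event overwrites the in-progress text and a final event commits it. The `ListenEvent` enum and `LiveTranscript` type are hypothetical names invented for this example; the actual wire format of /v1/listen events is not described in these notes, so decoding WebSocket messages into `ListenEvent` is left out.

```rust
/// Hypothetical event model (an assumption, not vox's real message format):
/// the notes only say that partial events precede a final transcription.
#[derive(Debug, Clone, PartialEq)]
pub enum ListenEvent {
    /// Best guess so far for the utterance currently being spoken.
    Partial(String),
    /// Finished transcription for the utterance.
    Final(String),
}

/// Display state for a live transcript: committed utterances plus
/// at most one in-progress partial.
#[derive(Debug, Default)]
pub struct LiveTranscript {
    committed: Vec<String>,
    pending: Option<String>,
}

impl LiveTranscript {
    pub fn new() -> Self {
        Self::default()
    }

    /// Apply one event: a partial replaces the pending text,
    /// a final commits its text and clears the pending text.
    pub fn apply(&mut self, event: ListenEvent) {
        match event {
            ListenEvent::Partial(text) => self.pending = Some(text),
            ListenEvent::Final(text) => {
                self.committed.push(text);
                self.pending = None;
            }
        }
    }

    /// Text to show right now: committed utterances followed by the
    /// pending partial, joined with single spaces.
    pub fn render(&self) -> String {
        let mut parts: Vec<&str> = self.committed.iter().map(String::as_str).collect();
        if let Some(partial) = &self.pending {
            parts.push(partial.as_str());
        }
        parts.join(" ")
    }
}
```

Replacing the pending text on every partial, rather than appending to it, lets the recognizer revise earlier words as more audio arrives; only the final event is treated as stable.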

55 TTS Voices Across 9 Languages

Kokoro TTS expanded from 21 English voices to 55 voices across 9 languages:

  • English (American + British)
  • Japanese, Chinese, Spanish, French
  • Hindi, Italian, Portuguese

New /v1/voices endpoint lists all available voices dynamically.

Windows CI

Full CI testing on Windows. Zero platform-specific code — all abstracted through cross-platform crates (cpal, dirs, tokio, ort).

Server Improvements

  • Model details in /v1/models — see loaded model names and sizes
  • Ollama connectivity status checking
  • Dynamic voice listing: /v1/voices returns voices from whatever TTS backend is loaded

WebUI

  • Live partial transcription display during speech
  • Audio level meter
  • Copy-to-clipboard for transcriptions
  • Timestamps on transcriptions

Install

cargo install --git https://github.com/mrtozner/vox --features cli,kokoro

Full Changelog

v0.2.0...v0.3.0