rick12000/vocalance

💡 Overview

Vocalance offers hands-free control of your computer, letting you switch tabs, move around the screen, dictate anywhere, and much more!

🚀 Website

To find out more about what Vocalance can do, including detailed instructions and guides, refer to the official website.

💻 Installation

Vocalance can be set up entirely from the source code in this repository (currently only supported on Windows).

Easy Setup (Recommended)

Important

Ensure Git is installed. If not, download the latest Git for Windows from git-scm.com/download/win.

Important

Ensure Microsoft C++ Build Tools are installed (required for LLM features). If not, download the installer from Microsoft, run it, and tick "Desktop development with C++" under workloads before completing installation.
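As a quick pre-flight check for the Git prerequisite above, a small Python sketch along these lines can confirm that `git` is reachable on your PATH. It is not part of Vocalance; `find_tool` is a hypothetical helper name, and it does not attempt to detect the C++ Build Tools.

```python
import shutil
from typing import Optional


def find_tool(name: str) -> Optional[str]:
    """Return the full path of an executable found on PATH, or None if missing."""
    return shutil.which(name)


if __name__ == "__main__":
    git_path = find_tool("git")
    if git_path is None:
        print("Git was not found on PATH; install Git for Windows first.")
    else:
        print(f"Git found at {git_path}")
```

If Git was installed a moment ago, open a new terminal before running the check so the updated PATH is picked up.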

To get started with installation, either follow the steps below or watch the installation walkthrough on YouTube.

  1. Open PowerShell (from Windows Start Menu).

  2. Paste and run:

    Invoke-WebRequest -Uri "https://raw.githubusercontent.com/rick12000/vocalance/main/scripts/bootstrapping/setup.ps1" -OutFile "vocalance-setup.ps1"; powershell -ExecutionPolicy Bypass -File .\vocalance-setup.ps1

    If you'd like to inspect what the script will do before running it, view scripts/bootstrapping/setup.ps1 in this repository.

  3. Open Vocalance from the Start Menu (search "vocalance" if not featured):

[Screenshot: Vocalance shortcut in the Windows Start menu under "Recently added"]

  • If the application doesn't appear immediately after you click it, wait 10-15 seconds before retrying, as it may still be loading in the background.
  • On first use, the application needs to download your local AI model from a trusted Hugging Face repository (among other essential downloads). Do not close the startup window during this process. For most users this takes around 5 minutes, but allow up to 30 minutes on a slow internet connection.

Then you're good to go! If you haven't already, refer to Vocalance's official website for instructions on how everything works.

Having issues with the installation steps? Reach out at: vocalance.contact@gmail.com


🛠️ Developer Setup

Important

Ensure Microsoft C++ Build Tools are installed (required for LLM features). If not, download the installer from Microsoft, run it, and tick "Desktop development with C++" under workloads before completing installation.

1. Set Up UV

  1. Open Windows PowerShell and enter the script below to install UV (Python package manager):

    powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
  2. Add UV to your PATH. This change applies to the current terminal session only, so either repeat this step in each new session or add the folder to your permanent PATH:

    $env:Path = "$HOME\.local\bin;$env:Path"
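To confirm that the PATH change took effect in the current session, a sketch like the following can check whether the UV install folder is listed. It assumes UV was installed to `.local\bin` under your home directory, as in the step above; `is_on_path` is a hypothetical helper, not something UV or Vocalance provides.

```python
import os
from pathlib import Path


def is_on_path(directory: str, path_value: str, sep: str = os.pathsep) -> bool:
    """Return True if `directory` is one of the entries in a PATH-style string."""
    target = os.path.normcase(os.path.normpath(directory))
    entries = [entry for entry in path_value.split(sep) if entry]
    return any(os.path.normcase(os.path.normpath(e)) == target for e in entries)


if __name__ == "__main__":
    # Assumed install location, taken from the PowerShell step above.
    uv_dir = str(Path.home() / ".local" / "bin")
    found = is_on_path(uv_dir, os.environ.get("PATH", ""))
    print(f"{uv_dir} on PATH: {found}")
```

Paths are normalised before comparison, so a trailing separator or (on Windows) a difference in letter case does not cause a false negative.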

2. Set Up Vocalance

  1. Create a Python 3.13.9 virtual environment named vocalance_env with UV:

    uv venv --python 3.13.9 vocalance_env
  2. Activate the environment:

    vocalance_env\Scripts\activate
  3. Clone the repository:

    git clone https://github.com/rick12000/vocalance.git
  4. Go to the repository directory:

    cd vocalance
  5. Install Vocalance from uv.lock:

    uv sync --active
  6. Run the application:

    python vocalance.py

On first run, the application will start up and download any required models, such as speech recognition models, from Hugging Face or other reputable hosts. This may take several minutes depending on your internet connection.
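If the application fails to start, one thing worth ruling out is that the wrong interpreter is active. The sketch below checks that Python is running inside a virtual environment and that its major and minor version match the 3.13 series used in the steps above. The helper names are hypothetical and the check is an illustration, not something Vocalance runs itself.

```python
import sys

# Major and minor version from the `uv venv --python 3.13.9` step.
EXPECTED = (3, 13)


def in_virtualenv(prefix: str = sys.prefix, base_prefix: str = sys.base_prefix) -> bool:
    """Inside a virtual environment, sys.prefix differs from sys.base_prefix."""
    return prefix != base_prefix


def version_matches(version=sys.version_info, expected=EXPECTED) -> bool:
    """Compare only the major and minor components of the version."""
    return tuple(version[:2]) == tuple(expected)


if __name__ == "__main__":
    print(f"Virtual environment active: {in_virtualenv()}")
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} "
          f"matches expected {EXPECTED[0]}.{EXPECTED[1]}: {version_matches()}")
```

Run it with the same `python` command you use to launch `vocalance.py`, after activating `vocalance_env`.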

Then you're good to go! If you haven't already, refer to Vocalance's official website for instructions on how to get started.

An Aside on Pip

The recommended approach is to install Vocalance with uv, since the developers freeze and document all recommended dependencies in a uv.lock file, which you then install with uv sync --active.

If you're more familiar with combining a virtual environment manager (e.g. venv, conda, or pyenv) with pip, you can replace the uv steps above with your environment manager, and replace uv sync --active with pip install . to install Vocalance as a package. Note that this is at your own discretion, and that the license disclosures in this repository pertain to the pinned package versions in uv.lock.

🧹 Cleanup

To remove Vocalance after installing:

  • Repository and environment: Delete vocalance-prod and vocalance_env folders from your installation directory (you chose this directory when you ran setup.ps1)
  • User data and shortcut: Run scripts/bootstrapping/cleanup.ps1 to remove your personal data and Start Menu shortcut

The cleanup script removes all Vocalance user data stored in %APPDATA%\vocalance_voice_assistant_data\ (settings, downloaded models, custom commands, aliases) and the Start Menu shortcut. It does not remove third-party package caches or system-level tools.
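Before running the cleanup script, you may want to see how much user data is stored. The read-only sketch below resolves the data folder named above from the `APPDATA` environment variable and reports its size. It deletes nothing, and the helper names are hypothetical rather than part of Vocalance.

```python
import os
from pathlib import Path
from typing import Optional

# Folder name taken from the cleanup description above.
DATA_DIR_NAME = "vocalance_voice_assistant_data"


def user_data_dir(appdata: Optional[str] = None) -> Optional[Path]:
    """Build the user data path from APPDATA, or return None if APPDATA is unset."""
    base = appdata if appdata is not None else os.environ.get("APPDATA")
    return Path(base) / DATA_DIR_NAME if base else None


def dir_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files below `root`."""
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    data_dir = user_data_dir()
    if data_dir is None or not data_dir.is_dir():
        print("No Vocalance user data folder found.")
    else:
        size_mib = dir_size_bytes(data_dir) / 1024**2
        print(f"{data_dir}: {size_mib:.1f} MiB")
```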

⚠️ Disclaimers

Vocalance is distributed under the GPLv3 license. It uses your microphone and, at startup, downloads required assets (such as AI models or text-to-speech models) from the internet. For a full set of disclaimers and usage warnings, refer to the disclaimer notes.

🔧 System Requirements

  • Operating System: Windows 10/11 (macOS and Linux support planned)
  • RAM: 2 GB
  • Disk: 5 GB
  • Hardware: A reasonably good headset or microphone is strongly recommended to improve recognition and output quality, but Vocalance will still work without one.
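The disk requirement can be checked ahead of installation with a short sketch like this one. It treats 5 GB as 5 GiB for simplicity and covers only free disk space; RAM is left out because the Python standard library has no portable call for it. The function names are hypothetical.

```python
import shutil

# Disk requirement from the list above, interpreted here as GiB.
REQUIRED_DISK_GB = 5


def free_disk_gb(path: str = ".") -> float:
    """Return the free space, in GiB, on the drive that contains `path`."""
    return shutil.disk_usage(path).free / 1024**3


def has_enough_disk(free_gb: float, required_gb: float = REQUIRED_DISK_GB) -> bool:
    """Return True if the free space meets the stated requirement."""
    return free_gb >= required_gb


if __name__ == "__main__":
    free = free_disk_gb(".")
    print(f"Free space: {free:.1f} GiB, sufficient: {has_enough_disk(free)}")
```

Run it from the directory where you plan to install Vocalance so the correct drive is measured.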

🤝 Contributing

Reach out at vocalance.contact@gmail.com with the subject line "Contribution" if:

  • You have software engineering experience and have feedback on how the architecture of the application could be improved.
  • You want to add an original or pre-approved feature.

For now, contributions will be handled on an ad-hoc basis. Contribution guidelines may be set up in the future, depending on the number of contributors.

📚 Technical Documentation

If you want to find out more about Vocalance's architecture, refer to the technical documentation on Read the Docs:

  • Overview — End-to-end architecture, the audio pipeline, and how capture, command flow, and dictation relate
  • Capture — Microphone capture, AudioCaptureService, and how audio reaches the rest of the app
  • Command flow — Segmenting, recognition, parsing, and executing voice commands as OS actions
  • Dictation flow — Long-running dictation sessions, recognizers, and typing pipeline
  • User interface — How pipeline events reach the screen and how UI input returns to the bus
  • Event bus — Publish/subscribe model, dispatch, and how services stay decoupled
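To illustrate the publish/subscribe idea mentioned in the Event bus entry, here is a minimal, generic sketch. It is not Vocalance's actual implementation: the `EventBus` class, its methods, and the topic name are all hypothetical, and dispatch is synchronous here, whereas the real application may dispatch differently.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Handler = Callable[[Any], None]


class EventBus:
    """Toy publish/subscribe bus: publishers and subscribers share only topic names."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler to be called for every event on `topic`."""
        self._handlers[topic].append(handler)

    def publish(self, topic: str, payload: Any) -> int:
        """Call each handler for `topic` in order; return how many were called."""
        handlers = list(self._handlers.get(topic, []))
        for handler in handlers:
            handler(payload)
        return len(handlers)


if __name__ == "__main__":
    bus = EventBus()
    # Hypothetical topic name, used purely for illustration.
    bus.subscribe("command.recognized", lambda text: print(f"UI shows: {text}"))
    bus.publish("command.recognized", "next tab")
```

The publisher never references the subscriber directly, which is the sense in which services on such a bus stay decoupled.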

📈 Upcoming Features

The following features are planned additions to Vocalance, with some in early development and others under consideration:

  • Eye Tracking for Cursor Control: This feature is planned to enable cursor control via eye movements.

    • Gaze Tracking Accuracy: Merge gaze tracking with historical screen click data and screen contents to improve accuracy, aiming for good performance even with webcam tracking.
    • Zoom Option: Add a zoom option to better direct gaze on screen contents.
  • Context-Aware Commands: Implement context bucketing for commands, allowing the same command phrase (e.g., "previous") to map to different hotkeys depending on the active application (e.g., VSCode vs. Chrome). The aim is to remove the need for disambiguation phrases.

  • Improved Text Editing & Navigation: Further enhancements to text editing and text navigation tools.

  • Enhanced Predictive Features: Improve predictive capabilities based on window contents, recent context, gaze patterns, and more.

    • Privacy Note: Any feature requiring local storage of potentially sensitive data (e.g., screenshots, window contents) will be deployed as an opt-in feature and disabled by default.
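The context bucketing idea in the Context-Aware Commands item above could look roughly like the sketch below. The feature is still only planned, so everything here is an assumption made for illustration: the process names, the hotkey strings, and the `resolve_hotkey` function are hypothetical and not Vocalance's design.

```python
from typing import Dict, Optional

DEFAULT_CONTEXT = "*"  # fallback bucket used when the active app has no entry

# Hypothetical mapping: phrase -> {application -> hotkey}.
COMMAND_MAP: Dict[str, Dict[str, str]] = {
    "previous": {
        "code.exe": "alt+left",
        "chrome.exe": "ctrl+shift+tab",
        DEFAULT_CONTEXT: "alt+left",
    },
}


def resolve_hotkey(
    phrase: str,
    active_app: str,
    command_map: Dict[str, Dict[str, str]] = COMMAND_MAP,
) -> Optional[str]:
    """Pick the hotkey for `phrase` in the bucket of the active application."""
    buckets = command_map.get(phrase.lower())
    if buckets is None:
        return None  # unknown phrase
    return buckets.get(active_app.lower(), buckets.get(DEFAULT_CONTEXT))


if __name__ == "__main__":
    for app in ("Code.exe", "chrome.exe", "notepad.exe"):
        print(app, "->", resolve_hotkey("previous", app))
```

With a default bucket in place, one spoken phrase still resolves to a hotkey in applications that have no dedicated entry.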

About

Accessibility software with on-device processing for users with limited mobility.
