Building pywhispercpp with CUDA on Windows: A Complete Troubleshooting Guide
This guide documents every step, failure, and fix encountered while building
pywhispercpp from source with CUDA
support on Windows 10 with an NVIDIA RTX 4090.
Environment
| Component | Version / Path |
| --- | --- |
| OS | Windows 10 Pro 10.0.19045 |
| GPU | NVIDIA RTX 4090 |
| Python | 3.12 (virtual environment at D:\Python_Programs\bench_STT_whispercpp\) |
| CUDA Toolkit | 12.8, installed at C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8 |
| Visual Studio | 2022 Build Tools at C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools |
| CMake | 4.2 |
| PyTorch | 2.9.0+cu128 |
Why Build from Source?
pywhispercpp does not publish prebuilt CUDA wheels. The PyPI wheels are
CPU-only. To get GPU-accelerated inference via whisper.cpp's CUDA backend, you
must build the C++ extension from source with GGML_CUDA=1.
Step-by-Step Build Process (What Finally Worked)
1. Clone and patch setup.py
git clone https://github.com/absadiki/pywhispercpp.git _build_pywhispercpp
cd _build_pywhispercpp
Critical patch: pywhispercpp's setup.py (around lines 153-154) contains
code that dumps every environment variable as a CMake -D flag:
# REMOVE THESE LINES from setup.py:
for key, value in os.environ.items():
    cmake_args.append(f'-D{key}={value}')
This causes CMake to choke on Windows environment variables that contain spaces,
semicolons, parentheses, and other special characters (e.g., ProgramFiles(x86),
PATH with hundreds of entries, etc.). Delete or comment out these two lines.
2. Set environment variables and build
Create a batch file (e.g., _build.bat) with:
@echo off
set CMAKE_GENERATOR=Visual Studio 17 2022
set CMAKE_ARGS=-DGGML_CUDA=on
set GGML_CUDA=1
set FORCE_CMAKE=1
set NO_REPAIR=1
pip install . --no-build-isolation --no-cache-dir
Key variables explained:
| Variable | Why It's Needed |
| --- | --- |
| CMAKE_GENERATOR=Visual Studio 17 2022 | CMake's default generator cannot find the CUDA VS integration. The VS 2022 generator has proper CUDA toolkit integration via the BuildTools installation. |
| CMAKE_ARGS=-DGGML_CUDA=on | Tells the whisper.cpp CMake build to enable the CUDA backend. |
| GGML_CUDA=1 | Some code paths in setup.py also check this variable. |
| FORCE_CMAKE=1 | Forces a CMake-based build instead of any fallback. |
| NO_REPAIR=1 | Skips the repairwheel step, which fails on Windows (see below). |
Run the batch file from a regular command prompt (not from inside vcvarsall
— the VS 2022 generator handles MSVC detection on its own).
3. Copy the dependent DLLs
After the wheel installs, the .pyd extension module (_pywhispercpp.pyd) is
placed in Lib/site-packages/, but the shared libraries it depends on are left
inside the build tree. You must manually copy them to site-packages/:
- ggml.dll
- ggml-base.dll
- ggml-cpu.dll
- ggml-cuda.dll
- whisper.dll
Find them inside the build artifacts (typically under
_build_pywhispercpp/build/ or the pip temp build directory) and copy to:
<venv>/Lib/site-packages/
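The hunt-and-copy can be scripted. The sketch below is illustrative, not part of pywhispercpp; it assumes the DLLs sit somewhere under the given build root and that overwriting existing copies is fine:

```python
import shutil
from pathlib import Path

# The five runtime DLLs listed in Step 3.
WANTED = {"ggml.dll", "ggml-base.dll", "ggml-cpu.dll",
          "ggml-cuda.dll", "whisper.dll"}

def copy_runtime_dlls(build_root, site_packages):
    """Search the build tree for whisper.cpp runtime DLLs and copy them
    next to _pywhispercpp.pyd in site-packages. Returns the names copied."""
    copied = set()
    for dll in Path(build_root).rglob("*.dll"):
        if dll.name in WANTED:
            shutil.copy2(dll, Path(site_packages) / dll.name)
            copied.add(dll.name)
    return sorted(copied)
```

Run it once after the pip install, pointing it at the build tree and your venv's site-packages directory; if it returns fewer than five names, the build did not produce all the DLLs.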
4. Configure DLL search paths at runtime
Even with the DLLs in site-packages, Python on Windows won't find them unless
you explicitly add the directories to both PATH and os.add_dll_directory().
This must happen before import _pywhispercpp.
The function below handles this (place it at the top of any script that uses
pywhispercpp, and call it before any imports):
import sys
import os
import platform
from pathlib import Path
def set_cuda_paths():
    if platform.system() != "Windows":
        return

    venv_base = Path(sys.executable).parent.parent
    nvidia_base = venv_base / "Lib" / "site-packages" / "nvidia"
    site_packages = venv_base / "Lib" / "site-packages"

    paths_to_add = [
        site_packages,  # whisper.cpp DLLs (ggml-cuda.dll, whisper.dll, etc.)
    ]
    if nvidia_base.exists():
        paths_to_add += [
            nvidia_base / "cuda_runtime" / "bin",
            nvidia_base / "cuda_runtime" / "lib" / "x64",
            nvidia_base / "cuda_runtime" / "include",
            nvidia_base / "cublas" / "bin",
            nvidia_base / "cudnn" / "bin",
            nvidia_base / "cuda_nvrtc" / "bin",
            nvidia_base / "cuda_nvcc" / "bin",
        ]

    # System CUDA toolkit
    cuda_path = os.environ.get(
        "CUDA_PATH",
        r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8",
    )
    cuda_bin = Path(cuda_path) / "bin"
    if cuda_bin.exists():
        paths_to_add.append(cuda_bin)

    current_value = os.environ.get("PATH", "")
    new_value = os.pathsep.join(
        [str(p) for p in paths_to_add]
        + ([current_value] if current_value else [])
    )
    os.environ["PATH"] = new_value

    if nvidia_base.exists():
        triton_cuda_path = nvidia_base / "cuda_runtime"
        current_cuda_path = os.environ.get("CUDA_PATH", "")
        new_cuda_path = os.pathsep.join(
            [str(triton_cuda_path)]
            + ([current_cuda_path] if current_cuda_path else [])
        )
        os.environ["CUDA_PATH"] = new_cuda_path

    if hasattr(os, "add_dll_directory"):
        for path in paths_to_add:
            if Path(path).exists():
                try:
                    os.add_dll_directory(str(path))
                except OSError:
                    pass

set_cuda_paths()
Why both PATH and add_dll_directory?
- os.add_dll_directory() (Python 3.8+) is required because Python 3.8 changed
DLL search behavior on Windows — it no longer searches PATH by default for
extension module dependencies.
- Prepending to PATH is still needed because some DLLs loaded by the CUDA
runtime itself (e.g., cublas64_*.dll) use the legacy LoadLibrary search, which
does check PATH.
Every Error Encountered (and How It Was Fixed)
Error 1: CMake dumps all env vars as -D flags
Symptom:
CMake Error: cmake -DProgramFiles(x86)=C:\Program Files (x86) ...
CMake crashes with parse errors on Windows environment variables containing
spaces, parentheses, and special characters.
Cause: setup.py lines 153-154 iterate over os.environ.items() and pass
every single variable as a -D flag to CMake.
Fix: Delete or comment out these two lines in setup.py:
for key, value in os.environ.items():
    cmake_args.append(f'-D{key}={value}')
Error 2: No CUDA toolset found (default CMake generator)
Symptom:
CMake Error: No CUDA toolset found.
Cause: CMake 4.2's default generator (Ninja or NMake) doesn't know where the
CUDA VS integration files are located. The CUDA Toolkit installs VS integration
files specifically for the Visual Studio generators.
Fix: Explicitly set the generator:
set CMAKE_GENERATOR=Visual Studio 17 2022
The Visual Studio 2022 generator has built-in support for finding the CUDA
toolkit's VS integration (installed at
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\extras\visual_studio_integration).
Error 3: Ninja generator + vcvarsall.bat fails
Symptom (when trying Ninja instead of VS generator):
vcvarsall.bat: The input line is too long.
or various path-escaping errors when trying to source vcvarsall.bat from a bash
shell via cmd //c.
Cause: The BuildTools vcvarsall.bat has trouble when invoked from certain
shell environments, and the extremely long PATH on the system exceeds cmd.exe's
line length limits.
Fix: Don't use Ninja. Use Visual Studio 17 2022 generator instead — it
doesn't need vcvarsall.bat because it invokes MSBuild directly, which handles
the compiler environment internally.
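cmd.exe caps a command line at 8191 characters, so you can check in advance whether your PATH alone is likely to trip vcvarsall.bat. A rough diagnostic sketch (the halving heuristic is an assumption, since vcvarsall prepends entries of its own):

```python
import os

CMD_LINE_LIMIT = 8191  # documented cmd.exe maximum command-line length

def path_overflow_risk(path_value, limit=CMD_LINE_LIMIT):
    """Heuristic: vcvarsall.bat re-expands PATH with its own additions,
    so a PATH already past half the limit risks
    'The input line is too long'."""
    return len(path_value) > limit // 2

print(path_overflow_risk(os.environ.get("PATH", "")))
```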
Error 4: repairwheel "[WinError 2] The system cannot find the file specified"
Symptom:
[WinError 2] The system cannot find the file specified
during the repairwheel post-build step.
Cause: repairwheel (or delvewheel) is not installed or not on PATH in
the build environment. pywhispercpp's build tries to run it to bundle DLLs into
the wheel, but the tool is missing.
Fix: Skip the repair step entirely by setting NO_REPAIR=1 before running pip install (as in the batch file above).
Then manually copy the DLLs yourself (Step 3 above). This is actually more
reliable on Windows because repairwheel sometimes misses CUDA-specific DLLs
anyway.
Error 5: ImportError: DLL load failed while importing _pywhispercpp
Symptom:
ImportError: DLL load failed while importing _pywhispercpp:
The specified module could not be found.
Cause: _pywhispercpp.pyd depends on whisper.dll, ggml.dll,
ggml-cuda.dll, etc. Even though these files exist, Python 3.8+ on Windows
does not search PATH or the current directory for DLL dependencies of extension
modules. You must explicitly register DLL directories.
Fix: Call os.add_dll_directory() for every directory containing required
DLLs before importing _pywhispercpp. The three critical directories are:
- <venv>/Lib/site-packages/ — where whisper.dll and ggml-*.dll live
- <nvidia packages>/cuda_runtime/bin/ — CUDA runtime DLLs from pip-installed nvidia packages
- C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin\ — system CUDA toolkit DLLs
See the set_cuda_paths() function in Step 4 above.
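A quick pre-import check saves guesswork when this error appears. A sketch using the DLL list from Step 3:

```python
from pathlib import Path

REQUIRED_DLLS = ("ggml.dll", "ggml-base.dll", "ggml-cpu.dll",
                 "ggml-cuda.dll", "whisper.dll")

def missing_dlls(directory):
    """Return the whisper.cpp runtime DLLs absent from `directory`."""
    d = Path(directory)
    return [name for name in REQUIRED_DLLS if not (d / name).is_file()]
```

Call it on your site-packages directory before importing; a non-empty result means Step 3 is incomplete.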
Error 6: from pywhispercpp import _pywhispercpp fails
Symptom:
ImportError: cannot import name '_pywhispercpp' from 'pywhispercpp'
Cause: _pywhispercpp is a top-level Python extension module (compiled
as _pywhispercpp.pyd in site-packages), not a submodule of the pywhispercpp
package.
Fix: Import it as a top-level module:
# Wrong:
from pywhispercpp import _pywhispercpp
# Correct:
import _pywhispercpp as pw
In practice you rarely need to import it directly — the pywhispercpp.model.Model
class wraps it.
Error 7: 'whisper_full_params' object has no attribute 'use_gpu'
Symptom:
AttributeError: '_pywhispercpp.whisper_full_params' object has no attribute 'use_gpu'
Cause: Unlike some other Python Whisper bindings, whisper.cpp does not
have a runtime use_gpu toggle. GPU usage is determined entirely at compile
time. If the library was built with GGML_CUDA=on, it uses the GPU
automatically. There is no way to disable it at runtime via the Python API.
Fix: Remove any use_gpu=True arguments from Model() or
model.transcribe() calls. The --device flag in the benchmark script only
controls whether VRAM monitoring is enabled, not whether the GPU is used for
inference.
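To confirm at runtime that the binary really was compiled with CUDA, you can parse the output of whisper_print_system_info() for the CUDA flag. A sketch, assuming the NAME = VALUE pairs shown in the verification section:

```python
import re

def parse_system_info(info):
    """Turn 'NAME = VALUE' pairs (pipe- or newline-separated) into a dict."""
    return {m.group(1): m.group(2)
            for m in re.finditer(r"(\w+)\s*=\s*([^\s|]+)", info)}

def built_with_cuda(info):
    return parse_system_info(info).get("CUDA") == "1"
```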
Error 8: 'whisper_full_params' object has no attribute 'beam_size'
Symptom:
AttributeError: '_pywhispercpp.whisper_full_params' object has no attribute 'beam_size'
Cause: beam_size is not a top-level attribute of whisper_full_params.
It's nested inside the beam_search sub-dictionary. The params struct has two
nested strategy configs:
params.greedy → {'best_of': -1}
params.beam_search → {'beam_size': 5, 'patience': -1.0}
Fix: Set it via the nested dict after constructing the model:
# First, tell the Model to use the beam search sampling strategy
model = Model(
    model_path,
    params_sampling_strategy=1,  # 0 = greedy, 1 = beam search
    n_threads=4,
)
# Then set beam_size on the nested dict
model._params.beam_search['beam_size'] = beam_size
Error 9: Audio file rejected (wrong sample rate)
Symptom:
Exception: WAV file must be 16000 Hz
Cause: whisper.cpp (and the original Whisper model) requires 16 kHz mono
audio. pywhispercpp's built-in WAV loader strictly enforces this — it does not
resample. Non-WAV files (FLAC, MP3, etc.) are converted via ffmpeg, which
handles resampling automatically.
Fix: Either:
- Use a non-WAV format and ensure ffmpeg is installed (pywhispercpp will convert
it automatically to 16kHz WAV via ffmpeg)
- Pre-convert your WAV files:
ffmpeg -i input.wav -ac 1 -ar 16000 output_16k.wav -y
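You can verify a file against the 16 kHz / mono / 16-bit constraint up front with the standard-library wave module. A sketch that mirrors the requirement described above:

```python
import wave

def check_wav_for_whisper(path):
    """Return (ok, reason) for whisper.cpp's 16 kHz mono 16-bit WAV rule."""
    with wave.open(path, "rb") as w:
        if w.getframerate() != 16000:
            return False, f"sample rate {w.getframerate()} Hz, need 16000"
        if w.getnchannels() != 1:
            return False, f"{w.getnchannels()} channels, need mono"
        if w.getsampwidth() != 2:
            return False, f"{8 * w.getsampwidth()}-bit samples, need 16-bit"
    return True, "ok"
```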
Summary: The Minimal Working Recipe
# 1. Clone
git clone https://github.com/absadiki/pywhispercpp.git _build_pywhispercpp
cd _build_pywhispercpp
# 2. Patch setup.py — remove the env-var-dumping lines (around line 153-154):
# for key, value in os.environ.items():
# cmake_args.append(f'-D{key}={value}')
# 3. Build (from a regular cmd.exe prompt)
set CMAKE_GENERATOR=Visual Studio 17 2022
set CMAKE_ARGS=-DGGML_CUDA=on
set GGML_CUDA=1
set FORCE_CMAKE=1
set NO_REPAIR=1
pip install . --no-build-isolation --no-cache-dir
# 4. Copy DLLs from the build tree to site-packages:
# ggml.dll, ggml-base.dll, ggml-cpu.dll, ggml-cuda.dll, whisper.dll
# 5. In your Python scripts, call set_cuda_paths() BEFORE importing pywhispercpp
Verifying the Build
Run these checks to confirm everything works:
# Smoke test — imports and system info
python -c "
import os, sys, platform
from pathlib import Path
# (call set_cuda_paths() here)
import _pywhispercpp as pw
print(pw.whisper_print_system_info())
"
# Should print system info including CUDA-related flags like:
# CUDA = 1
# COREML = 0
# OPENVINO = 0
# ...
# Inference test with tiny model
python test_inference.py --model tiny.en
# Full benchmark
python bench_whispercpp.py --model tiny.en --audio your_file.wav
API Gotchas Reference
| Pitfall | Details |
| --- | --- |
| No use_gpu parameter | GPU is always on if built with CUDA. No runtime toggle. |
| beam_size is nested | Access via model._params.beam_search['beam_size'], not as a flat param. |
| _pywhispercpp is top-level | import _pywhispercpp, not from pywhispercpp import _pywhispercpp. |
| WAV must be 16kHz 16-bit | Use ffmpeg to convert, or use non-WAV formats (auto-converted). |
| DLL paths on Windows | Must call os.add_dll_directory() before importing the extension. |
| params_sampling_strategy | 0 = greedy, anything else = beam search. Set in the Model() constructor. |