Add local repo loading and strip default inference weight normalization#311

Open
JINZIPING wants to merge 3 commits into hexgrad:main from JINZIPING:main
Conversation

@JINZIPING
Title

Reduce inference GPU memory and add optional local model loading

Description

This PR reduces inference GPU memory usage by stripping weight_norm parametrizations after model weights are loaded. It also adds an optional parameter for loading model files and voices from a local repo-shaped directory instead of downloading them from Hugging Face.

What changed:

  • strip weight_norm by default in inference mode after loading weights
  • keep an escape hatch with for_training=True
  • add local_repo_dir to load config.json, model weights, and voices/ from a local directory
  • keep Hugging Face download as the default behavior when local_repo_dir is not provided

Why:

  • weight_norm is useful for training, but at inference it keeps redundant weight_g/weight_v parameters and recomputes the weight tensor on every forward pass
  • some deployments already have model assets on disk and do not want to depend on download-time resolution

Result on my setup:

  • GPU memory reduced by about 28% (1016 MB -> 724 MB in my test)
