Add local repo loading and strip default inference weight normalization#311

Open
JINZIPING wants to merge 3 commits into hexgrad:main from JINZIPING:main
Conversation

@JINZIPING
Title

Reduce inference GPU memory and add optional local model loading

Description

This PR reduces inference GPU memory usage by stripping weight_norm parametrizations after model weights are loaded. It also adds an optional parameter for loading model files and voices from a local repo-shaped directory instead of downloading them from Hugging Face.

What changed:

  • strip weight_norm by default in inference mode after loading weights
  • keep an escape hatch with for_training=True
  • add local_repo_dir to load config.json, model weights, and voices/ from a local directory
  • keep Hugging Face download as the default behavior when local_repo_dir is not provided

Why:

  • weight_norm is useful for training, but at inference it keeps redundant weight_g/weight_v parameters and recomputes the weight tensor on every forward pass
  • some deployments already have model assets on disk and do not want to depend on download-time resolution

Result on my setup:

  • GPU memory reduced by about 28% (1016 MB -> 724 MB in my test)
