
Fix GPT-QModel compat and deprecate AutoGPTQ#2426

Open
Qubitium wants to merge 9 commits into huggingface:main from Qubitium:fix-gptqmodel-compat

Conversation

@Qubitium
Contributor

@Qubitium Qubitium commented Apr 23, 2026

What does this PR do?

1. Fix ModelCloud/GPTQModel#2818
2. Refactor away kernel-specific tests. GPT-QModel auto-selects (by default) the best kernel for the situation, so Optimum should not test for specific kernels as the auto-selection logic evolves over time.
3. Remove the now-dangling (unused) max_input_length arg.
4. Slightly rename GPTQModel to GPT-QModel (gpt-qmodel is no longer a GPTQ-specific package).
5. Fully deprecate AutoGPTQ (it has been in archive mode since March 2025 and I was the last committer/maintainer). It's time.
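The hard deprecation in point 5 could be sketched roughly as follows. This is a hypothetical helper, not the PR's actual code; the function name and the way installed packages are passed in are assumptions made for illustration:

```python
def select_quantization_backend(installed: set[str]) -> str:
    """Pick the GPTQ backend from a set of installed package names.

    Hypothetical sketch of a hard deprecation: gptqmodel is the only
    supported backend, and auto_gptq alone now raises instead of
    falling back.
    """
    if "gptqmodel" in installed:
        return "gptqmodel"
    if "auto_gptq" in installed:
        raise ImportError(
            "AutoGPTQ is deprecated (archived since March 2025); "
            "install gptqmodel instead."
        )
    raise ImportError("gptqmodel is required for GPTQ quantization.")
```

For example, an environment with both packages would resolve to `gptqmodel`, while one with only `auto_gptq` would fail fast with the deprecation message.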

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Who can review?

@BenjaminBossan

@Qubitium Qubitium marked this pull request as ready for review April 23, 2026 15:00
@BenjaminBossan
Member

Thanks for working on this quickly. As I'm not very knowledgeable about optimum, let's tag someone else for review @IlyasMoutawwakil

@Qubitium
Contributor Author

Qubitium commented Apr 23, 2026

@IlyasMoutawwakil This is a continuation of @jiqing-feng's work in #2420.

We now pin the gpt-qmodel version so we can hard-deprecate lots of old code and assumptions, as the existing code was first written for AutoGPTQ and no longer applies to gpt-qmodel.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Qubitium Qubitium changed the title Fix gptqmodel compat Fix gptqmodel compat and deprecate AutoGPTQ Apr 23, 2026
Comment thread tests/gptq/test_quantization.py Outdated
Comment on lines +230 to +231
with patch("optimum.gptq.quantizer.hf_convert_gptq_v1_to_v2_format", return_value=(model, False)) as convert_mock:
with patch("optimum.gptq.quantizer.gptq_post_init", return_value=model) as post_init_mock:
Member


What are these patches for?

Contributor Author


@IlyasMoutawwakil GPTQPostInitTest has been fixed. It was not actually testing what we need it to verify: that a loaded quantized layer correctly executes post_init, as gptqmodel requires. It was also accessing private gptqmodel state, such as optimized, which Optimum should not touch since it is private/internal.

update test code: 9958aea
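The shape of that fix can be illustrated with a toy example: assert that the post-init hook runs, via a mock, without inspecting any backend internals. The loader below is a stand-in invented for illustration, not optimum's real loading path:

```python
from unittest.mock import MagicMock


class DummyModel:
    """Stand-in for a quantized model (illustration only)."""


def load_quantized(model, post_init):
    """Toy loader: the real path in optimum runs the backend's
    post_init hook after quantized weights are materialized."""
    return post_init(model)


# Verify the hook was invoked exactly once with the model, without
# reaching into private backend state.
model = DummyModel()
post_init_mock = MagicMock(side_effect=lambda m: m)
loaded = load_quantized(model, post_init_mock)
post_init_mock.assert_called_once_with(model)
```

Checking only the observable contract (the hook was called) keeps the test valid even as the backend's internal state changes between releases.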

@Qubitium Qubitium changed the title Fix gptqmodel compat and deprecate AutoGPTQ Fix GPT-QModel compat and deprecate AutoGPTQ Apr 28, 2026
@Qubitium Qubitium force-pushed the fix-gptqmodel-compat branch from bb98e89 to 9adc203 Compare April 28, 2026 21:57
@Qubitium Qubitium force-pushed the fix-gptqmodel-compat branch from 9adc203 to e5d3fdc Compare April 28, 2026 21:59
uv pip install .[tests]
uv pip install pypcre "setuptools>=78.1.1,<82"
uv pip install "gptqmodel>=5.6.12" --no-build-isolation
uv pip install "gptqmodel>=7.0.0"
Contributor Author


@IlyasMoutawwakil GPT-QModel is now pinned to the just-released 7.0.0, which moved all kernel compilation from the setup stage to first-use JIT, so the --no-build-isolation flag is no longer required.
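A version pin like this is typically enforced at runtime as well as at install time. A minimal sketch, assuming a simple dotted-version comparison (the real check in optimum may use a proper version-parsing library and different names):

```python
def parse_version(v: str) -> tuple[int, ...]:
    """Parse a simple dotted version like '7.0.0' into a comparable tuple."""
    return tuple(int(part) for part in v.split("."))


# Minimum gptqmodel version the new code path assumes (per this PR).
MIN_GPTQMODEL = parse_version("7.0.0")


def is_supported(installed: str) -> bool:
    """True when the installed gptqmodel meets the minimum pin."""
    return parse_version(installed) >= MIN_GPTQMODEL
```

Under this check, 5.6.12 (the previous floor in the CI snippet above) is rejected while 7.0.0 and later pass.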



Development

Successfully merging this pull request may close these issues.

[BUG] type object 'BACKEND' has no attribute 'EXLLAMA_V1'
