
Fix GPT-QModel compat and deprecate AutoGPTQ#2426

Open
Qubitium wants to merge 9 commits into huggingface:main from Qubitium:fix-gptqmodel-compat

Conversation

@Qubitium
Contributor

@Qubitium Qubitium commented Apr 23, 2026

What does this PR do?

1. Fix ModelCloud/GPTQModel#2818
2. Refactor away kernel-specific tests. GPT-QModel auto-selects (by default) the best kernel for the situation, so Optimum should not test for specific kernels as the auto-selection logic evolves over time.
3. Remove the now-dangling (unused) max_input_length arg.
4. Slightly rename GPTQModel to GPT-QModel (gpt-qmodel is no longer a GPTQ-specific package).
5. Fully deprecate AutoGPTQ (it has been in archive mode since March 2025 and I was the last committer/maintainer). It's time.
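The hard deprecation in point 5 could be sketched roughly as follows. This is a hypothetical helper, not the PR's actual code; the function name and the way installed packages are passed in are assumptions made for illustration:

```python
def select_quantization_backend(installed: set[str]) -> str:
    """Pick the GPTQ backend from a set of installed package names.

    Hypothetical sketch of a hard deprecation: gptqmodel is the only
    supported backend, and auto_gptq alone now raises instead of
    falling back.
    """
    if "gptqmodel" in installed:
        return "gptqmodel"
    if "auto_gptq" in installed:
        raise ImportError(
            "AutoGPTQ is deprecated (archived since March 2025); "
            "install gptqmodel instead."
        )
    raise ImportError("gptqmodel is required for GPTQ quantization.")
```

For example, an environment with both packages would resolve to `gptqmodel`, while one with only `auto_gptq` would fail fast with the deprecation message.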

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Who can review?

@BenjaminBossan

@Qubitium Qubitium marked this pull request as ready for review April 23, 2026 15:00
@BenjaminBossan
Member

Thanks for working on this quickly. As I'm not very knowledgeable about optimum, let's tag someone else for review @IlyasMoutawwakil

@Qubitium
Contributor Author

Qubitium commented Apr 23, 2026

@IlyasMoutawwakil This is a continuation of @jiqing-feng's work in #2420.

We now pin the gpt-qmodel version so we can hard-deprecate lots of old code and assumptions, as the existing code was first written for AutoGPTQ and no longer applies to gpt-qmodel.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Qubitium Qubitium changed the title Fix gptqmodel compat Fix gptqmodel compat and deprecate AutoGPTQ Apr 23, 2026
Comment thread tests/gptq/test_quantization.py Outdated
Comment on lines +230 to +231
with patch("optimum.gptq.quantizer.hf_convert_gptq_v1_to_v2_format", return_value=(model, False)) as convert_mock:
with patch("optimum.gptq.quantizer.gptq_post_init", return_value=model) as post_init_mock:
Member


What are these patches for?

Contributor Author


@IlyasMoutawwakil GPTQPostInitTest has been fixed. It was not actually testing what we need it to verify: that a loaded quantized layer correctly executes post_init, as gptqmodel requires. It was also accessing private gptqmodel state, such as optimized, which Optimum should not touch since it is private/internal.

update test code: 9958aea
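The shape of that fix can be illustrated with a toy example: assert that the post-init hook runs, via a mock, without inspecting any backend internals. The loader below is a stand-in invented for illustration, not optimum's real loading path:

```python
from unittest.mock import MagicMock


class DummyModel:
    """Stand-in for a quantized model (illustration only)."""


def load_quantized(model, post_init):
    """Toy loader: the real path in optimum runs the backend's
    post_init hook after quantized weights are materialized."""
    return post_init(model)


# Verify the hook was invoked exactly once with the model, without
# reaching into private backend state.
model = DummyModel()
post_init_mock = MagicMock(side_effect=lambda m: m)
loaded = load_quantized(model, post_init_mock)
post_init_mock.assert_called_once_with(model)
```

Checking only the observable contract (the hook was called) keeps the test valid even as the backend's internal state changes between releases.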

@Qubitium Qubitium changed the title Fix gptqmodel compat and deprecate AutoGPTQ Fix GPT-QModel compat and deprecate AutoGPTQ Apr 28, 2026
@Qubitium Qubitium force-pushed the fix-gptqmodel-compat branch from bb98e89 to 9adc203 Compare April 28, 2026 21:57
@Qubitium Qubitium force-pushed the fix-gptqmodel-compat branch from 9adc203 to e5d3fdc Compare April 28, 2026 21:59
uv pip install .[tests]
uv pip install pypcre "setuptools>=78.1.1,<82"
uv pip install "gptqmodel>=5.6.12" --no-build-isolation
uv pip install "gptqmodel>=7.0.0"
Contributor Author


@IlyasMoutawwakil GPT-QModel is now pinned to the just-released 7.0.0, which moved all kernel compilation from the setup stage to first-use JIT, so the --no-build-isolation flag is no longer required.
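A version pin like this is typically enforced at runtime as well as at install time. A minimal sketch, assuming a simple dotted-version comparison (the real check in optimum may use a proper version-parsing library and different names):

```python
def parse_version(v: str) -> tuple[int, ...]:
    """Parse a simple dotted version like '7.0.0' into a comparable tuple."""
    return tuple(int(part) for part in v.split("."))


# Minimum gptqmodel version the new code path assumes (per this PR).
MIN_GPTQMODEL = parse_version("7.0.0")


def is_supported(installed: str) -> bool:
    """True when the installed gptqmodel meets the minimum pin."""
    return parse_version(installed) >= MIN_GPTQMODEL
```

Under this check, 5.6.12 (the previous floor in the CI snippet above) is rejected while 7.0.0 and later pass.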



Development

Successfully merging this pull request may close these issues.

[BUG] type object 'BACKEND' has no attribute 'EXLLAMA_V1'
