Fix GPT-QModel compat and deprecate AutoGPTQ#2426
Qubitium wants to merge 9 commits into huggingface:main
Conversation
Thanks for working on this quickly. As I'm not very knowledgeable about optimum, let's tag someone else for review: @IlyasMoutawwakil
@IlyasMoutawwakil This is a continuation of @jiqing-feng's work in #2420. We now pin the gpt-qmodel version so we can hard-deprecate a lot of old code and assumptions, as the existing code was first written for
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
```python
with patch("optimum.gptq.quantizer.hf_convert_gptq_v1_to_v2_format", return_value=(model, False)) as convert_mock:
    with patch("optimum.gptq.quantizer.gptq_post_init", return_value=model) as post_init_mock:
```
What are these patches for?
@IlyasMoutawwakil GPTQPostInitTest has been fixed. It was not actually testing what we need it to verify: that a loaded quantized layer correctly executes post_init, as gptqmodel requires. It was also accessing private gptqmodel state such as `optimized`, which Optimum should not touch since it is private/internal.
Updated test code: 9958aea
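To illustrate the testing approach discussed above, here is a minimal, hypothetical sketch: instead of patching and asserting on a library's private internals, verify through the public entry point that a post-init hook runs exactly once on a loaded model. The names `FakeQuantizedModel`, `post_init`, and `load_quantized` are illustrative stand-ins, not Optimum's or gptqmodel's actual API.

```python
# Sketch: assert on observable behavior (post_init ran once),
# not on private library state.
class FakeQuantizedModel:
    def __init__(self):
        self.post_init_called = 0

    def post_init(self):
        # In the real library this would finalize quantized layers
        # for inference (buffer allocation, kernel setup, etc.).
        self.post_init_called += 1
        return self


def load_quantized(model):
    # Stand-in for the loader under test: it must invoke post_init once.
    return model.post_init()


model = FakeQuantizedModel()
loaded = load_quantized(model)
assert loaded.post_init_called == 1
```

The point of the pattern is that the test keeps passing even if the library's internal attributes change, since it only relies on the public contract.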
```diff
 uv pip install .[tests]
 uv pip install pypcre "setuptools>=78.1.1,<82"
-uv pip install "gptqmodel>=5.6.12" --no-build-isolation
+uv pip install "gptqmodel>=7.0.0"
```
@IlyasMoutawwakil GPT-QModel is now pinned to the just-released 7.0.0, which moved all kernel compilation from the setup stage to first-use JIT, so the `--no-build-isolation` flag is no longer required.
What does this PR do?

1. Fix ModelCloud/GPTQModel#2818
2. Refactor away kernel-specific tests. GPT-QModel auto-selects the best kernel for the situation (by default), so Optimum should not test for specific kernels, as the auto-selection logic evolves over time.
3. Remove the now-dangling (unused) `max_input_length` arg.
4. Slight rename of GPTQModel to GPT-QModel (gpt-qmodel is no longer a GPTQ-specific package).
5. Fully deprecate AutoGPTQ (it has been in archive mode since March 2025, and I was the last committer/maintainer). It's time.

Before submitting
Who can review?
@BenjaminBossan