model : support NVFP4 tensors for Gemma4 by CISC · Pull Request #21971 · ggml-org/llama.cpp

CISC · 2026-04-15T22:42:11Z

Overview

Add support for NVFP4 Gemma4.

Additional information

Also adds wo_s to build_attn to be able to pass it on to build_lora_mm.
GGUF: CISCai/gemma-4-31B-it-NVFP4-turbo-GGUF
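
As a rough illustration of what passing `wo_s` through amounts to, here is a minimal self-contained sketch. It assumes the scale is a single scalar multiplied into the matmul result, which is what the GLM4 branch in this PR's diff does by hand with `ggml_mul`. The types and the names `matvec` and `matvec_scaled` are hypothetical toy stand-ins, not ggml or llama.cpp APIs.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-ins for tensors (NOT ggml types).
using Vec = std::vector<float>;
using Mat = std::vector<Vec>; // row-major: w[i] is row i

// Plain matrix-vector product: y = W * x.
Vec matvec(const Mat & w, const Vec & x) {
    Vec y(w.size(), 0.0f);
    for (std::size_t i = 0; i < w.size(); ++i) {
        for (std::size_t j = 0; j < x.size(); ++j) {
            y[i] += w[i][j] * x[j];
        }
    }
    return y;
}

// Hypothetical helper shaped like build_lora_mm(wo, cur, wo_s):
// the optional scale is applied to the result of the matmul.
Vec matvec_scaled(const Mat & w, const Vec & x, const float * w_s = nullptr) {
    Vec y = matvec(w, x);
    if (w_s) {
        for (float & v : y) {
            v *= *w_s; // assumption: a single scalar scale per weight tensor
        }
    }
    return y;
}
```

Calling `matvec_scaled(w, x, &s)` gives the same values as calling `matvec(w, x)` and then multiplying every element by `s`, which is the equivalence the two branches of the diff rely on.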

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: nu

ggerganov

For consistency, these should be removed and replaced with wq_b, wk_b, wv_b and wqkv_b:

llama.cpp/src/llama-model.h

Lines 256 to 263 in b1be68e

    // attention bias
    struct ggml_tensor * bq   = nullptr;
    struct ggml_tensor * bk   = nullptr;
    struct ggml_tensor * bv   = nullptr;
    struct ggml_tensor * bo   = nullptr;
    struct ggml_tensor * bqkv = nullptr;

In a follow-up PR

ggerganov · 2026-04-16T07:25:06Z

        if (arch == LLM_ARCH_GLM4 || arch == LLM_ARCH_GLM4_MOE) {
            // GLM4 and GLM4_MOE seem to have numerical issues with half-precision accumulators
+            cur = build_lora_mm(wo, cur);
            ggml_mul_mat_set_prec(cur, GGML_PREC_F32);
+            if (wo_s) {
+                cur = ggml_mul(ctx0, cur, wo_s);
+            }
+        } else {
+            cur = build_lora_mm(wo, cur, wo_s);


Maybe a follow-up PR to fix the order of the build_lora_mm arguments (e.g. cur, wo, wo_s) and add an optional precision argument to avoid this branching.

Yes, will be more manageable after merging #21245
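
One possible shape for the helper suggested above, sketched with toy types. Everything here is hypothetical: `build_mm`, `Prec`, `MmResult` and `attn_out` are illustrative names, not existing llama.cpp functions, and the eventual refactor may look different. The point is only that an input-first argument order plus an optional precision parameter lets the call site pick the precision without a separate branch. The current branch presumably exists because the precision has to be set on the matmul node itself, before the scale is multiplied in.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-ins for tensors (NOT ggml types).
using Vec = std::vector<float>;
using Mat = std::vector<Vec>; // row-major: w[i] is row i

// Hypothetical precision flag, loosely modelled on GGML_PREC_DEFAULT / GGML_PREC_F32.
enum class Prec { Default, F32 };

struct MmResult {
    Vec  out;  // scaled matmul output
    Prec prec; // precision requested for the matmul step
};

// Hypothetical signature: input first, then weight, optional scale, optional precision.
// The toy only records the requested precision; it does not emulate
// half-precision accumulation.
MmResult build_mm(const Vec & cur, const Mat & w,
                  const float * w_s = nullptr, Prec prec = Prec::Default) {
    MmResult r{Vec(w.size(), 0.0f), prec};
    for (std::size_t i = 0; i < w.size(); ++i) {
        for (std::size_t j = 0; j < cur.size(); ++j) {
            r.out[i] += w[i][j] * cur[j];
        }
    }
    if (w_s) {
        for (float & v : r.out) {
            v *= *w_s; // scale applied after the matmul, as in the diff
        }
    }
    return r;
}

// Call site: the architecture check selects an argument instead of a code path.
MmResult attn_out(bool needs_f32, const Vec & cur, const Mat & wo, const float * wo_s) {
    return build_mm(cur, wo, wo_s, needs_f32 ? Prec::F32 : Prec::Default);
}
```

With this shape, the GLM4 case would reduce to passing `needs_f32 = true`, and every other architecture would use the default.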

CISC · 2026-04-16T10:55:21Z

@ngxson gentle ping, merge this, then build_qkv, then yours?

ngxson

yes, sounds good to me

* support nvfp4 tensors for Gemma4
* add wo_s to build_attn
* add wo_s to build_attn
* fix glm4

CISC added 3 commits April 15, 2026 22:28

support nvfp4 tensors for Gemma4

0a40e49

add wo_s to build_attn

75f3e23

add wo_s to build_attn

69a2478

CISC requested a review from ggerganov April 15, 2026 22:46

CISC mentioned this pull request Apr 15, 2026

model: using single llm_build per arch #21970

Merged

fix glm4

8ece3eb

CISC requested a review from ngxson April 15, 2026 23:29

github-actions bot added the model Model specific label Apr 15, 2026

ggerganov approved these changes Apr 16, 2026


ngxson approved these changes Apr 16, 2026


CISC merged commit f772f6e into master Apr 16, 2026
50 checks passed

CISC deleted the cisc/gemma4-nvfp4 branch April 16, 2026 14:51

ngxson mentioned this pull request Apr 16, 2026

model: move load_hparams and load_tensors to per-model definition #22004

Open

6 tasks

cnsiva pushed a commit to saas-home/llama.cpp that referenced this pull request Apr 17, 2026

model : support NVFP4 tensors for Gemma4 (ggml-org#21971)

598dfde

* support nvfp4 tensors for Gemma4
* add wo_s to build_attn
* add wo_s to build_attn
* fix glm4

CISC mentioned this pull request Apr 18, 2026

model : refactor bias tensor names #22079

Open
