Add TurboQuant Swift support primitives by RNT56 · Pull Request #412 · ml-explore/mlx-swift

RNT56 · 2026-05-18T22:10:15Z

Summary

Adds the lower-level Swift TurboQuant support used by the LM-side KV-cache work:

- packed TurboQuant tensor representation and reference codec
- PolarQuant/QJL Metal codec and compressed-attention support paths
- SwiftPM-built default.metallib resource for package test/runtime contexts
- runtime capability probing and fallback behavior
- focused TurboQuant tests, including runtime probe and packaged-metallib checks
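
As a rough illustration of what a "packed tensor representation" can mean, here is a minimal sketch that packs low-bit quantization codes into 32-bit words and unpacks them again. The type name `PackedCodes`, its fields, and the bit layout are assumptions made for this example only; the thread does not show the PR's actual layout, and this is not the TurboQuant codec.

```swift
// Illustrative only: generic low-bit packing, NOT the PR's TurboQuant layout.
// `PackedCodes` and its layout are hypothetical names chosen for this sketch.
struct PackedCodes {
    let bits: Int          // bits per code; this sketch supports 1, 2, 4 or 8
    let count: Int         // number of logical codes stored
    var words: [UInt32]    // packed storage, lowest bits of a word filled first

    init(codes: [UInt8], bits: Int) {
        precondition([1, 2, 4, 8].contains(bits), "sketch supports 1/2/4/8 bits")
        let perWord = 32 / bits
        let mask = UInt32((1 << bits) - 1)
        self.bits = bits
        self.count = codes.count
        self.words = [UInt32](repeating: 0,
                              count: (codes.count + perWord - 1) / perWord)
        for (i, code) in codes.enumerated() {
            // Place code i in its slot inside word i / perWord.
            let shift = UInt32((i % perWord) * bits)
            words[i / perWord] |= (UInt32(code) & mask) << shift
        }
    }

    /// Recover the original codes (values are truncated to `bits` bits).
    func unpacked() -> [UInt8] {
        let perWord = 32 / bits
        let mask = UInt32((1 << bits) - 1)
        return (0..<count).map { i in
            let shift = UInt32((i % perWord) * bits)
            return UInt8((words[i / perWord] >> shift) & mask)
        }
    }
}
```

With 4-bit codes, eight values share one `UInt32`, so nine codes occupy two words. A real codec would presumably also carry per-block scales or norms alongside the packed words; that part is left out here because the thread does not describe it.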

This is opened as a draft because it is the support branch for the LM-side split and may need to move depending on the preferred boundary with MLX/custom kernels.

Validation

git submodule update --init --recursive
swift test --filter TurboQuant
- passed: 23 selected tests, 0 failures
- verifies the SwiftPM Metal library resource is built/copied and runtime TurboQuant Metal paths execute
swift test
- passed: 543 tests, 0 failures
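
For readers unfamiliar with the "runtime capability probing and fallback behavior" mentioned in the summary, the general shape can be sketched as follows. This is a simplified assumption, not the PR's probe: `KernelBackend` and `selectBackend(metallibURL:)` are hypothetical names, and the only check made here is whether a packaged metallib file exists on disk.

```swift
import Foundation

// Hypothetical names; the PR's real probe API is not shown in this thread.
enum KernelBackend: Equatable {
    case metal(libraryURL: URL)   // a packaged metallib was found on disk
    case reference                // fall back to the reference (non-Metal) path
}

/// Choose a backend from an optional metallib location.
/// The caller is assumed to pass something like
/// `Bundle.module.url(forResource: "default", withExtension: "metallib")`.
func selectBackend(metallibURL: URL?) -> KernelBackend {
    guard let url = metallibURL,
          FileManager.default.fileExists(atPath: url.path) else {
        // Resource missing or not packaged: use the fallback path.
        return .reference
    }
    return .metal(libraryURL: url)
}
```

A fuller probe would likely go further than a file check, for example by loading the file with `MTLDevice.makeLibrary(URL:)` and dispatching a trivial kernel before trusting the Metal path. That step is omitted here so the sketch stays independent of Metal availability.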

davidkoski · 2026-05-18T23:22:53Z

Ah wait a sec -- these changes can't go here, they would need to be in mlx itself (Cmlx is mostly a submodule or files built from the submodule when I pick up a new tag).

davidkoski · 2026-05-18T23:23:37Z

But: Here is a comment on adding it to mlx:

ml-explore/mlx#3328 (comment)

davidkoski · 2026-05-18T23:25:34Z

See also #405

RNT56 · 2026-05-18T23:48:09Z

Thanks, that makes sense. I’ll stop treating #412 as mergeable in this form.

I’ll split this up:

- withdraw/replace the Cmlx/MLX-kernel parts here, since those should not live in mlx-swift
- keep any mlx-swift follow-up limited to packaging/resource plumbing only, if that is useful without changing generated/submodule-owned Cmlx code
- keep the KV-cache integration work in mlx-swift-lm decoupled from this PR
- treat the TurboQuant kernel path as either a custom extension/custom kernel path, or something that waits for the generic quantized SDPA direction in mlx

I’ll update/close this PR accordingly.

RNT56 · 2026-05-19T00:02:40Z

Follow-up: I split the lower-level SwiftPM metallib loader issue into ml-explore/mlx#3562, since that part belongs in MLX rather than this repo.

Antigravity added 15 commits May 19, 2026 00:07

Add TurboQuant packed tensor API

33c7f25

Add TurboQuant reference backend contract

9414147

Add TurboQuant Metal codec kernels

f76d1b0

Improve TurboQuant residual quality gates

93c8793

Add TurboQuant compressed attention kernels

fdaa297

Constrain TurboQuant online fused attention

39ff33d

Add v3 TurboQuant tiled rotating attention

d97889f

Update TurboQuant tiled attention tests

b4e6e70

Harden TurboQuant availability and shape contracts

618ae5b

Harden TurboQuant Metal template seeds

2265089

Add TurboQuant runtime capability probe

0fa2c23

Refine TurboQuant sustained profile selection

dc3772a

Harden TurboQuant Metal runtime validation

e8c5089

Fix Metal fallback and linalg norm completeness

ff54c5c

Keep TurboQuant support branch scoped

1aca540

RNT56 mentioned this pull request May 18, 2026

Interest in TurboQuant / rotating quantized KV cache support? ml-explore/mlx-swift-lm#294

Open

Antigravity added 8 commits May 19, 2026 00:38

Build SwiftPM Metal library resource

2f0928f

Guard Metal 4 family probe on simulator

65018c9

Support UInt32 custom kernel template args

0eeac08

Default TurboQuant Metal kernels to GPU stream

53e5d79

Complete TurboQuant Metal kernels

2d30bc2

Complete TurboQuant value codec support

b5643c9

Implement TurboQuantProd Metal key path

d80aa42

Support split-dimension TurboQuant attention

2596d3e

Harden TurboQuant runtime resource tests

2bf4254

RNT56 closed this May 18, 2026

RNT56 wants to merge 24 commits into ml-explore:main from RNT56:pr/turboquant-swift-support