
Add TurboQuant Swift support primitives #412

Closed
RNT56 wants to merge 24 commits into ml-explore:main from RNT56:pr/turboquant-swift-support

Conversation

@RNT56 commented May 18, 2026

Summary

Adds the lower-level Swift TurboQuant support used by the LM-side KV-cache work:

  • packed TurboQuant tensor representation and reference codec
  • PolarQuant/QJL Metal codec and compressed-attention support paths
  • SwiftPM-built default.metallib resource for package test/runtime contexts
  • runtime capability probing and fallback behavior
  • focused TurboQuant tests, including runtime probe and packaged-metallib checks
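As a rough illustration of what a "packed tensor representation" with a reference codec can look like, here is a minimal Swift sketch that packs low-bit integer codes into 32-bit words and unpacks them again. The `PackedCodes` type, its fields, and the low-bits-first layout are assumptions made for this example; they are not taken from the PR, and the actual TurboQuant layout (including any PolarQuant/QJL details) may differ.

```swift
import Foundation

/// Hypothetical container for low-bit codes packed into 32-bit words.
/// Illustrative only; not the layout used by the PR.
struct PackedCodes {
    let bits: Int          // bits per code; restricted to widths that divide 32
    let count: Int         // number of logical codes stored
    let words: [UInt32]    // packed storage, first code in the lowest bits

    init(codes: [UInt8], bits: Int) {
        precondition([1, 2, 4, 8].contains(bits), "bits must be 1, 2, 4, or 8")
        let perWord = 32 / bits
        let mask = UInt32((1 << bits) - 1)
        var packed = [UInt32](repeating: 0, count: (codes.count + perWord - 1) / perWord)
        for (i, code) in codes.enumerated() {
            let shift = UInt32((i % perWord) * bits)
            // Values wider than `bits` are truncated by the mask.
            packed[i / perWord] |= (UInt32(code) & mask) << shift
        }
        self.bits = bits
        self.count = codes.count
        self.words = packed
    }

    /// Reference decoder: recovers the logical codes from the packed words.
    func unpack() -> [UInt8] {
        let perWord = 32 / bits
        let mask = UInt32((1 << bits) - 1)
        return (0..<count).map { (i: Int) -> UInt8 in
            let shift = UInt32((i % perWord) * bits)
            return UInt8((words[i / perWord] >> shift) & mask)
        }
    }
}

// Usage: nine 4-bit codes fit in two 32-bit words and round-trip unchanged.
let sampleCodes: [UInt8] = [3, 0, 15, 7, 1, 2, 9, 8, 5]
let samplePacked = PackedCodes(codes: sampleCodes, bits: 4)
print(samplePacked.words.count, samplePacked.unpack() == sampleCodes)
```

A plain CPU codec like this is the kind of thing a reference path can be tested against before comparing with a GPU kernel's output.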

This is opened as a draft because it is the support branch for the LM-side split and may need to move depending on the preferred boundary with MLX/custom kernels.

Validation

  • git submodule update --init --recursive
  • swift test --filter TurboQuant
    • passed: 23 selected tests, 0 failures
    • verifies the SwiftPM Metal library resource is built/copied and runtime TurboQuant Metal paths execute
  • swift test
    • passed: 543 tests, 0 failures

@davidkoski (Collaborator) commented May 18, 2026

Ah, wait a sec: these changes can't go here; they would need to be in mlx itself (Cmlx is mostly a submodule, or files built from the submodule when I pick up a new tag).

@davidkoski (Collaborator) commented

But: Here is a comment on adding it to mlx:

ml-explore/mlx#3328 (comment)

@davidkoski (Collaborator) commented

See also #405

@RNT56 (Author) commented May 18, 2026

Thanks, that makes sense. I’ll stop treating #412 as mergeable in this form.

I’ll split this up:

  • withdraw/replace the Cmlx/MLX-kernel parts here, since those should not live in mlx-swift
  • keep any mlx-swift follow-up limited to packaging/resource plumbing only, if that is useful without changing generated/submodule-owned Cmlx code
  • keep the KV-cache integration work in mlx-swift-lm decoupled from this PR
  • treat the TurboQuant kernel path as either a custom extension/custom kernel path, or something that waits for the generic quantized SDPA direction in mlx

I’ll update/close this PR accordingly.

@RNT56 closed this May 18, 2026

@RNT56 (Author) commented May 19, 2026

Follow-up: I split the lower-level SwiftPM metallib loader issue into ml-explore/mlx#3562, since that part belongs in MLX rather than this repo.
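For readers unfamiliar with the metallib loading problem, the following is a minimal sketch of a runtime probe with fallback: look for a packaged `default.metallib` in a bundle, try to load it on the default Metal device, and otherwise report that a reference path should be used. The `KernelBackend` enum and `probeKernelBackend(in:)` function are hypothetical names introduced for this example; they are not the API from this PR or from ml-explore/mlx#3562.

```swift
import Foundation
#if canImport(Metal)
import Metal
#endif

/// Hypothetical result of probing for a usable Metal kernel path.
enum KernelBackend: Equatable {
    case metal(libraryURL: URL)
    case referenceCPU(reason: String)
}

/// Looks for `default.metallib` in `bundle` and checks that the default
/// Metal device can load it; otherwise returns a fallback with a reason.
func probeKernelBackend(in bundle: Bundle) -> KernelBackend {
    guard let url = bundle.url(forResource: "default", withExtension: "metallib") else {
        return .referenceCPU(reason: "default.metallib not found in bundle")
    }
    #if canImport(Metal)
    guard let device = MTLCreateSystemDefaultDevice() else {
        return .referenceCPU(reason: "no Metal device available")
    }
    do {
        // Loading the library verifies the file is a valid metallib for this device.
        _ = try device.makeLibrary(URL: url)
        return .metal(libraryURL: url)
    } catch {
        return .referenceCPU(reason: "failed to load metallib: \(error)")
    }
    #else
    return .referenceCPU(reason: "Metal unavailable; ignoring \(url.lastPathComponent)")
    #endif
}

// Usage: a SwiftPM target that declares resources would typically pass
// `Bundle.module`; `Bundle.main` is used here so the sketch stands alone.
let backend = probeKernelBackend(in: Bundle.main)
print(backend)
```

Returning a reason alongside the fallback case makes it easier for tests to distinguish "resource not packaged" from "device could not load it".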


2 participants