ml-explore / mlx-lm Public

Pull requests: ml-explore/mlx-lm

157 Open 646 Closed

Pull requests list

Fix KeyError: 'name' in qwen3_coder tool parser

#1289 opened May 19, 2026 by DShickle

BatchGenerator: opt-in prefer_prefill_when_pending scheduler

#1288 opened May 19, 2026 by benjamin-levin

Fix tokenizer test failure

#1287 opened May 19, 2026 by zcbenz Collaborator

[mlx_lm] Expose 'strict' parameter in load() function

#1284 opened May 18, 2026 by zyguy

Add per-request prompt cache files to server

#1283 opened May 18, 2026 by Quiet-Node-io

Fix nemotron_h MoEGate breaking load with per-path quantization

#1282 opened May 18, 2026 by YBJ0000

Add timings to server responses

#1279 opened May 16, 2026 by spicyneuron Contributor

Restrict think-state scan to assistant prefill tail

#1277 opened May 15, 2026 by eilidhmae

Add Gemma 4 assistant (MTP drafter) model class

#1276 opened May 14, 2026 by broomva

fix: make generation_stream per-thread to fix server crash on worker threads

#1275 opened May 14, 2026 by nish2292

4 tasks done

feat: add --idle-timeout to unload model after inactivity

#1274 opened May 14, 2026 by nish2292

8 tasks done

Add logits processor arguments to mlx_lm.generate

#1273 opened May 13, 2026 by realyxl

Support max_kv_size configuration in HTTP server

#1272 opened May 13, 2026 by r-bahuguna

Add Olmo3 tool parser

#1271 opened May 11, 2026 by anthonyhchan

2 tasks done

docs: add Qwen3 QLoRA Apple Silicon example

#1270 opened May 11, 2026 by SysCd

fix(server): wire --prompt-cache-bytes CLI flag to LRUPromptCache

#1267 opened May 11, 2026 by andreinknv

feat(server): add /v1/embeddings route via mlx_embeddings

#1265 opened May 9, 2026 by andreinknv

Add Granite tool parser

#1264 opened May 9, 2026 by jonpspri

Add Responses API support

#1263 opened May 9, 2026 by blairhudson

Support for Zyphra/ZAYA1-base

#1261 opened May 9, 2026 by kyr0

Fix LFM2.5 tool parser inference

#1260 opened May 8, 2026 by blairhudson

Fix server XTC crash from heterogeneous xtc_special_tokens

#1258 opened May 7, 2026 by odysa • Draft

3 of 5 tasks

Fix ArraysCache missing is_trimmable/trim for hybrid model prompt cache

#1254 opened May 6, 2026 by EagerofLight

Fix BatchRotatingKVCache rotated flag deserializing to True

#1251 opened May 6, 2026 by odysa

Fix mlx_lm.server --adapter-path silently ignored at startup

#1249 opened May 6, 2026 by odysa
