Pull requests: ml-explore/mlx-lm

Pull requests list

Fix KeyError: 'name' in qwen3_coder tool parser
#1289 opened May 19, 2026 by DShickle
Fix tokenizer test failure
#1287 opened May 19, 2026 by zcbenz (Collaborator)
[mlx_lm] Expose 'strict' parameter in load() function
#1284 opened May 18, 2026 by zyguy
Add per-request prompt cache files to server
#1283 opened May 18, 2026 by Quiet-Node-io
Add timings to server responses
#1279 opened May 16, 2026 by spicyneuron (Contributor)
Restrict think-state scan to assistant prefill tail
#1277 opened May 15, 2026 by eilidhmae
Add Gemma 4 assistant (MTP drafter) model class
#1276 opened May 14, 2026 by broomva
feat: add --idle-timeout to unload model after inactivity
#1274 opened May 14, 2026 by nish2292
8 tasks done
Add logits processor arguments to mlx_lm.generate
#1273 opened May 13, 2026 by realyxl
Support max_kv_size configuration in HTTP server
#1272 opened May 13, 2026 by r-bahuguna
Add Olmo3 tool parser
#1271 opened May 11, 2026 by anthonyhchan
2 tasks done
docs: add Qwen3 QLoRA Apple Silicon example
#1270 opened May 11, 2026 by SysCd
Add Granite tool parser
#1264 opened May 9, 2026 by jonpspri
Add Responses API support
#1263 opened May 9, 2026 by blairhudson
Support for Zyphra/ZAYA1-base
#1261 opened May 9, 2026 by kyr0
Fix LFM2.5 tool parser inference
#1260 opened May 8, 2026 by blairhudson
Fix server XTC crash from heterogeneous xtc_special_tokens
#1258 opened May 7, 2026 by odysa (Draft)
3 of 5 tasks