Pull requests: ml-explore/mlx-lm
BatchGenerator: opt-in prefer_prefill_when_pending scheduler
#1288 opened May 19, 2026 by benjamin-levin
Fix nemotron_h MoEGate breaking load with per-path quantization
#1282 opened May 18, 2026 by YBJ0000
fix: make generation_stream per-thread to fix server crash on worker threads
#1275 opened May 14, 2026 by nish2292
4 tasks done
feat: add --idle-timeout to unload model after inactivity
#1274 opened May 14, 2026 by nish2292
8 tasks done
fix(server): wire --prompt-cache-bytes CLI flag to LRUPromptCache
#1267 opened May 11, 2026 by andreinknv
feat(server): add /v1/embeddings route via mlx_embeddings
#1265 opened May 9, 2026 by andreinknv
Fix ArraysCache missing is_trimmable/trim for hybrid model prompt cache
#1254 opened May 6, 2026 by EagerofLight
Fix BatchRotatingKVCache rotated flag deserializing to True
#1251 opened May 6, 2026 by odysa
Fix mlx_lm.server --adapter-path silently ignored at startup
#1249 opened May 6, 2026 by odysa