Fix server XTC: accept int params and flatten special_tokens list by realyxl · Pull Request #1301 · ml-explore/mlx-lm

realyxl · 2026-05-22T15:23:54Z

mlx_lm.server rejects integer XTC parameters and crashes whenever xtc_probability > 0. Two unrelated server-side bugs, both in server.py.

1. Strict `float` rejects integer `0` / `1`

`validate_model_parameters()` checks `xtc_probability` / `xtc_threshold` with `float` alone, while every other sampling parameter in the same block (`temperature`, `top_p`, `min_p`, `repetition_penalty`, `presence_penalty`, `frequency_penalty`) already accepts `(float, int)`. JSON clients (e.g. SillyTavern) that emit `0` / `1` as integers get a 4xx.

The downstream apply_xtc / make_sampler have no isinstance checks; the values are only used in comparisons and MLX tensor ops, which treat int and float identically.

-        self._validate("xtc_probability", float, min_val=0, max_val=1)
-        self._validate("xtc_threshold", float, min_val=0, max_val=1)
+        self._validate("xtc_probability", (float, int), min_val=0, max_val=1)
+        self._validate("xtc_threshold", (float, int), min_val=0, max_val=1)
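A minimal sketch of why the float-only check rejects JSON integers. `validate_param` and `is_accepted` are hypothetical stand-ins written for this example, not the `_validate` helper in `server.py`; the only behavior assumed is an `isinstance` check followed by a range check.

```python
import json


def validate_param(name, value, types, min_val=None, max_val=None):
    """Hypothetical stand-in for the server-side check; not the mlx_lm code."""
    if not isinstance(value, types):
        raise ValueError(f"{name} has an unsupported type: {type(value).__name__}")
    if min_val is not None and value < min_val:
        raise ValueError(f"{name} must be >= {min_val}")
    if max_val is not None and value > max_val:
        raise ValueError(f"{name} must be <= {max_val}")
    return value


def is_accepted(value, types):
    """Return True if the value passes the check, False if it is rejected."""
    try:
        validate_param("xtc_probability", value, types, min_val=0, max_val=1)
        return True
    except ValueError:
        return False


# JSON integers decode to Python int, not float.
body = json.loads('{"xtc_probability": 1, "xtc_threshold": 0.1}')

strict = is_accepted(body["xtc_probability"], float)          # float only
relaxed = is_accepted(body["xtc_probability"], (float, int))  # float or int
```

In this sketch `strict` is `False` and `relaxed` is `True`. The range check still applies either way, so an integer such as `2` is rejected under both variants.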

2. Nested `xtc_special_tokens` crashes `apply_xtc`

_make_sampler builds xtc_special_tokens as [int, list[int]] because tokenizer.encode("\n") returns a list. With xtc_probability > 0, apply_xtc does mask[..., xtc_special_tokens] = False and raises ValueError: Initialization encountered extra dimension (#1257).

generate.py:2070 and chat.py:155-157 already use the correct flat construction; this aligns server.py with them.

-        xtc_special_tokens=[
-            tokenizer.eos_token_id,
-            tokenizer.encode("\n"),
-        ],
+        xtc_special_tokens=tokenizer.encode("\n") + list(tokenizer.eos_token_ids),

This also picks up additional EOS tokens on multi-EOS tokenizers (Gemma, Llama 3), which singular eos_token_id missed.
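The shape difference between the two constructions can be sketched in plain Python. `StubTokenizer`, its token IDs, and `is_flat_int_list` are made up for illustration; only the attribute names `eos_token_id`, `eos_token_ids`, and `encode` come from the diff above, and `eos_token_ids` is assumed to be a collection of ints.

```python
class StubTokenizer:
    """Hypothetical tokenizer with made-up IDs; mimics only the attributes in the diff."""

    eos_token_id = 2
    eos_token_ids = {2, 7}  # assumption: a collection of ints, one per EOS token

    def encode(self, text):
        # Like a real tokenizer, encode returns a list even for a single token.
        return [13] if text == "\n" else []


def is_flat_int_list(tokens):
    """True only if every element is an int, i.e. the list has no nested lists."""
    return all(isinstance(t, int) for t in tokens)


tokenizer = StubTokenizer()

# Old construction: the second element is itself a list, so the result is ragged.
nested = [tokenizer.eos_token_id, tokenizer.encode("\n")]

# New construction: list concatenation yields one flat list of ints.
flat = tokenizer.encode("\n") + list(tokenizer.eos_token_ids)
```

Here `nested` is `[2, [13]]`, which cannot be turned into a rectangular index array, while `flat` contains `13`, `2`, and `7` at the same level. The second EOS ID, `7`, appears only in `flat`, which mirrors the multi-EOS point above.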

realyxl wants to merge 1 commit into ml-explore:main from realyxl:fix/server-xtc-int-and-special-tokens
