
Keep routing replay target prep out of Dynamo #723

Merged: FurtherAI merged 1 commit into main from austin/routing_replay_compile_disable_main on Jun 8, 2026

@FurtherAI

Collaborator

Summary

  • keep MoE routing replay target preparation outside TorchDynamo with a narrow torch.compiler.disable boundary
  • preserve compiled execution for the original Megatron routing call

Why

Routing replay target preparation mutates Python replay cursors and Megatron RouterReplay state. When this runs inside a compiled router frame, Dynamo guards on replay state that keeps changing, so the guards fail and it recompiles across router and lifecycle calls.
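To illustrate the recompile pattern described above, here is a minimal pure-Python sketch. It is a toy model and not TorchDynamo: `ToyCompiler`, `ReplayCursor`, `prepare_target`, and `run` are hypothetical names invented for this example, and the "guard" is just a cache key built from whatever Python state the frame is assumed to read. Real Dynamo guard behavior is more involved; the sketch only shows why mutating state inside the guarded frame invalidates the cache on every call, while preparing the target outside it does not.

```python
from dataclasses import dataclass


@dataclass
class ReplayCursor:
    """Hypothetical stand-in for mutable routing replay state."""
    step: int = 0


class ToyCompiler:
    """Toy guard-keyed compile cache (a simplified model, not TorchDynamo)."""

    def __init__(self, fn, guard_on):
        self.fn = fn
        self.guard_on = guard_on  # returns the Python state the frame depends on
        self.cache = {}
        self.compiles = 0

    def __call__(self, *args):
        key = self.guard_on()
        if key not in self.cache:
            # Guard miss: the toy "recompiles" and stores a new cache entry.
            self.compiles += 1
            self.cache[key] = self.fn
        return self.cache[key](*args)


def run(prep_inside_frame: bool, calls: int = 4) -> int:
    """Return how many toy compiles happen over `calls` router invocations."""
    cursor = ReplayCursor()

    def prepare_target() -> int:
        # Mutates the replay cursor, like the target prep described in this PR.
        cursor.step += 1
        return cursor.step

    if prep_inside_frame:
        def router(x):
            return x + prepare_target()

        # The frame reads cursor.step, so the guard depends on it.
        compiled = ToyCompiler(router, guard_on=lambda: cursor.step)
        for _ in range(calls):
            compiled(1)
    else:
        def router(x, target):
            return x + target

        # Target prep runs outside the guarded frame; no cursor state is guarded.
        compiled = ToyCompiler(router, guard_on=lambda: None)
        for _ in range(calls):
            compiled(1, prepare_target())

    return compiled.compiles


if __name__ == "__main__":
    print("prep inside frame:", run(True))    # one toy compile per call
    print("prep outside frame:", run(False))  # a single toy compile
```

In this toy, four calls with the prep inside the frame produce four compiles, and four calls with the prep outside produce one. The actual change uses torch.compiler.disable to draw that boundary; the sketch does not reproduce how Dynamo handles the disabled call.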

Validation

  • uv sync --all-extras
  • uv run ruff check src/art/megatron/routing_replay.py
  • uv run ruff format --check src/art/megatron/routing_replay.py
  • uv run ty check src/art/megatron/routing_replay.py
  • commit hooks passed: ruff, ruff format, ty, uv.lock sync check

Additional local investigation: the shared 3-step Qwen3.5 Megatron smoke test that reproduced the SIGSEGV on PR720+721 passed with this boundary applied, and the routing-replay Dynamo signatures disappeared.

@FurtherAI merged commit ebdb726 into main on Jun 8, 2026
5 checks passed