Include CPU ISA hash in Warp kernel cache key #2452
adenzler-nvidia wants to merge 2 commits into newton-physics:main
Warp 1.13+ compiles CPU kernels with -march=native, which emits instructions specific to the compiling CPU. GitHub Actions runners vary in CPU model (Intel Ice Lake, AMD EPYC, etc.), so restoring a kernel cache built on one CPU onto a runner with a different ISA causes illegal-instruction crashes. Add a lightweight Python script that detects the host CPU's ISA feature set (via the system C compiler on x86, /proc/cpuinfo on ARM Linux, sysctl on ARM macOS) and prints a stable hash. Include this hash in the cache key so kernels are only reused on runners with matching instruction sets.
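As a rough illustration of the idea (not the script added by this PR), a feature list can be normalized and reduced to a short, order-independent hash as sketched below. The function names `parse_cpuinfo_flags` and `isa_hash`, the choice of SHA-256, and the reliance on `/proc/cpuinfo`-style text are assumptions made for this sketch; the actual script uses the system C compiler on x86.

```python
import hashlib


def parse_cpuinfo_flags(cpuinfo_text: str) -> list[str]:
    """Return sorted, de-duplicated ISA flags from /proc/cpuinfo-style text.

    x86 Linux lists them under "flags", AArch64 Linux under "Features".
    """
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("flags", "features"):
            flags.update(value.split())
    return sorted(flags)


def isa_hash(features: list[str], length: int = 16) -> str:
    """Hash a feature list into a short hex digest that ignores ordering."""
    canonical = " ".join(sorted(set(features)))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:length]
```

Sorting and de-duplicating before hashing keeps the result stable across cores and across kernels that report flags in a different order, which matters because the hash ends up in a cache key.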
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Marking as draft: waiting on the upstream Warp fix that adds CPU ISA-aware module hashing and load-time feature validation directly in Warp. Once that ships, we should rework this PR to use Warp's own CPU feature detection for the cache key instead of rolling our own detection script.
@shi-eric we won't need this anymore with your upstream changes, right?
In principle we shouldn’t. Can you test a few times by repeatedly triggering a PR that updates Warp? Note that you can’t update to the bleeding-edge nightly due to the issue I mentioned on Slack. You'll have to choose a nightly from a few days ago.
    except FileNotFoundError:
        pass

    # macOS: sysctl exposes CPU features.

🟡 machdep.cpu.features only exists on Intel Macs, so on Apple Silicon runners the sysctl -n machdep.cpu.features call below raises CalledProcessError, _aarch64_features() returns "", and main() falls back to platform.processor() (which is "arm" on Apple Silicon). On macos-latest the cache key ends up not being derived from ISA features at all, so that runner effectively keeps the pre-change behavior and one of the four CI targets does not benefit from this change.
The fallback itself is safe (no crash, stable hash), but the module docstring on line 15 advertises "AArch64 macOS (via sysctl)" support that is not actually working on Apple Silicon.
On Apple Silicon, sysctl hw.optional enumerates per-feature flags such as hw.optional.neon and hw.optional.armv8_2_sha3. Parsing and sorting those keys gives a real ISA fingerprint.
Example macOS ARM branch

    # macOS Apple Silicon: hw.optional.* enumerates ISA features.
    try:
        out = subprocess.check_output(
            ["sysctl", "-a"],
            text=True,
            stderr=subprocess.DEVNULL,
        )
        features = sorted(
            line.split(":", 1)[0].strip()
            for line in out.splitlines()
            if line.startswith("hw.optional.") and line.rstrip().endswith(": 1")
        )
        if features:
            return " ".join(features)
    except (FileNotFoundError, subprocess.CalledProcessError):
        pass

Worth updating the docstring once the macOS ARM path actually contributes ISA features to the hash.
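The filtering step in the suggested branch can be checked without an Apple Silicon machine by running it against captured `sysctl` text. The sketch below is a standalone variant of that filter; the helper name `hw_optional_features` and the sample text are made up for illustration (only `hw.optional.neon` and `hw.optional.armv8_2_sha3` come from the comment above).

```python
def hw_optional_features(sysctl_output: str) -> list[str]:
    """Return the sorted hw.optional.* keys whose reported value is 1."""
    features = []
    for line in sysctl_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.startswith("hw.optional.") and value.strip() == "1":
            features.append(key.strip())
    return sorted(features)


# Illustrative input only; "example_absent" is a hypothetical key.
SAMPLE = """\
hw.ncpu: 8
hw.optional.neon: 1
hw.optional.armv8_2_sha3: 1
hw.optional.example_absent: 0
"""

print(" ".join(hw_optional_features(SAMPLE)))
```

Keeping the parsing separate from the `subprocess` call makes the feature extraction testable on any CI runner, including the Linux and Windows ones.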
Summary
- Warp 1.13+ compiles CPU kernels with -march=native, which emits instructions specific to the compiling CPU. GitHub Actions runners vary in CPU model (Intel Ice Lake, AMD EPYC, etc.), so restoring a kernel cache built on one CPU onto a runner with a different ISA causes illegal-instruction crashes.
- Add scripts/ci/cpu_isa_hash.py that detects the host CPU's ISA feature set and prints a stable 16-char hex hash. Include this hash in the CI cache key so kernels are only reused on runners with matching instruction sets.
- Supports x86 (via the system C compiler), AArch64 Linux (via /proc/cpuinfo), and AArch64 macOS (via sysctl).

Context
After the Warp 1.13 dev nightly bump (#2427), CI started hitting Fatal Python error: Illegal instruction / 0xc000001d on both Windows and Ubuntu runners.

The root cause: Warp 1.13 added a cpu_compiler_flags option that defaults to -march=native, causing its bundled LLVM to emit CPU-specific instructions. When the GH Actions cache restores kernel objects compiled on e.g. an Intel Ice Lake runner (with AVX-512) onto an AMD EPYC runner (without AVX-512), the process crashes on the first kernel invocation.

The previous cache key (warp-kernels-OS-ARCH-<code-hash>) did not account for CPU differences, and the restore-keys prefix fallback made cross-CPU cache reuse likely.
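
The effect of the prefix fallback can be modelled in a few lines. This is a sketch under assumptions: the helper names are hypothetical, the key layout (ISA hash placed right after the architecture, code hash last) is inferred from the example key in the test plan, and whether the workflow's restore-keys prefix also pins the ISA hash is assumed rather than confirmed here. The second ISA hash and the code hash are made-up placeholder values.

```python
def cache_key(os_name: str, arch: str, isa_hash: str, code_hash: str) -> str:
    """Full key: an exact hit needs matching OS, arch, ISA hash and code hash."""
    return f"warp-kernels-{os_name}-{arch}-{isa_hash}-{code_hash}"


def restore_prefix(os_name: str, arch: str, isa_hash: str) -> str:
    """Fallback prefix that still pins the ISA hash (assumed layout)."""
    return f"warp-kernels-{os_name}-{arch}-{isa_hash}-"


def may_restore(saved_key: str, prefix: str) -> bool:
    """Model of prefix matching as performed by a restore-keys fallback."""
    return saved_key.startswith(prefix)


# Cache saved on one runner; the code hash is a placeholder.
saved = cache_key("Linux", "X64", "0723c9b174ec6c08", "codehash")

# A runner with a different (made-up) ISA hash no longer matches...
print(may_restore(saved, restore_prefix("Linux", "X64", "ffffffffffffffff")))
# ...whereas the old OS/arch-only prefix would have matched any CPU.
print(may_restore(saved, "warp-kernels-Linux-X64-"))
```

Placing the ISA hash before the code hash means a stale-code fallback can still reuse kernels from the same CPU family, while a different CPU always starts from a cold cache.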

Test plan
- cpu-id step runs and prints a hash on all four matrix runners (ubuntu-latest, ubuntu-24.04-arm, windows-latest, macos-latest)
- Cache key includes the CPU ISA hash (e.g. warp-kernels-Linux-X64-0723c9b174ec6c08-...)