[autotune] Add observed heuristic seeds#2378

Draft
choijon5 wants to merge 2 commits into main from choijon5/observed-autotune-heuristics

@choijon5
Contributor

No description provided.

@meta-cla meta-cla Bot added the CLA Signed label May 10, 2026
Extends observed-heuristics with:

1. Three quantized-matmul kernel classes: matmul_int4, matmul_int16,
   matmul_fp4. classify_runtime_kernel now takes an optional kernel
   arg to fingerprint int4 vs fp4 from kernel source (their tensor
   shapes and dtypes are identical). Quantized kernels lack the
   "matmul" workload trait (manual outer-product + sum instead of
   hl.dot), so a secondary classification branch triggers on
   {"reduction","sum_reduction"} + packed-int8 signature.

2. A top-level `fallbacks` block in observed_heuristics_b200.json,
   consulted when the exact-bucket rule lookup misses. The shape group
   for each family is returned by _fallback_group_for_class:
   - matmul: small_m / small_n / small_k / balanced / rect
   - row_*: short / narrow / wide / square
   - elementwise: tiny / mid / huge
   - attention: short_seq / long_seq / small_head / mid_seq

3. JSON now has 30 rules (9 existing attention/matmul/row_softmax
   + 21 new quantized-GEMM) and 15 fallback entries (5 groups * 3
   quantized kernels). Quantized rules are keyed on the matmul
   bucketing scheme already used for regular matmul, with the
   dtype slot separating the kernels (int4/fp4/int16 weight
   operand all produce arg0_dtype=bf16 -> dtype_family=fp16_bf16).
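
The secondary classification branch from item 1 can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function name `classify_sketch`, the `packed_int8` flag, and the marker strings searched for in the kernel source are all hypothetical, and the sketch covers only the int4 vs fp4 disambiguation (matmul_int16 is left out because the description does not say how it is detected).

```python
from typing import Optional, Set

# Hypothetical marker substrings; the real fingerprint used by the PR is not shown.
FP4_MARKERS = ("fp4",)
INT4_MARKERS = ("int4",)


def classify_sketch(
    traits: Set[str],
    packed_int8: bool,
    kernel_source: Optional[str] = None,
) -> Optional[str]:
    """Toy classifier: primary branch on the "matmul" trait, secondary on reduction traits."""
    # Primary branch: regular matmul kernels carry the "matmul" workload trait.
    if "matmul" in traits:
        return "matmul"

    # Secondary branch: quantized kernels use a manual outer-product + sum, so they
    # show up as reductions with a packed-int8 operand instead.
    if {"reduction", "sum_reduction"} <= traits and packed_int8:
        source = (kernel_source or "").lower()
        # Shapes and dtypes are identical for int4 and fp4, so only the source
        # text can tell them apart (assumed substring match).
        if any(marker in source for marker in FP4_MARKERS):
            return "matmul_fp4"
        if any(marker in source for marker in INT4_MARKERS):
            return "matmul_int4"

    # No source given, or nothing matched: leave the kernel unclassified.
    return None
```

Returning `None` when no kernel source is supplied keeps the extra argument optional, which matches the description of `kernel` as an optional arg.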

Pre-existing attention/matmul/row_softmax rules resolve identically.
test_observed_heuristics.py: all 4 tests pass.
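
One way to see why the pre-existing rules are unaffected: if exact-bucket rules are consulted first and the fallbacks only on a miss, any bucket that already had a rule resolves exactly as before. The sketch below assumes that ordering. The table layout, bucket keys, seed values, and the thresholds in `matmul_group` are hypothetical placeholders, not the contents of observed_heuristics_b200.json or the logic of _fallback_group_for_class.

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-ins for the JSON contents; keys and values are placeholders.
RULES: Dict[Tuple[str, str], dict] = {
    ("matmul", "bucket_a"): {"seed": "exact-matmul-bucket_a"},
}
FALLBACKS: Dict[str, Dict[str, dict]] = {
    "matmul_int4": {"small_m": {"seed": "fallback-int4-small_m"}},
}


def matmul_group(m: int, n: int, k: int, small: int = 64) -> str:
    """Map a matmul shape to one of the five group names (thresholds are assumed)."""
    if m <= small:
        return "small_m"
    if n <= small:
        return "small_n"
    if k <= small:
        return "small_k"
    if max(m, n, k) <= 2 * min(m, n, k):
        return "balanced"
    return "rect"


def lookup(kernel_class: str, bucket: str, shape: Tuple[int, int, int]) -> Optional[dict]:
    """Exact-bucket rule first; shape-group fallback only when that misses."""
    exact = RULES.get((kernel_class, bucket))
    if exact is not None:
        return exact
    group = matmul_group(*shape)
    return FALLBACKS.get(kernel_class, {}).get(group)
```

For example, `lookup("matmul", "bucket_a", (1024, 1024, 1024))` returns the exact rule without touching the fallback table, while `lookup("matmul_int4", "unseen", (16, 4096, 4096))` falls through to the `small_m` entry.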

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@choijon5 choijon5 force-pushed the choijon5/observed-autotune-heuristics branch from 871d0a3 to bf16ebc Compare May 11, 2026 14:46