[LV] Update remaining tests to use VPlan cost output (NFC). by fhahn · Pull Request #190038 · llvm/llvm-project

fhahn · 2026-04-01T19:59:01Z

Move remaining tests checking legacy cost output to check the VPlan's cost model output.

In some cases, checks become much more compact (checking a single interleave group cost instead of each individual member, all of which have the group's cost). In other cases, auto-generation consistently checks all relevant VFs.
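
For readers comparing the two debug formats touched by this change, here is a minimal sketch that parses both the legacy per-instruction line and the VPlan per-recipe line. The line shapes are copied from the CHECK lines in the diff; the regular expressions and the `parse_cost_line` helper are illustrative assumptions and not part of LLVM.

```python
import re

# Legacy per-instruction debug line, shaped like the removed CHECK lines.
LEGACY = re.compile(
    r"LV: Found an estimated cost of (?P<cost>\d+) for VF (?P<vf>.+?) "
    r"For instruction:\s*(?P<what>.*)"
)
# VPlan per-recipe debug line, shaped like the added CHECK lines.
VPLAN = re.compile(r"Cost of (?P<cost>\d+) for VF (?P<vf>[^:]+): (?P<what>.*)")


def parse_cost_line(line):
    """Return (kind, cost, vf, text) for a cost line, or None if it is neither."""
    for kind, pattern in (("legacy", LEGACY), ("vplan", VPLAN)):
        m = pattern.search(line)
        if m:
            return kind, int(m["cost"]), m["vf"].strip(), m["what"].strip()
    return None


legacy_line = (
    "LV: Found an estimated cost of 5 for VF vscale x 4 "
    "For instruction: %div = udiv i8 %iv.trunc, 3"
)
vplan_line = "Cost of 1 for VF 2: WIDEN-CAST ir<%conv> = zext ir<%0> to i16"

print(parse_cost_line(legacy_line))
print(parse_cost_line(vplan_line))
```

The legacy line names the original IR instruction, while the VPlan line names a recipe (for example `WIDEN-CAST` or `INTERLEAVE-GROUP`), which is why one VPlan check can stand in for several legacy per-member checks.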

llvmbot · 2026-04-01T19:59:34Z

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-backend-risc-v

Author: Florian Hahn (fhahn)

Changes

Move remaining tests checking legacy cost output to check the VPlan's cost model output.

In some cases, checks become much more compact (checking a single interleave group cost instead of each individual member, all of which have the group's cost). In other cases, auto-generation consistently checks all relevant VFs.

Patch is 1.10 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/190038.diff

115 Files Affected:

(modified) llvm/test/Transforms/LoopVectorize/AArch64/binop-costs.ll (-6)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/type-shrinkage-zext-costs.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/RISCV/reg-usage.ll (+16-12)
(modified) llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-reduction-cost.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/WebAssembly/int-mac-reduction-costs.ll (+62-62)
(modified) llvm/test/Transforms/LoopVectorize/WebAssembly/memory-interleave.ll (+383-450)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/handle-iptr-with-data-layout-to-not-assert.ll (+4-4)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f32-stride-2.ll (+21-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f32-stride-3.ll (+21-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f32-stride-4.ll (+20-21)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f32-stride-5.ll (+20-122)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f32-stride-6.ll (+20-22)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f32-stride-7.ll (+19-165)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f32-stride-8.ll (+19-173)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f64-stride-2.ll (+19-53)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f64-stride-3.ll (+19-72)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f64-stride-4.ll (+18-85)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f64-stride-5.ll (+18-112)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f64-stride-6.ll (+18-135)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f64-stride-7.ll (+18-151)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f64-stride-8.ll (+17-157)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-half.ll (+7-3)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i16-stride-2.ll (+27-32)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i16-stride-3.ll (+27-32)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i16-stride-4.ll (+27-32)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i16-stride-5.ll (+27-176)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i16-stride-6.ll (+27-32)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i16-stride-7.ll (+27-248)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i16-stride-8.ll (+27-252)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i32-stride-2-indices-0u.ll (+21-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i32-stride-2.ll (+21-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i32-stride-3-indices-01u.ll (+21-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i32-stride-3-indices-0uu.ll (+21-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i32-stride-3.ll (+21-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i32-stride-4-indices-012u.ll (+20-24)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i32-stride-4-indices-01uu.ll (+20-53)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i32-stride-4-indices-0uuu.ll (+20-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i32-stride-4.ll (+20-21)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i32-stride-5.ll (+20-122)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i32-stride-6.ll (+20-22)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i32-stride-7.ll (+19-165)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i32-stride-8.ll (+19-173)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i64-stride-2.ll (+19-53)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i64-stride-3.ll (+19-72)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i64-stride-4.ll (+18-85)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i64-stride-5.ll (+18-102)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i64-stride-6.ll (+18-123)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i64-stride-7.ll (+18-151)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i64-stride-8.ll (+17-157)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i8-stride-2.ll (+27-32)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i8-stride-3.ll (+27-32)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i8-stride-4.ll (+27-32)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i8-stride-5.ll (+27-176)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i8-stride-6.ll (+27-212)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i8-stride-7.ll (+27-248)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i8-stride-8.ll (+27-284)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-f32-stride-2.ll (+21-29)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-f32-stride-3.ll (+21-29)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-f32-stride-4.ll (+21-29)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-f32-stride-5.ll (+21-26)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-f32-stride-6.ll (+21-26)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-f32-stride-7.ll (+21-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-f32-stride-8.ll (+21-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-f64-stride-2.ll (+21-29)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-f64-stride-3.ll (+21-26)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-f64-stride-4.ll (+21-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-f64-stride-5.ll (+21-22)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-f64-stride-6.ll (+21-22)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-f64-stride-7.ll (+21-22)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-f64-stride-8.ll (+33-156)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i16-stride-2.ll (+27-37)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i16-stride-3.ll (+27-37)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i16-stride-4.ll (+27-37)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i16-stride-5.ll (+27-37)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i16-stride-6.ll (+27-37)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i16-stride-7.ll (+27-37)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i16-stride-8.ll (+27-37)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i32-stride-2.ll (+21-29)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i32-stride-3.ll (+21-29)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i32-stride-4.ll (+21-29)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i32-stride-5.ll (+21-26)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i32-stride-6.ll (+21-26)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i32-stride-7.ll (+21-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i32-stride-8.ll (+21-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i64-stride-2.ll (+21-29)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i64-stride-3.ll (+21-26)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i64-stride-4.ll (+21-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i64-stride-5.ll (+21-22)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i64-stride-6.ll (+21-22)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i64-stride-7.ll (+21-22)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i64-stride-8.ll (+33-156)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i8-stride-2.ll (+27-37)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i8-stride-3.ll (+27-37)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i8-stride-4.ll (+27-37)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i8-stride-5.ll (+27-37)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i8-stride-6.ll (+27-37)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i8-stride-7.ll (+27-37)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i8-stride-8.ll (+27-37)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/masked-interleaved-load-i16.ll (+33-49)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/masked-interleaved-store-i16.ll (+34-51)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/masked-scatter-i32-with-i8-index.ll (+25-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/masked-scatter-i64-with-i8-index.ll (+26-26)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/masked-store-i16.ll (+21-21)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/masked-store-i32.ll (+25-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/masked-store-i64.ll (+25-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/masked-store-i8.ll (+25-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/scatter-i16-with-i8-index.ll (+25-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/scatter-i32-with-i8-index.ll (+25-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/scatter-i64-with-i8-index.ll (+25-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/scatter-i8-with-i8-index.ll (+25-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/strided-load-i16.ll (+20-20)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/strided-load-i32.ll (+16-16)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/strided-load-i64.ll (+9-9)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/strided-load-i8.ll (+24-24)
(modified) llvm/test/Transforms/LoopVectorize/X86/CostModel/vpinstruction-cost.ll (+20-5)

diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/binop-costs.ll b/llvm/test/Transforms/LoopVectorize/AArch64/binop-costs.ll
index ff1dee41e62bf..8e4c6d470c9be 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/binop-costs.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/binop-costs.ll
@@ -17,12 +17,6 @@ define void @udiv_rhs_opt_cost(ptr %dst) #0 {
 ; CHECK:  Cost of 0 for VF vscale x 2: IR %div = udiv i8 %iv.trunc, 3
 ; CHECK:  Cost of 5 for VF vscale x 4: CLONE ir<%div> = udiv vp<[[VP7]]>, ir<3>
 ; CHECK:  Cost of 0 for VF vscale x 4: IR %div = udiv i8 %iv.trunc, 3
-; CHECK:  LV: Found an estimated cost of 5 for VF 1 For instruction: %div = udiv i8 %iv.trunc, 3
-; CHECK:  LV: Found an estimated cost of 5 for VF 2 For instruction: %div = udiv i8 %iv.trunc, 3
-; CHECK:  LV: Found an estimated cost of 5 for VF 4 For instruction: %div = udiv i8 %iv.trunc, 3
-; CHECK:  LV: Found an estimated cost of 5 for VF vscale x 1 For instruction: %div = udiv i8 %iv.trunc, 3
-; CHECK:  LV: Found an estimated cost of 5 for VF vscale x 2 For instruction: %div = udiv i8 %iv.trunc, 3
-; CHECK:  LV: Found an estimated cost of 5 for VF vscale x 4 For instruction: %div = udiv i8 %iv.trunc, 3
 ;
 entry:
   br label %loop
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/type-shrinkage-zext-costs.ll b/llvm/test/Transforms/LoopVectorize/AArch64/type-shrinkage-zext-costs.ll
index a44a16455445c..bcd1c28318450 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/type-shrinkage-zext-costs.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/type-shrinkage-zext-costs.ll
@@ -8,6 +8,7 @@ target triple = "aarch64-unknown-linux-gnu"
 
 define void @zext_i8_i16(ptr noalias nocapture readonly %p, ptr noalias nocapture %q, i32 %len) #0 {
 ; CHECK-COST-LABEL: LV: Checking a loop in 'zext_i8_i16'
+; CHECK-COST: LV: Found an estimated cost of 0 for VF 1 For instruction:   %conv = zext i8 %0 to i32
 ; CHECK-COST: Cost of 1 for VF 2: WIDEN-CAST ir<%conv> = zext ir<%0> to i16
 ; CHECK-COST: Cost of 1 for VF 4: WIDEN-CAST ir<%conv> = zext ir<%0> to i16
 ; CHECK-COST: Cost of 1 for VF 8: WIDEN-CAST ir<%conv> = zext ir<%0> to i16
@@ -16,7 +17,6 @@ define void @zext_i8_i16(ptr noalias nocapture readonly %p, ptr noalias nocaptur
 ; CHECK-COST: Cost of 1 for VF vscale x 2: WIDEN-CAST ir<%conv> = zext ir<%0> to i16
 ; CHECK-COST: Cost of 1 for VF vscale x 4: WIDEN-CAST ir<%conv> = zext ir<%0> to i16
 ; CHECK-COST: Cost of 0 for VF vscale x 8: WIDEN-CAST ir<%conv> = zext ir<%0> to i16
-; CHECK-COST: LV: Found an estimated cost of 0 for VF 1 For instruction:   %conv = zext i8 %0 to i32
 ; CHECK-LABEL: define void @zext_i8_i16
 ; CHECK-SAME: (ptr noalias readonly captures(none) [[P:%.*]], ptr noalias captures(none) [[Q:%.*]], i32 [[LEN:%.*]]) #[[ATTR0:[0-9]+]] {
 ; CHECK-NEXT:  entry:
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage.ll b/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage.ll
index 99139da67bb78..b0738cad80064 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage.ll
@@ -84,22 +84,26 @@ define void @goo(ptr nocapture noundef %a, i32 noundef signext %n) {
 ; CHECK-SCALAR:      LV(REG): VF = 1
 ; CHECK-SCALAR-NEXT: LV(REG): Found max usage: 1 item
 ; CHECK-SCALAR-NEXT: LV(REG): RegisterClass: RISCV::GPRRC, 3 registers
-; CHECK-LMUL1:       LV(REG): VF = vscale x 2
+; CHECK-LMUL1-LABEL: goo
+; CHECK-LMUL1:       LV(REG): VF = vscale x 1
 ; CHECK-LMUL1-NEXT:  LV(REG): Found max usage: 2 item
-; CHECK-LMUL1-NEXT:  LV(REG): RegisterClass: RISCV::GPRRC, 6 registers
-; CHECK-LMUL1-NEXT:  LV(REG): RegisterClass: RISCV::VRRC, 2 registers
-; CHECK-LMUL2:       LV(REG): VF = vscale x 4
+; CHECK-LMUL1-NEXT:  LV(REG): RegisterClass: RISCV::GPRRC, 5 registers
+; CHECK-LMUL1-NEXT:  LV(REG): RegisterClass: RISCV::VRRC, 1 registers
+; CHECK-LMUL2-LABEL: goo
+; CHECK-LMUL2:       LV(REG): VF = vscale x 2
 ; CHECK-LMUL2-NEXT:  LV(REG): Found max usage: 2 item
-; CHECK-LMUL2-NEXT:  LV(REG): RegisterClass: RISCV::GPRRC, 6 registers
-; CHECK-LMUL2-NEXT:  LV(REG): RegisterClass: RISCV::VRRC, 4 registers
-; CHECK-LMUL4:       LV(REG): VF = vscale x 8
+; CHECK-LMUL2-NEXT:  LV(REG): RegisterClass: RISCV::GPRRC, 5 registers
+; CHECK-LMUL2-NEXT:  LV(REG): RegisterClass: RISCV::VRRC, 2 registers
+; CHECK-LMUL4-LABEL: goo
+; CHECK-LMUL4:       LV(REG): VF = vscale x 4
 ; CHECK-LMUL4-NEXT:  LV(REG): Found max usage: 2 item
-; CHECK-LMUL4-NEXT:  LV(REG): RegisterClass: RISCV::GPRRC, 6 registers
-; CHECK-LMUL4-NEXT:  LV(REG): RegisterClass: RISCV::VRRC, 8 registers
-; CHECK-LMUL8:       LV(REG): VF = vscale x 16
+; CHECK-LMUL4-NEXT:  LV(REG): RegisterClass: RISCV::GPRRC, 5 registers
+; CHECK-LMUL4-NEXT:  LV(REG): RegisterClass: RISCV::VRRC, 4 registers
+; CHECK-LMUL8-LABEL: goo
+; CHECK-LMUL8:       LV(REG): VF = vscale x 8
 ; CHECK-LMUL8-NEXT:  LV(REG): Found max usage: 2 item
-; CHECK-LMUL8-NEXT:  LV(REG): RegisterClass: RISCV::GPRRC, 6 registers
-; CHECK-LMUL8-NEXT:  LV(REG): RegisterClass: RISCV::VRRC, 16 registers
+; CHECK-LMUL8-NEXT:  LV(REG): RegisterClass: RISCV::GPRRC, 5 registers
+; CHECK-LMUL8-NEXT:  LV(REG): RegisterClass: RISCV::VRRC, 8 registers
 entry:
   %cmp3 = icmp sgt i32 %n, 0
   br i1 %cmp3, label %for.body.preheader, label %for.cond.cleanup
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-reduction-cost.ll b/llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-reduction-cost.ll
index 10d83f4ad125e..fe39700d1787c 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-reduction-cost.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-reduction-cost.ll
@@ -3,8 +3,8 @@
 ; RUN: -prefer-predicate-over-epilogue=predicate-else-scalar-epilogue \
 ; RUN: -mtriple=riscv64 -mattr=+v -S < %s 2>&1 | FileCheck %s
 
+; CHECK: Cost of 0 for VF vscale x 4: WIDEN-REDUCTION-PHI ir<%rdx> = phi
 ; CHECK: Cost of 2 for VF vscale x 4: WIDEN-INTRINSIC vp<%{{.+}}> = call llvm.vp.merge(ir<true>, ir<%add>, ir<%rdx>, vp<%{{.+}}>)
-; CHECK: LV: Found an estimated cost of 2 for VF vscale x 4 For instruction:   %rdx = phi i32 [ %start, %entry ], [ %add, %loop ]
 
 define i32 @add(ptr %a, i64 %n, i32 %start) {
 entry:
diff --git a/llvm/test/Transforms/LoopVectorize/WebAssembly/int-mac-reduction-costs.ll b/llvm/test/Transforms/LoopVectorize/WebAssembly/int-mac-reduction-costs.ll
index d23c2272d9c0d..9f824d1a963eb 100644
--- a/llvm/test/Transforms/LoopVectorize/WebAssembly/int-mac-reduction-costs.ll
+++ b/llvm/test/Transforms/LoopVectorize/WebAssembly/int-mac-reduction-costs.ll
@@ -11,17 +11,17 @@ define hidden i32 @i32_mac_s8(ptr nocapture noundef readonly %a, ptr nocapture n
 ; CHECK: LV: Found an estimated cost of 0 for VF 1 For instruction:   %conv2 = sext i8 %1 to i32
 ; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction:   %mul = mul nsw i32 %conv2, %conv
 
-; CHECK: LV: Found an estimated cost of 3 for VF 2 For instruction:   %0 = load i8, ptr %arrayidx, align 1
-; CHECK: LV: Found an estimated cost of 0 for VF 2 For instruction:   %conv = sext i8 %0 to i32
-; CHECK: LV: Found an estimated cost of 3 for VF 2 For instruction:   %1 = load i8, ptr %arrayidx1, align 1
-; CHECK: LV: Found an estimated cost of 0 for VF 2 For instruction:   %conv2 = sext i8 %1 to i32
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction:   %mul = mul nsw i32 %conv2, %conv
-
-; CHECK: LV: Found an estimated cost of 2 for VF 4 For instruction:   %0 = load i8, ptr %arrayidx, align 1
-; CHECK: LV: Found an estimated cost of 1 for VF 4 For instruction:   %conv = sext i8 %0 to i32
-; CHECK: LV: Found an estimated cost of 2 for VF 4 For instruction:   %1 = load i8, ptr %arrayidx1, align 1
-; CHECK: LV: Found an estimated cost of 1 for VF 4 For instruction:   %conv2 = sext i8 %1 to i32
-; CHECK: LV: Found an estimated cost of 1 for VF 4 For instruction:   %mul = mul nsw i32 %conv2, %conv
+; CHECK: Cost of 3 for VF 2: WIDEN ir<%0> = load
+; CHECK: Cost of 0 for VF 2: WIDEN-CAST ir<%conv> = sext ir<%0> to i32
+; CHECK: Cost of 3 for VF 2: WIDEN ir<%1> = load
+; CHECK: Cost of 0 for VF 2: WIDEN-CAST ir<%conv2> = sext ir<%1> to i32
+; CHECK: Cost of 1 for VF 2: WIDEN ir<%mul> = mul nsw ir<%conv2>, ir<%conv>
+
+; CHECK: Cost of 2 for VF 4: WIDEN ir<%0> = load
+; CHECK: Cost of 1 for VF 4: WIDEN-CAST ir<%conv> = sext ir<%0> to i32
+; CHECK: Cost of 2 for VF 4: WIDEN ir<%1> = load
+; CHECK: Cost of 1 for VF 4: WIDEN-CAST ir<%conv2> = sext ir<%1> to i32
+; CHECK: Cost of 1 for VF 4: WIDEN ir<%mul> = mul nsw ir<%conv2>, ir<%conv>
 ; CHECK: LV: Selecting VF: 4.
 entry:
   %cmp7.not = icmp eq i32 %N, 0
@@ -55,17 +55,17 @@ define hidden i32 @i32_mac_s16(ptr nocapture noundef readonly %a, ptr nocapture
 ; CHECK: LV: Found an estimated cost of 0 for VF 1 For instruction:   %conv2 = sext i16 %1 to i32
 ; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction:   %mul = mul nsw i32 %conv2, %conv
 
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction:   %0 = load i16, ptr %arrayidx, align 2
-; CHECK: LV: Found an estimated cost of 0 for VF 2 For instruction:   %conv = sext i16 %0 to i32
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction:   %1 = load i16, ptr %arrayidx1, align 2
-; CHECK: LV: Found an estimated cost of 0 for VF 2 For instruction:   %conv2 = sext i16 %1 to i32
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction:   %mul = mul nsw i32 %conv2, %conv
-
-; CHECK: LV: Found an estimated cost of 2 for VF 4 For instruction:   %0 = load i16, ptr %arrayidx, align 2
-; CHECK: LV: Found an estimated cost of 0 for VF 4 For instruction:   %conv = sext i16 %0 to i32
-; CHECK: LV: Found an estimated cost of 2 for VF 4 For instruction:   %1 = load i16, ptr %arrayidx1, align 2
-; CHECK: LV: Found an estimated cost of 0 for VF 4 For instruction:   %conv2 = sext i16 %1 to i32
-; CHECK: LV: Found an estimated cost of 1 for VF 4 For instruction:   %mul = mul nsw i32 %conv2, %conv
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%0> = load
+; CHECK: Cost of 0 for VF 2: WIDEN-CAST ir<%conv> = sext ir<%0> to i32
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%1> = load
+; CHECK: Cost of 0 for VF 2: WIDEN-CAST ir<%conv2> = sext ir<%1> to i32
+; CHECK: Cost of 1 for VF 2: WIDEN ir<%mul> = mul nsw ir<%conv2>, ir<%conv>
+
+; CHECK: Cost of 2 for VF 4: WIDEN ir<%0> = load
+; CHECK: Cost of 0 for VF 4: WIDEN-CAST ir<%conv> = sext ir<%0> to i32
+; CHECK: Cost of 2 for VF 4: WIDEN ir<%1> = load
+; CHECK: Cost of 0 for VF 4: WIDEN-CAST ir<%conv2> = sext ir<%1> to i32
+; CHECK: Cost of 1 for VF 4: WIDEN ir<%mul> = mul nsw ir<%conv2>, ir<%conv>
 ; CHECK: LV: Selecting VF: 4.
 entry:
   %cmp7.not = icmp eq i32 %N, 0
@@ -99,11 +99,11 @@ define hidden i64 @i64_mac_s16(ptr nocapture noundef readonly %a, ptr nocapture
 ; CHECK: LV: Found an estimated cost of 0 for VF 1 For instruction:   %conv2 = sext i16 %1 to i64
 ; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction:   %mul = mul nsw i64 %conv2, %conv
 
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction:   %0 = load i16, ptr %arrayidx, align 2
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction:   %conv = sext i16 %0 to i64
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction:   %1 = load i16, ptr %arrayidx1, align 2
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction:   %conv2 = sext i16 %1 to i64
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction:   %mul = mul nsw i64 %conv2, %conv
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%0> = load
+; CHECK: Cost of 1 for VF 2: WIDEN-CAST ir<%conv> = sext ir<%0> to i64
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%1> = load
+; CHECK: Cost of 1 for VF 2: WIDEN-CAST ir<%conv2> = sext ir<%1> to i64
+; CHECK: Cost of 1 for VF 2: WIDEN ir<%mul> = mul nsw ir<%conv2>, ir<%conv>
 ; CHECK: LV: Selecting VF: 2.
 entry:
   %cmp7.not = icmp eq i32 %N, 0
@@ -136,10 +136,10 @@ define hidden i64 @i64_mac_s32(ptr nocapture noundef readonly %a, ptr nocapture
 ; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction:   %mul = mul i32 %1, %0
 ; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction:   %conv = sext i32 %mul to i64
 
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction:   %0 = load i32, ptr %arrayidx, align 4
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction:   %1 = load i32, ptr %arrayidx1, align 4
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction:   %mul = mul i32 %1, %0
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction:   %conv = sext i32 %mul to i64
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%0> = load
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%1> = load
+; CHECK: Cost of 1 for VF 2: WIDEN ir<%mul> = mul ir<%1>, ir<%0>
+; CHECK: Cost of 1 for VF 2: WIDEN-CAST ir<%conv> = sext ir<%mul> to i64
 ; CHECK: LV: Selecting VF: 2.
 entry:
   %cmp6.not = icmp eq i32 %N, 0
@@ -172,17 +172,17 @@ define hidden i32 @i32_mac_u8(ptr nocapture noundef readonly %a, ptr nocapture n
 ; CHECK: LV: Found an estimated cost of 0 for VF 1 For instruction:   %conv2 = zext i8 %1 to i32
 ; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction:   %mul = mul nuw nsw i32 %conv2, %conv
 
-; CHECK: LV: Found an estimated cost of 3 for VF 2 For instruction:   %0 = load i8, ptr %arrayidx, align 1
-; CHECK: LV: Found an estimated cost of 0 for VF 2 For instruction:   %conv = zext i8 %0 to i32
-; CHECK: LV: Found an estimated cost of 3 for VF 2 For instruction:   %1 = load i8, ptr %arrayidx1, align 1
-; CHECK: LV: Found an estimated cost of 0 for VF 2 For instruction:   %conv2 = zext i8 %1 to i32
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction:   %mul = mul nuw nsw i32 %conv2, %conv
-
-; CHECK: LV: Found an estimated cost of 2 for VF 4 For instruction:   %0 = load i8, ptr %arrayidx, align 1
-; CHECK: LV: Found an estimated cost of 1 for VF 4 For instruction:   %conv = zext i8 %0 to i32
-; CHECK: LV: Found an estimated cost of 2 for VF 4 For instruction:   %1 = load i8, ptr %arrayidx1, align 1
-; CHECK: LV: Found an estimated cost of 1 for VF 4 For instruction:   %conv2 = zext i8 %1 to i32
-; CHECK: LV: Found an estimated cost of 1 for VF 4 For instruction:   %mul = mul nuw nsw i32 %conv2, %conv
+; CHECK: Cost of 3 for VF 2: WIDEN ir<%0> = load
+; CHECK: Cost of 0 for VF 2: WIDEN-CAST ir<%conv> = zext ir<%0> to i32
+; CHECK: Cost of 3 for VF 2: WIDEN ir<%1> = load
+; CHECK: Cost of 0 for VF 2: WIDEN-CAST ir<%conv2> = zext ir<%1> to i32
+; CHECK: Cost of 1 for VF 2: WIDEN ir<%mul> = mul nuw nsw ir<%conv2>, ir<%conv>
+
+; CHECK: Cost of 2 for VF 4: WIDEN ir<%0> = load
+; CHECK: Cost of 1 for VF 4: WIDEN-CAST ir<%conv> = zext ir<%0> to i32
+; CHECK: Cost of 2 for VF 4: WIDEN ir<%1> = load
+; CHECK: Cost of 1 for VF 4: WIDEN-CAST ir<%conv2> = zext ir<%1> to i32
+; CHECK: Cost of 1 for VF 4: WIDEN ir<%mul> = mul nuw nsw ir<%conv2>, ir<%conv>
 ; CHECK: LV: Selecting VF: 4.
 entry:
   %cmp7.not = icmp eq i32 %N, 0
@@ -216,17 +216,17 @@ define hidden i32 @i32_mac_u16(ptr nocapture noundef readonly %a, ptr nocapture
 ; CHECK: LV: Found an estimated cost of 0 for VF 1 For instruction:   %conv2 = zext i16 %1 to i32
 ; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction:   %mul = mul nuw nsw i32 %conv2, %conv
 
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction:   %0 = load i16, ptr %arrayidx, align 2
-; CHECK: LV: Found an estimated cost of 0 for VF 2 For instruction:   %conv = zext i16 %0 to i32
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction:   %1 = load i16, ptr %arrayidx1, align 2
-; CHECK: LV: Found an estimated cost of 0 for VF 2 For instruction:   %conv2 = zext i16 %1 to i32
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction:   %mul = mul nuw nsw i32 %conv2, %conv
-
-; CHECK: LV: Found an estimated cost of 2 for VF 4 For instruction:   %0 = load i16, ptr %arrayidx, align 2
-; CHECK: LV: Found an estimated cost of 0 for VF 4 For instruction:   %conv = zext i16 %0 to i32
-; CHECK: LV: Found an estimated cost of 2 for VF 4 For instruction:   %1 = load i16, ptr %arrayidx1, align 2
-; CHECK: LV: Found an estimated cost of 0 for VF 4 For instruction:   %conv2 = zext i16 %1 to i32
-; CHECK: LV: Found an estimated cost of 1 for VF 4 For instruction:   %mul = mul nuw nsw i32 %conv2, %conv
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%0> = load
+; CHECK: Cost of 0 for VF 2: WIDEN-CAST ir<%conv> = zext ir<%0> to i32
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%1> = load
+; CHECK: Cost of 0 for VF 2: WIDEN-CAST ir<%conv2> = zext ir<%1> to i32
+; CHECK: Cost of 1 for VF 2: WIDEN ir<%mul> = mul nuw nsw ir<%conv2>, ir<%conv>
+
+; CHECK: Cost of 2 for VF 4: WIDEN ir<%0> = load
+; CHECK: Cost of 0 for VF 4: WIDEN-CAST ir<%conv> = zext ir<%0> to i32
+; CHECK: Cost of 2 for VF 4: WIDEN ir<%1> = load
+; CHECK: Cost of 0 for VF 4: WIDEN-CAST ir<%conv2> = zext ir<%1> to i32
+; CHECK: Cost of 1 for VF 4: WIDEN ir<%mul> = mul nuw nsw ir<%conv2>, ir<%conv>
 ; CHECK: LV: Selecting VF: 4.
 entry:
   %cmp7.not = icmp eq i32 %N, 0
@@ -260,11 +260,11 @@ define hidden i64 @i64_mac_u16(ptr nocapture noundef readonly %a, ptr nocapture
 ; CHECK: LV: Found an estimated cost of 0 for VF 1 For instruction:   %conv2 = zext i16 %1 to i64
 ; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction:   %mul = mul nuw nsw i64 %conv2, %conv
 
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction:   %0 = load i16, ptr %arrayidx, align 2
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction:   %conv = zext i16 %0 to i64
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction:   %1 = load i16, ptr %arrayidx1, align 2
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction:   %conv2 = zext i16 %1 to i64
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction:   %mul = mul nuw nsw i64 %conv2, %conv
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%0> = load
+; CHECK: Cost of 1 for VF 2: WIDEN-CAST ir<%conv> = zext ir<%0> to i64
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%1> = load
+; CHECK: Cost of 1 for VF 2: WIDEN-CAST ir<%conv2> = zext ir<%1> to i64
+; CHECK: Cost of 1 for VF 2: WIDEN ir<%mul> = mul nuw nsw ir<%conv2>, ir<%conv>
 ; CHECK: LV: Selecting VF: 2.
 entry:
   %cmp8.not = icmp eq i32 %N, 0
@@ -297,10 +297,10 @@ define hidden i64 @i64_mac_u32(ptr nocapture noundef readonly %a, ptr nocapture
 ; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction:   %mul = mul i32 %1, %0
 ; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction:   %conv = zext i32 %mul to i64
 
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction:   %0 = load i32, ptr %arrayidx, align 4
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction:   %1 = load i32, ptr %arrayidx1, align 4
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction:   %mul = mul i32 %1, %0
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction:   %conv = zext i32 %mul to i64
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%0> = load
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%1> = load
+; CHECK: Cost of 1 for VF 2: WIDEN ir<%mul> = mul ir<%1>, ir<%0>
+; CHECK: Cost of 1 for VF 2: WIDEN-CAST ir<%conv> = zext ir<%mul> to i64
 ; CHECK: LV: Selecting VF: 2.
 entry:
   %cmp6.not = icmp eq i32 %N, 0
diff --git a/llvm/test/Transforms/LoopVectorize/WebAssembly/memory-interleave.ll b/llvm/test/Transforms/LoopVectorize/WebAssembly/memory-interleave.ll
index 54cbab78b1e29..573863121966c 100644
--- a/llvm/test/Transforms/LoopVectorize/WebAssembly/memory-interleave.ll
+++ b/llvm/test/Transforms/LoopVectorize/WebAssembly/memory-interleave.ll
@@ -19,12 +19,12 @@ target triple = "wasm32-unknown-wasi"
 %struct.FourFloats = type { float, float, float, float }
 
 ; CHECK-LABEL: two_ints_same_op
+; CHECK: LV: Scalar loop costs: 12.
 ; CHECK: Cost of 7 for VF 2: INTERLEAVE-GROUP with factor 2 at %10
+; CHECK: Cost for VF 2: 27 (Estimated cost per lane: 13.5)
 ; CHECK: Cost of 6 for VF 4: INTERLEAVE-GROUP with factor 2 at %10
-; CHECK: LV: Scalar loop costs: 12.
-; CHECK: LV: Vector loop of width 2 costs: 13.
-; CHECK: LV: Vector loop of width 4 costs: 6.
-; CHECK: LV: Selecting VF: 4
+; CHECK: Cost for VF 4: 24 (Estimated cost per lane: 6.0)
+; CHECK: LV: Selecting VF: 4.
 define hidden void @two_ints_same_op(ptr noalias nocapture noundef writeonly %0, ptr nocapture noundef readonly %1, ptr nocapture noundef readonly %2, i32 ...
[truncated]
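
For readers following the `two_ints_same_op` checks above, here is a minimal sketch of the arithmetic behind the new `Cost for VF N: ... (Estimated cost per lane: ...)` lines. The helper names are hypothetical and the selection rule is a simplification (lowest per-lane cost wins, ties go to the smaller VF); the vectorizer's real decision involves more than this. The numbers are copied from the check lines.

```python
from fractions import Fraction
from typing import Dict

# Totals copied from the two_ints_same_op check lines.
SCALAR_LOOP_COST = 12
VECTOR_LOOP_COSTS = {2: 27, 4: 24}  # VF -> total cost reported for that VF


def cost_per_lane(total_cost: int, vf: int) -> Fraction:
    """Total cost of one vector iteration divided by the lanes it covers."""
    return Fraction(total_cost, vf)


def cheapest_vf(scalar_cost: int, vector_costs: Dict[int, int]) -> int:
    """Pick the VF with the lowest per-lane cost (VF 1 is the scalar loop).

    Assumption for illustration only: ties are broken towards the smaller VF.
    """
    candidates = {1: Fraction(scalar_cost)}
    for vf, total in vector_costs.items():
        candidates[vf] = cost_per_lane(total, vf)
    return min(candidates, key=lambda vf: (candidates[vf], vf))


for vf, total in VECTOR_LOOP_COSTS.items():
    print(f"VF {vf}: total {total}, per lane {float(cost_per_lane(total, vf))}")
print("selected VF:", cheapest_vf(SCALAR_LOOP_COST, VECTOR_LOOP_COSTS))
```

With these inputs the per-lane costs are 13.5 and 6.0, and VF 4 is selected, which agrees with the `LV: Selecting VF: 4.` check. The removed legacy lines reported 13 and 6 for the same loops, which is consistent with the per-lane value being printed as an integer.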

david-arm · 2026-04-02T08:37:07Z

llvm/test/Transforms/LoopVectorize/RISCV/reg-usage.ll

 ; CHECK-SCALAR-NEXT: LV(REG): Found max usage: 1 item
 ; CHECK-SCALAR-NEXT: LV(REG): RegisterClass: RISCV::GPRRC, 3 registers
-; CHECK-LMUL1:       LV(REG): VF = vscale x 2
+; CHECK-LMUL1-LABEL: goo


I don't know how this was correct before!

Yep, I think the previous check patterns were not really checking what they were supposed to, and they would break once the legacy cost model output is removed.

david-arm · 2026-04-02T08:54:01Z

llvm/test/Transforms/LoopVectorize/WebAssembly/memory-interleave.ll

-; CHECK: LV: Selecting VF: 1
+; CHECK: Cost of 6 for VF 2: REPLICATE ir<%10> = load ir<%9>
+; CHECK: Cost of 6 for VF 2: REPLICATE ir<%12> = load ir<%11>
+; CHECK: Cost for VF 2: 61 (Estimated cost per lane: 30.5)


Looks like it's missing the store costs or have you intentionally dropped them?

No, they should be re-added. Thanks.

Should be added, thanks

david-arm · 2026-04-02T09:00:49Z

llvm/test/Transforms/LoopVectorize/WebAssembly/memory-interleave.ll

-; CHECK: LV: Found an estimated cost of 4 for VF 16 For instruction: %13 = mul i8
-; CHECK: LV: Found an estimated cost of 6 for VF 16 For instruction: store i8 %19
-; CHECK: LV: Vector loop of width 16 costs: 1.
+; CHECK: Cost of 6 for VF 2: REPLICATE ir<%10> = load ir<%9>


I guess you dropped the muls here because they aren't necessary for the test?

Yep, the check lines are not really consistent throughout the test file; dropping them seemed a bit more consistent with the other functions.

david-arm · 2026-04-02T12:07:19Z

llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f64-stride-5.ll

-; AVX512:  LV: Found an estimated cost of 40 for VF 32 For instruction: %v4 = load double, ptr %in4, align 8
+; AVX512:  Cost of 9 for VF 2: INTERLEAVE-GROUP with factor 5 at %v0, ir<%in0>
+; AVX512:  Cost of 18 for VF 4: INTERLEAVE-GROUP with factor 5 at %v0, ir<%in0>
+; AVX512:  Cost of 35 for VF 8: INTERLEAVE-GROUP with factor 5 at %v0, ir<%in0>


Do you also want the VFs of 16 and 32?

should be restored, thanks

Should be added back, thanks
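
A quick way to spot dropped VFs like the ones flagged here is to compare which VFs the removed and added check lines cover. The sketch below is only an illustration: the regular expressions are written against the two line formats quoted in this thread, and the helper names (`parse_cost_line`, `vfs_covered`) are hypothetical, not part of any LLVM utility.

```python
import re
from typing import Iterable, NamedTuple, Optional, Set


class CostLine(NamedTuple):
    cost: int
    vf: str    # kept as text so scalable VFs such as "vscale x 2" also fit
    what: str  # instruction (legacy) or recipe (VPlan) the cost belongs to


# Patterns derived from the check lines quoted in this review only.
LEGACY_RE = re.compile(
    r"LV: Found an estimated cost of (\d+) for VF (.+?) For instruction:\s*(.*)")
VPLAN_RE = re.compile(r"Cost of (\d+) for VF ([^:]+): (.*)")


def parse_cost_line(line: str) -> Optional[CostLine]:
    """Parse a legacy or VPlan per-instruction cost line, else return None."""
    for pattern in (LEGACY_RE, VPLAN_RE):
        match = pattern.search(line)
        if match:
            cost, vf, what = match.groups()
            return CostLine(int(cost), vf.strip(), what.strip())
    return None


def vfs_covered(lines: Iterable[str]) -> Set[str]:
    """Collect the VFs that have at least one cost check."""
    parsed = (parse_cost_line(line) for line in lines)
    return {p.vf for p in parsed if p is not None}


# Lines copied from the interleaved-load-f64-stride-5.ll hunk quoted above.
removed = [
    "; AVX512:  LV: Found an estimated cost of 40 for VF 32 For instruction: "
    "%v4 = load double, ptr %in4, align 8",
]
added = [
    "; AVX512:  Cost of 9 for VF 2: INTERLEAVE-GROUP with factor 5 at %v0, ir<%in0>",
    "; AVX512:  Cost of 18 for VF 4: INTERLEAVE-GROUP with factor 5 at %v0, ir<%in0>",
    "; AVX512:  Cost of 35 for VF 8: INTERLEAVE-GROUP with factor 5 at %v0, ir<%in0>",
]
missing = vfs_covered(removed) - vfs_covered(added)
print(sorted(missing))  # ['32']
```

Only the single removed line visible in the quoted hunk is used here, so the result shows VF 32 as uncovered; running the same comparison over the full diff of a test file would be needed to list every dropped VF.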

david-arm · 2026-04-02T12:07:59Z

llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f64-stride-6.ll

+; AVX512:  Cost of 11 for VF 2: INTERLEAVE-GROUP with factor 6 at %v0, ir<%in0>
+; AVX512:  Cost of 21 for VF 4: INTERLEAVE-GROUP with factor 6 at %v0, ir<%in0>
+; AVX512:  Cost of 51 for VF 8: INTERLEAVE-GROUP with factor 6 at %v0, ir<%in0>
 ;


VFs 16 and 32?

should be restored thanks

Should be added back thanks

david-arm · 2026-04-02T12:31:07Z

llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i64-stride-8.ll

-; AVX512:  LV: Found an estimated cost of 20 for VF 16 For instruction: %v6 = load i64, ptr %in6, align 8
-; AVX512:  LV: Found an estimated cost of 20 for VF 16 For instruction: %v7 = load i64, ptr %in7, align 8
+; AVX512:  Cost of 14 for VF 2: INTERLEAVE-GROUP with factor 8 at %v0, ir<%in0>
+; AVX512:  Cost of 40 for VF 4: INTERLEAVE-GROUP with factor 8 at %v0, ir<%in0>


VFs 8 and 16?

should be added back and filter updated, thanks

david-arm · 2026-04-02T13:15:54Z

llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-f64-stride-8.ll

-; AVX512:  LV: Found an estimated cost of 10 for VF 8 For instruction: store double %v6, ptr %out6, align 8
-; AVX512:  LV: Found an estimated cost of 10 for VF 8 For instruction: store double %v7, ptr %out7, align 8
+; AVX512:  Cost of 10 for VF 8: WIDEN store ir<%out0>, ir<%v0>
+; AVX512:  Cost of 10 for VF 8: WIDEN store ir<%out1>, ir<%v1>


Do we need to test every store for each VF? Perhaps one is enough?

Not necessarily, but I think it would be difficult to exclude them via the filter. I left it as-is for now.

david-arm · 2026-04-02T13:22:24Z

llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i64-stride-8.ll

-; AVX512:  LV: Found an estimated cost of 10 for VF 8 For instruction: store i64 %v6, ptr %out6, align 8
-; AVX512:  LV: Found an estimated cost of 10 for VF 8 For instruction: store i64 %v7, ptr %out7, align 8
+; AVX512:  Cost of 10 for VF 8: WIDEN store ir<%out0>, ir<%v>
+; AVX512:  Cost of 10 for VF 8: WIDEN store ir<%out1>, ir<%v1>


Do we need to test all these stores?

Not sure, but I think it would be difficult to update the filter to keep only some of them.

Not necessarily, but I think it would be difficult to exclude them via the filter. I left it as-is for now.

david-arm · 2026-04-02T13:22:52Z

llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i64-stride-8.ll

-; AVX512:  LV: Found an estimated cost of 0 for VF 4 For instruction: store i64 %v3, ptr %out3, align 8
-; AVX512:  LV: Found an estimated cost of 0 for VF 4 For instruction: store i64 %v4, ptr %out4, align 8
-; AVX512:  LV: Found an estimated cost of 0 for VF 4 For instruction: store i64 %v5, ptr %out5, align 8
-; AVX512:  LV: Found an estimated cost of 0 for VF 4 For instruction: store i64 %v6, ptr %out6, align 8


I think the VFs of 4 and 2 have been dropped.

Should be fixed, thanks

should be added back and filter updated, thanks

david-arm · 2026-04-02T13:28:21Z

llvm/test/Transforms/LoopVectorize/X86/CostModel/masked-scatter-i64-with-i8-index.ll

-; SSE42:  LV: Found an estimated cost of 4 for VF 4 For instruction: store i64 %valB, ptr %out, align 8
-; SSE42:  LV: Found an estimated cost of 8 for VF 8 For instruction: store i64 %valB, ptr %out, align 8
-; SSE42:  LV: Found an estimated cost of 16 for VF 16 For instruction: store i64 %valB, ptr %out, align 8
+; SSE42:  Cost of 2 for VF 2: profitable to scalarize store i64 %valB, ptr %out, align 8


This is still the cost from the legacy cost model - it comes from CM.InstsToScalarize in LoopVectorizationCostModel, which is what I'm trying to fix in #187056

Yep, this is just to remove the reliance on the output of legacy computeCost


fhahn

The latest version adjusts the filters in tests matching interleave groups so that they also fully match the members of each interleave group. This is closer to the original check lines, which also matched all loads/stores in a group (with only the first member having the cost assigned and the other members having cost 0).

fhahn · 2026-04-07T10:39:48Z

llvm/test/Transforms/LoopVectorize/RISCV/reg-usage.ll

 ; CHECK-SCALAR-NEXT: LV(REG): Found max usage: 1 item
 ; CHECK-SCALAR-NEXT: LV(REG): RegisterClass: RISCV::GPRRC, 3 registers
-; CHECK-LMUL1:       LV(REG): VF = vscale x 2
+; CHECK-LMUL1-LABEL: goo


Yep, I think the previous check patterns were not really checking what they were supposed to, and they would break once the legacy cost model output is removed.

fhahn · 2026-04-07T10:50:19Z

llvm/test/Transforms/LoopVectorize/WebAssembly/memory-interleave.ll

-; CHECK: LV: Selecting VF: 1
+; CHECK: Cost of 6 for VF 2: REPLICATE ir<%10> = load ir<%9>
+; CHECK: Cost of 6 for VF 2: REPLICATE ir<%12> = load ir<%11>
+; CHECK: Cost for VF 2: 61 (Estimated cost per lane: 30.5)


No, they should be re-added. Thanks.

fhahn · 2026-04-07T10:51:20Z

llvm/test/Transforms/LoopVectorize/WebAssembly/memory-interleave.ll

-; CHECK: LV: Found an estimated cost of 4 for VF 16 For instruction: %13 = mul i8
-; CHECK: LV: Found an estimated cost of 6 for VF 16 For instruction: store i8 %19
-; CHECK: LV: Vector loop of width 16 costs: 1.
+; CHECK: Cost of 6 for VF 2: REPLICATE ir<%10> = load ir<%9>


Yep, the check lines are not really consistent throughout the test file; dropping them seemed a bit more consistent with the other functions.

fhahn · 2026-04-07T13:43:05Z

llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f64-stride-5.ll

-; AVX512:  LV: Found an estimated cost of 40 for VF 32 For instruction: %v4 = load double, ptr %in4, align 8
+; AVX512:  Cost of 9 for VF 2: INTERLEAVE-GROUP with factor 5 at %v0, ir<%in0>
+; AVX512:  Cost of 18 for VF 4: INTERLEAVE-GROUP with factor 5 at %v0, ir<%in0>
+; AVX512:  Cost of 35 for VF 8: INTERLEAVE-GROUP with factor 5 at %v0, ir<%in0>


should be restored, thanks

fhahn · 2026-04-07T13:43:14Z

llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-f64-stride-6.ll

+; AVX512:  Cost of 11 for VF 2: INTERLEAVE-GROUP with factor 6 at %v0, ir<%in0>
+; AVX512:  Cost of 21 for VF 4: INTERLEAVE-GROUP with factor 6 at %v0, ir<%in0>
+; AVX512:  Cost of 51 for VF 8: INTERLEAVE-GROUP with factor 6 at %v0, ir<%in0>
 ;


should be restored thanks

fhahn · 2026-04-07T14:54:54Z

llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i64-stride-8.ll

-; AVX512:  LV: Found an estimated cost of 20 for VF 16 For instruction: %v6 = load i64, ptr %in6, align 8
-; AVX512:  LV: Found an estimated cost of 20 for VF 16 For instruction: %v7 = load i64, ptr %in7, align 8
+; AVX512:  Cost of 14 for VF 2: INTERLEAVE-GROUP with factor 8 at %v0, ir<%in0>
+; AVX512:  Cost of 40 for VF 4: INTERLEAVE-GROUP with factor 8 at %v0, ir<%in0>


should be added back and filter updated, thanks

fhahn · 2026-04-07T14:55:30Z

llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-f64-stride-8.ll

-; AVX512:  LV: Found an estimated cost of 10 for VF 8 For instruction: store double %v6, ptr %out6, align 8
-; AVX512:  LV: Found an estimated cost of 10 for VF 8 For instruction: store double %v7, ptr %out7, align 8
+; AVX512:  Cost of 10 for VF 8: WIDEN store ir<%out0>, ir<%v0>
+; AVX512:  Cost of 10 for VF 8: WIDEN store ir<%out1>, ir<%v1>


Not necessarily, but I think it would be difficult to exclude them via the filter. I left it as-is for now.

fhahn · 2026-04-07T14:55:33Z

llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i64-stride-8.ll

-; AVX512:  LV: Found an estimated cost of 10 for VF 8 For instruction: store i64 %v6, ptr %out6, align 8
-; AVX512:  LV: Found an estimated cost of 10 for VF 8 For instruction: store i64 %v7, ptr %out7, align 8
+; AVX512:  Cost of 10 for VF 8: WIDEN store ir<%out0>, ir<%v>
+; AVX512:  Cost of 10 for VF 8: WIDEN store ir<%out1>, ir<%v1>


Not necessarily, but I think it would be difficult to exclude them via the filter. I left it as-is for now.

fhahn · 2026-04-07T14:55:57Z

llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i64-stride-7.ll

-; AVX512:  LV: Found an estimated cost of 20 for VF 16 For instruction: %v6 = load i64, ptr %in6, align 8
+; AVX512:  Cost of 12 for VF 2: INTERLEAVE-GROUP with factor 7 at %v0, ir<%in0>
+; AVX512:  Cost of 35 for VF 4: INTERLEAVE-GROUP with factor 7 at %v0, ir<%in0>
+; AVX512:  Cost of 70 for VF 8: INTERLEAVE-GROUP with factor 7 at %v0, ir<%in0>


should be added back for all VFs and filter updated, thanks

fhahn · 2026-04-07T14:56:26Z

llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-store-i64-stride-8.ll

-; AVX512:  LV: Found an estimated cost of 0 for VF 4 For instruction: store i64 %v3, ptr %out3, align 8
-; AVX512:  LV: Found an estimated cost of 0 for VF 4 For instruction: store i64 %v4, ptr %out4, align 8
-; AVX512:  LV: Found an estimated cost of 0 for VF 4 For instruction: store i64 %v5, ptr %out5, align 8
-; AVX512:  LV: Found an estimated cost of 0 for VF 4 For instruction: store i64 %v6, ptr %out6, align 8


should be added back and filter updated, thanks

david-arm

LGTM!


fhahn requested review from ElvisWang123, RKSimon, aniragil, ayalz and david-arm April 1, 2026 19:59

llvmbot added backend:RISC-V llvm:transforms labels Apr 1, 2026

david-arm reviewed Apr 2, 2026


fhahn added 2 commits April 7, 2026 11:10

Merge remote-tracking branch 'origin/main' into vplan-replace-legacy-…

551882d

…cost-checks

!fixup update tests.

3842de7

fhahn commented Apr 7, 2026


david-arm approved these changes Apr 7, 2026


fhahn merged commit e382a95 into llvm:main Apr 7, 2026
10 checks passed

fhahn deleted the vplan-replace-legacy-cost-checks branch April 7, 2026 19:24
