[LV] Update remaining tests to use VPlan cost output (NFC).#190038
[LV] Update remaining tests to use VPlan cost output (NFC).#190038
Conversation
Move remaining tests checking legacy cost output to check the VPlan's cost model output. In some cases, checks become much more compact (checking a single interleave group cost vs checking the individual members which all have the group's cost). In some cases, auto-generation consistently checks all relevant VFs.
|
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-risc-v Author: Florian Hahn (fhahn) ChangesMove remaining tests checking legacy cost output to check the VPlan's cost model output. In some cases, checks become much more compact (checking a single interleave group cost vs checking the individual members which all have the group's cost). In some cases, auto-generation consistently checks all relevant VFs. Patch is 1.10 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/190038.diff 115 Files Affected:
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/binop-costs.ll b/llvm/test/Transforms/LoopVectorize/AArch64/binop-costs.ll
index ff1dee41e62bf..8e4c6d470c9be 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/binop-costs.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/binop-costs.ll
@@ -17,12 +17,6 @@ define void @udiv_rhs_opt_cost(ptr %dst) #0 {
; CHECK: Cost of 0 for VF vscale x 2: IR %div = udiv i8 %iv.trunc, 3
; CHECK: Cost of 5 for VF vscale x 4: CLONE ir<%div> = udiv vp<[[VP7]]>, ir<3>
; CHECK: Cost of 0 for VF vscale x 4: IR %div = udiv i8 %iv.trunc, 3
-; CHECK: LV: Found an estimated cost of 5 for VF 1 For instruction: %div = udiv i8 %iv.trunc, 3
-; CHECK: LV: Found an estimated cost of 5 for VF 2 For instruction: %div = udiv i8 %iv.trunc, 3
-; CHECK: LV: Found an estimated cost of 5 for VF 4 For instruction: %div = udiv i8 %iv.trunc, 3
-; CHECK: LV: Found an estimated cost of 5 for VF vscale x 1 For instruction: %div = udiv i8 %iv.trunc, 3
-; CHECK: LV: Found an estimated cost of 5 for VF vscale x 2 For instruction: %div = udiv i8 %iv.trunc, 3
-; CHECK: LV: Found an estimated cost of 5 for VF vscale x 4 For instruction: %div = udiv i8 %iv.trunc, 3
;
entry:
br label %loop
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/type-shrinkage-zext-costs.ll b/llvm/test/Transforms/LoopVectorize/AArch64/type-shrinkage-zext-costs.ll
index a44a16455445c..bcd1c28318450 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/type-shrinkage-zext-costs.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/type-shrinkage-zext-costs.ll
@@ -8,6 +8,7 @@ target triple = "aarch64-unknown-linux-gnu"
define void @zext_i8_i16(ptr noalias nocapture readonly %p, ptr noalias nocapture %q, i32 %len) #0 {
; CHECK-COST-LABEL: LV: Checking a loop in 'zext_i8_i16'
+; CHECK-COST: LV: Found an estimated cost of 0 for VF 1 For instruction: %conv = zext i8 %0 to i32
; CHECK-COST: Cost of 1 for VF 2: WIDEN-CAST ir<%conv> = zext ir<%0> to i16
; CHECK-COST: Cost of 1 for VF 4: WIDEN-CAST ir<%conv> = zext ir<%0> to i16
; CHECK-COST: Cost of 1 for VF 8: WIDEN-CAST ir<%conv> = zext ir<%0> to i16
@@ -16,7 +17,6 @@ define void @zext_i8_i16(ptr noalias nocapture readonly %p, ptr noalias nocaptur
; CHECK-COST: Cost of 1 for VF vscale x 2: WIDEN-CAST ir<%conv> = zext ir<%0> to i16
; CHECK-COST: Cost of 1 for VF vscale x 4: WIDEN-CAST ir<%conv> = zext ir<%0> to i16
; CHECK-COST: Cost of 0 for VF vscale x 8: WIDEN-CAST ir<%conv> = zext ir<%0> to i16
-; CHECK-COST: LV: Found an estimated cost of 0 for VF 1 For instruction: %conv = zext i8 %0 to i32
; CHECK-LABEL: define void @zext_i8_i16
; CHECK-SAME: (ptr noalias readonly captures(none) [[P:%.*]], ptr noalias captures(none) [[Q:%.*]], i32 [[LEN:%.*]]) #[[ATTR0:[0-9]+]] {
; CHECK-NEXT: entry:
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage.ll b/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage.ll
index 99139da67bb78..b0738cad80064 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/reg-usage.ll
@@ -84,22 +84,26 @@ define void @goo(ptr nocapture noundef %a, i32 noundef signext %n) {
; CHECK-SCALAR: LV(REG): VF = 1
; CHECK-SCALAR-NEXT: LV(REG): Found max usage: 1 item
; CHECK-SCALAR-NEXT: LV(REG): RegisterClass: RISCV::GPRRC, 3 registers
-; CHECK-LMUL1: LV(REG): VF = vscale x 2
+; CHECK-LMUL1-LABEL: goo
+; CHECK-LMUL1: LV(REG): VF = vscale x 1
; CHECK-LMUL1-NEXT: LV(REG): Found max usage: 2 item
-; CHECK-LMUL1-NEXT: LV(REG): RegisterClass: RISCV::GPRRC, 6 registers
-; CHECK-LMUL1-NEXT: LV(REG): RegisterClass: RISCV::VRRC, 2 registers
-; CHECK-LMUL2: LV(REG): VF = vscale x 4
+; CHECK-LMUL1-NEXT: LV(REG): RegisterClass: RISCV::GPRRC, 5 registers
+; CHECK-LMUL1-NEXT: LV(REG): RegisterClass: RISCV::VRRC, 1 registers
+; CHECK-LMUL2-LABEL: goo
+; CHECK-LMUL2: LV(REG): VF = vscale x 2
; CHECK-LMUL2-NEXT: LV(REG): Found max usage: 2 item
-; CHECK-LMUL2-NEXT: LV(REG): RegisterClass: RISCV::GPRRC, 6 registers
-; CHECK-LMUL2-NEXT: LV(REG): RegisterClass: RISCV::VRRC, 4 registers
-; CHECK-LMUL4: LV(REG): VF = vscale x 8
+; CHECK-LMUL2-NEXT: LV(REG): RegisterClass: RISCV::GPRRC, 5 registers
+; CHECK-LMUL2-NEXT: LV(REG): RegisterClass: RISCV::VRRC, 2 registers
+; CHECK-LMUL4-LABEL: goo
+; CHECK-LMUL4: LV(REG): VF = vscale x 4
; CHECK-LMUL4-NEXT: LV(REG): Found max usage: 2 item
-; CHECK-LMUL4-NEXT: LV(REG): RegisterClass: RISCV::GPRRC, 6 registers
-; CHECK-LMUL4-NEXT: LV(REG): RegisterClass: RISCV::VRRC, 8 registers
-; CHECK-LMUL8: LV(REG): VF = vscale x 16
+; CHECK-LMUL4-NEXT: LV(REG): RegisterClass: RISCV::GPRRC, 5 registers
+; CHECK-LMUL4-NEXT: LV(REG): RegisterClass: RISCV::VRRC, 4 registers
+; CHECK-LMUL8-LABEL: goo
+; CHECK-LMUL8: LV(REG): VF = vscale x 8
; CHECK-LMUL8-NEXT: LV(REG): Found max usage: 2 item
-; CHECK-LMUL8-NEXT: LV(REG): RegisterClass: RISCV::GPRRC, 6 registers
-; CHECK-LMUL8-NEXT: LV(REG): RegisterClass: RISCV::VRRC, 16 registers
+; CHECK-LMUL8-NEXT: LV(REG): RegisterClass: RISCV::GPRRC, 5 registers
+; CHECK-LMUL8-NEXT: LV(REG): RegisterClass: RISCV::VRRC, 8 registers
entry:
%cmp3 = icmp sgt i32 %n, 0
br i1 %cmp3, label %for.body.preheader, label %for.cond.cleanup
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-reduction-cost.ll b/llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-reduction-cost.ll
index 10d83f4ad125e..fe39700d1787c 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-reduction-cost.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-reduction-cost.ll
@@ -3,8 +3,8 @@
; RUN: -prefer-predicate-over-epilogue=predicate-else-scalar-epilogue \
; RUN: -mtriple=riscv64 -mattr=+v -S < %s 2>&1 | FileCheck %s
+; CHECK: Cost of 0 for VF vscale x 4: WIDEN-REDUCTION-PHI ir<%rdx> = phi
; CHECK: Cost of 2 for VF vscale x 4: WIDEN-INTRINSIC vp<%{{.+}}> = call llvm.vp.merge(ir<true>, ir<%add>, ir<%rdx>, vp<%{{.+}}>)
-; CHECK: LV: Found an estimated cost of 2 for VF vscale x 4 For instruction: %rdx = phi i32 [ %start, %entry ], [ %add, %loop ]
define i32 @add(ptr %a, i64 %n, i32 %start) {
entry:
diff --git a/llvm/test/Transforms/LoopVectorize/WebAssembly/int-mac-reduction-costs.ll b/llvm/test/Transforms/LoopVectorize/WebAssembly/int-mac-reduction-costs.ll
index d23c2272d9c0d..9f824d1a963eb 100644
--- a/llvm/test/Transforms/LoopVectorize/WebAssembly/int-mac-reduction-costs.ll
+++ b/llvm/test/Transforms/LoopVectorize/WebAssembly/int-mac-reduction-costs.ll
@@ -11,17 +11,17 @@ define hidden i32 @i32_mac_s8(ptr nocapture noundef readonly %a, ptr nocapture n
; CHECK: LV: Found an estimated cost of 0 for VF 1 For instruction: %conv2 = sext i8 %1 to i32
; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction: %mul = mul nsw i32 %conv2, %conv
-; CHECK: LV: Found an estimated cost of 3 for VF 2 For instruction: %0 = load i8, ptr %arrayidx, align 1
-; CHECK: LV: Found an estimated cost of 0 for VF 2 For instruction: %conv = sext i8 %0 to i32
-; CHECK: LV: Found an estimated cost of 3 for VF 2 For instruction: %1 = load i8, ptr %arrayidx1, align 1
-; CHECK: LV: Found an estimated cost of 0 for VF 2 For instruction: %conv2 = sext i8 %1 to i32
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction: %mul = mul nsw i32 %conv2, %conv
-
-; CHECK: LV: Found an estimated cost of 2 for VF 4 For instruction: %0 = load i8, ptr %arrayidx, align 1
-; CHECK: LV: Found an estimated cost of 1 for VF 4 For instruction: %conv = sext i8 %0 to i32
-; CHECK: LV: Found an estimated cost of 2 for VF 4 For instruction: %1 = load i8, ptr %arrayidx1, align 1
-; CHECK: LV: Found an estimated cost of 1 for VF 4 For instruction: %conv2 = sext i8 %1 to i32
-; CHECK: LV: Found an estimated cost of 1 for VF 4 For instruction: %mul = mul nsw i32 %conv2, %conv
+; CHECK: Cost of 3 for VF 2: WIDEN ir<%0> = load
+; CHECK: Cost of 0 for VF 2: WIDEN-CAST ir<%conv> = sext ir<%0> to i32
+; CHECK: Cost of 3 for VF 2: WIDEN ir<%1> = load
+; CHECK: Cost of 0 for VF 2: WIDEN-CAST ir<%conv2> = sext ir<%1> to i32
+; CHECK: Cost of 1 for VF 2: WIDEN ir<%mul> = mul nsw ir<%conv2>, ir<%conv>
+
+; CHECK: Cost of 2 for VF 4: WIDEN ir<%0> = load
+; CHECK: Cost of 1 for VF 4: WIDEN-CAST ir<%conv> = sext ir<%0> to i32
+; CHECK: Cost of 2 for VF 4: WIDEN ir<%1> = load
+; CHECK: Cost of 1 for VF 4: WIDEN-CAST ir<%conv2> = sext ir<%1> to i32
+; CHECK: Cost of 1 for VF 4: WIDEN ir<%mul> = mul nsw ir<%conv2>, ir<%conv>
; CHECK: LV: Selecting VF: 4.
entry:
%cmp7.not = icmp eq i32 %N, 0
@@ -55,17 +55,17 @@ define hidden i32 @i32_mac_s16(ptr nocapture noundef readonly %a, ptr nocapture
; CHECK: LV: Found an estimated cost of 0 for VF 1 For instruction: %conv2 = sext i16 %1 to i32
; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction: %mul = mul nsw i32 %conv2, %conv
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction: %0 = load i16, ptr %arrayidx, align 2
-; CHECK: LV: Found an estimated cost of 0 for VF 2 For instruction: %conv = sext i16 %0 to i32
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction: %1 = load i16, ptr %arrayidx1, align 2
-; CHECK: LV: Found an estimated cost of 0 for VF 2 For instruction: %conv2 = sext i16 %1 to i32
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction: %mul = mul nsw i32 %conv2, %conv
-
-; CHECK: LV: Found an estimated cost of 2 for VF 4 For instruction: %0 = load i16, ptr %arrayidx, align 2
-; CHECK: LV: Found an estimated cost of 0 for VF 4 For instruction: %conv = sext i16 %0 to i32
-; CHECK: LV: Found an estimated cost of 2 for VF 4 For instruction: %1 = load i16, ptr %arrayidx1, align 2
-; CHECK: LV: Found an estimated cost of 0 for VF 4 For instruction: %conv2 = sext i16 %1 to i32
-; CHECK: LV: Found an estimated cost of 1 for VF 4 For instruction: %mul = mul nsw i32 %conv2, %conv
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%0> = load
+; CHECK: Cost of 0 for VF 2: WIDEN-CAST ir<%conv> = sext ir<%0> to i32
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%1> = load
+; CHECK: Cost of 0 for VF 2: WIDEN-CAST ir<%conv2> = sext ir<%1> to i32
+; CHECK: Cost of 1 for VF 2: WIDEN ir<%mul> = mul nsw ir<%conv2>, ir<%conv>
+
+; CHECK: Cost of 2 for VF 4: WIDEN ir<%0> = load
+; CHECK: Cost of 0 for VF 4: WIDEN-CAST ir<%conv> = sext ir<%0> to i32
+; CHECK: Cost of 2 for VF 4: WIDEN ir<%1> = load
+; CHECK: Cost of 0 for VF 4: WIDEN-CAST ir<%conv2> = sext ir<%1> to i32
+; CHECK: Cost of 1 for VF 4: WIDEN ir<%mul> = mul nsw ir<%conv2>, ir<%conv>
; CHECK: LV: Selecting VF: 4.
entry:
%cmp7.not = icmp eq i32 %N, 0
@@ -99,11 +99,11 @@ define hidden i64 @i64_mac_s16(ptr nocapture noundef readonly %a, ptr nocapture
; CHECK: LV: Found an estimated cost of 0 for VF 1 For instruction: %conv2 = sext i16 %1 to i64
; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction: %mul = mul nsw i64 %conv2, %conv
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction: %0 = load i16, ptr %arrayidx, align 2
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction: %conv = sext i16 %0 to i64
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction: %1 = load i16, ptr %arrayidx1, align 2
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction: %conv2 = sext i16 %1 to i64
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction: %mul = mul nsw i64 %conv2, %conv
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%0> = load
+; CHECK: Cost of 1 for VF 2: WIDEN-CAST ir<%conv> = sext ir<%0> to i64
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%1> = load
+; CHECK: Cost of 1 for VF 2: WIDEN-CAST ir<%conv2> = sext ir<%1> to i64
+; CHECK: Cost of 1 for VF 2: WIDEN ir<%mul> = mul nsw ir<%conv2>, ir<%conv>
; CHECK: LV: Selecting VF: 2.
entry:
%cmp7.not = icmp eq i32 %N, 0
@@ -136,10 +136,10 @@ define hidden i64 @i64_mac_s32(ptr nocapture noundef readonly %a, ptr nocapture
; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction: %mul = mul i32 %1, %0
; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction: %conv = sext i32 %mul to i64
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction: %0 = load i32, ptr %arrayidx, align 4
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction: %1 = load i32, ptr %arrayidx1, align 4
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction: %mul = mul i32 %1, %0
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction: %conv = sext i32 %mul to i64
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%0> = load
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%1> = load
+; CHECK: Cost of 1 for VF 2: WIDEN ir<%mul> = mul ir<%1>, ir<%0>
+; CHECK: Cost of 1 for VF 2: WIDEN-CAST ir<%conv> = sext ir<%mul> to i64
; CHECK: LV: Selecting VF: 2.
entry:
%cmp6.not = icmp eq i32 %N, 0
@@ -172,17 +172,17 @@ define hidden i32 @i32_mac_u8(ptr nocapture noundef readonly %a, ptr nocapture n
; CHECK: LV: Found an estimated cost of 0 for VF 1 For instruction: %conv2 = zext i8 %1 to i32
; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction: %mul = mul nuw nsw i32 %conv2, %conv
-; CHECK: LV: Found an estimated cost of 3 for VF 2 For instruction: %0 = load i8, ptr %arrayidx, align 1
-; CHECK: LV: Found an estimated cost of 0 for VF 2 For instruction: %conv = zext i8 %0 to i32
-; CHECK: LV: Found an estimated cost of 3 for VF 2 For instruction: %1 = load i8, ptr %arrayidx1, align 1
-; CHECK: LV: Found an estimated cost of 0 for VF 2 For instruction: %conv2 = zext i8 %1 to i32
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction: %mul = mul nuw nsw i32 %conv2, %conv
-
-; CHECK: LV: Found an estimated cost of 2 for VF 4 For instruction: %0 = load i8, ptr %arrayidx, align 1
-; CHECK: LV: Found an estimated cost of 1 for VF 4 For instruction: %conv = zext i8 %0 to i32
-; CHECK: LV: Found an estimated cost of 2 for VF 4 For instruction: %1 = load i8, ptr %arrayidx1, align 1
-; CHECK: LV: Found an estimated cost of 1 for VF 4 For instruction: %conv2 = zext i8 %1 to i32
-; CHECK: LV: Found an estimated cost of 1 for VF 4 For instruction: %mul = mul nuw nsw i32 %conv2, %conv
+; CHECK: Cost of 3 for VF 2: WIDEN ir<%0> = load
+; CHECK: Cost of 0 for VF 2: WIDEN-CAST ir<%conv> = zext ir<%0> to i32
+; CHECK: Cost of 3 for VF 2: WIDEN ir<%1> = load
+; CHECK: Cost of 0 for VF 2: WIDEN-CAST ir<%conv2> = zext ir<%1> to i32
+; CHECK: Cost of 1 for VF 2: WIDEN ir<%mul> = mul nuw nsw ir<%conv2>, ir<%conv>
+
+; CHECK: Cost of 2 for VF 4: WIDEN ir<%0> = load
+; CHECK: Cost of 1 for VF 4: WIDEN-CAST ir<%conv> = zext ir<%0> to i32
+; CHECK: Cost of 2 for VF 4: WIDEN ir<%1> = load
+; CHECK: Cost of 1 for VF 4: WIDEN-CAST ir<%conv2> = zext ir<%1> to i32
+; CHECK: Cost of 1 for VF 4: WIDEN ir<%mul> = mul nuw nsw ir<%conv2>, ir<%conv>
; CHECK: LV: Selecting VF: 4.
entry:
%cmp7.not = icmp eq i32 %N, 0
@@ -216,17 +216,17 @@ define hidden i32 @i32_mac_u16(ptr nocapture noundef readonly %a, ptr nocapture
; CHECK: LV: Found an estimated cost of 0 for VF 1 For instruction: %conv2 = zext i16 %1 to i32
; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction: %mul = mul nuw nsw i32 %conv2, %conv
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction: %0 = load i16, ptr %arrayidx, align 2
-; CHECK: LV: Found an estimated cost of 0 for VF 2 For instruction: %conv = zext i16 %0 to i32
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction: %1 = load i16, ptr %arrayidx1, align 2
-; CHECK: LV: Found an estimated cost of 0 for VF 2 For instruction: %conv2 = zext i16 %1 to i32
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction: %mul = mul nuw nsw i32 %conv2, %conv
-
-; CHECK: LV: Found an estimated cost of 2 for VF 4 For instruction: %0 = load i16, ptr %arrayidx, align 2
-; CHECK: LV: Found an estimated cost of 0 for VF 4 For instruction: %conv = zext i16 %0 to i32
-; CHECK: LV: Found an estimated cost of 2 for VF 4 For instruction: %1 = load i16, ptr %arrayidx1, align 2
-; CHECK: LV: Found an estimated cost of 0 for VF 4 For instruction: %conv2 = zext i16 %1 to i32
-; CHECK: LV: Found an estimated cost of 1 for VF 4 For instruction: %mul = mul nuw nsw i32 %conv2, %conv
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%0> = load
+; CHECK: Cost of 0 for VF 2: WIDEN-CAST ir<%conv> = zext ir<%0> to i32
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%1> = load
+; CHECK: Cost of 0 for VF 2: WIDEN-CAST ir<%conv2> = zext ir<%1> to i32
+; CHECK: Cost of 1 for VF 2: WIDEN ir<%mul> = mul nuw nsw ir<%conv2>, ir<%conv>
+
+; CHECK: Cost of 2 for VF 4: WIDEN ir<%0> = load
+; CHECK: Cost of 0 for VF 4: WIDEN-CAST ir<%conv> = zext ir<%0> to i32
+; CHECK: Cost of 2 for VF 4: WIDEN ir<%1> = load
+; CHECK: Cost of 0 for VF 4: WIDEN-CAST ir<%conv2> = zext ir<%1> to i32
+; CHECK: Cost of 1 for VF 4: WIDEN ir<%mul> = mul nuw nsw ir<%conv2>, ir<%conv>
; CHECK: LV: Selecting VF: 4.
entry:
%cmp7.not = icmp eq i32 %N, 0
@@ -260,11 +260,11 @@ define hidden i64 @i64_mac_u16(ptr nocapture noundef readonly %a, ptr nocapture
; CHECK: LV: Found an estimated cost of 0 for VF 1 For instruction: %conv2 = zext i16 %1 to i64
; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction: %mul = mul nuw nsw i64 %conv2, %conv
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction: %0 = load i16, ptr %arrayidx, align 2
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction: %conv = zext i16 %0 to i64
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction: %1 = load i16, ptr %arrayidx1, align 2
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction: %conv2 = zext i16 %1 to i64
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction: %mul = mul nuw nsw i64 %conv2, %conv
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%0> = load
+; CHECK: Cost of 1 for VF 2: WIDEN-CAST ir<%conv> = zext ir<%0> to i64
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%1> = load
+; CHECK: Cost of 1 for VF 2: WIDEN-CAST ir<%conv2> = zext ir<%1> to i64
+; CHECK: Cost of 1 for VF 2: WIDEN ir<%mul> = mul nuw nsw ir<%conv2>, ir<%conv>
; CHECK: LV: Selecting VF: 2.
entry:
%cmp8.not = icmp eq i32 %N, 0
@@ -297,10 +297,10 @@ define hidden i64 @i64_mac_u32(ptr nocapture noundef readonly %a, ptr nocapture
; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction: %mul = mul i32 %1, %0
; CHECK: LV: Found an estimated cost of 1 for VF 1 For instruction: %conv = zext i32 %mul to i64
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction: %0 = load i32, ptr %arrayidx, align 4
-; CHECK: LV: Found an estimated cost of 2 for VF 2 For instruction: %1 = load i32, ptr %arrayidx1, align 4
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction: %mul = mul i32 %1, %0
-; CHECK: LV: Found an estimated cost of 1 for VF 2 For instruction: %conv = zext i32 %mul to i64
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%0> = load
+; CHECK: Cost of 2 for VF 2: WIDEN ir<%1> = load
+; CHECK: Cost of 1 for VF 2: WIDEN ir<%mul> = mul ir<%1>, ir<%0>
+; CHECK: Cost of 1 for VF 2: WIDEN-CAST ir<%conv> = zext ir<%mul> to i64
; CHECK: LV: Selecting VF: 2.
entry:
%cmp6.not = icmp eq i32 %N, 0
diff --git a/llvm/test/Transforms/LoopVectorize/WebAssembly/memory-interleave.ll b/llvm/test/Transforms/LoopVectorize/WebAssembly/memory-interleave.ll
index 54cbab78b1e29..573863121966c 100644
--- a/llvm/test/Transforms/LoopVectorize/WebAssembly/memory-interleave.ll
+++ b/llvm/test/Transforms/LoopVectorize/WebAssembly/memory-interleave.ll
@@ -19,12 +19,12 @@ target triple = "wasm32-unknown-wasi"
%struct.FourFloats = type { float, float, float, float }
; CHECK-LABEL: two_ints_same_op
+; CHECK: LV: Scalar loop costs: 12.
; CHECK: Cost of 7 for VF 2: INTERLEAVE-GROUP with factor 2 at %10
+; CHECK: Cost for VF 2: 27 (Estimated cost per lane: 13.5)
; CHECK: Cost of 6 for VF 4: INTERLEAVE-GROUP with factor 2 at %10
-; CHECK: LV: Scalar loop costs: 12.
-; CHECK: LV: Vector loop of width 2 costs: 13.
-; CHECK: LV: Vector loop of width 4 costs: 6.
-; CHECK: LV: Selecting VF: 4
+; CHECK: Cost for VF 4: 24 (Estimated cost per lane: 6.0)
+; CHECK: LV: Selecting VF: 4.
define hidden void @two_ints_same_op(ptr noalias nocapture noundef writeonly %0, ptr nocapture noundef readonly %1, ptr nocapture noundef readonly %2, i32 ...
[truncated]
|
| ; CHECK-SCALAR-NEXT: LV(REG): Found max usage: 1 item | ||
| ; CHECK-SCALAR-NEXT: LV(REG): RegisterClass: RISCV::GPRRC, 3 registers | ||
| ; CHECK-LMUL1: LV(REG): VF = vscale x 2 | ||
| ; CHECK-LMUL1-LABEL: goo |
There was a problem hiding this comment.
I don't know how this was correct before!
There was a problem hiding this comment.
Yep, I think the check patterns before were not really checking what they were supposed to and would break when the legacy cost model output would be removed
| ; CHECK: LV: Selecting VF: 1 | ||
| ; CHECK: Cost of 6 for VF 2: REPLICATE ir<%10> = load ir<%9> | ||
| ; CHECK: Cost of 6 for VF 2: REPLICATE ir<%12> = load ir<%11> | ||
| ; CHECK: Cost for VF 2: 61 (Estimated cost per lane: 30.5) |
There was a problem hiding this comment.
Looks like it's missing the store costs or have you intentionally dropped them?
There was a problem hiding this comment.
No, should be re-added. thanks
There was a problem hiding this comment.
Should be added, thanks
| ; CHECK: LV: Found an estimated cost of 4 for VF 16 For instruction: %13 = mul i8 | ||
| ; CHECK: LV: Found an estimated cost of 6 for VF 16 For instruction: store i8 %19 | ||
| ; CHECK: LV: Vector loop of width 16 costs: 1. | ||
| ; CHECK: Cost of 6 for VF 2: REPLICATE ir<%10> = load ir<%9> |
There was a problem hiding this comment.
I guess you dropped the muls here because they aren't necessary for the test?
There was a problem hiding this comment.
yep, the check lines. in the test are not really consistent throughout the test file, dropping them seemed a bit more consistent with other functions
| ; AVX512: LV: Found an estimated cost of 40 for VF 32 For instruction: %v4 = load double, ptr %in4, align 8 | ||
| ; AVX512: Cost of 9 for VF 2: INTERLEAVE-GROUP with factor 5 at %v0, ir<%in0> | ||
| ; AVX512: Cost of 18 for VF 4: INTERLEAVE-GROUP with factor 5 at %v0, ir<%in0> | ||
| ; AVX512: Cost of 35 for VF 8: INTERLEAVE-GROUP with factor 5 at %v0, ir<%in0> |
There was a problem hiding this comment.
Do you also want the VFs of 16 and 32?
There was a problem hiding this comment.
should be restored, thanks
There was a problem hiding this comment.
Should be added back, thanks
| ; AVX512: Cost of 11 for VF 2: INTERLEAVE-GROUP with factor 6 at %v0, ir<%in0> | ||
| ; AVX512: Cost of 21 for VF 4: INTERLEAVE-GROUP with factor 6 at %v0, ir<%in0> | ||
| ; AVX512: Cost of 51 for VF 8: INTERLEAVE-GROUP with factor 6 at %v0, ir<%in0> | ||
| ; |
There was a problem hiding this comment.
should be restored thanks
There was a problem hiding this comment.
Should be added back thanks
| ; AVX512: LV: Found an estimated cost of 20 for VF 16 For instruction: %v6 = load i64, ptr %in6, align 8 | ||
| ; AVX512: LV: Found an estimated cost of 20 for VF 16 For instruction: %v7 = load i64, ptr %in7, align 8 | ||
| ; AVX512: Cost of 14 for VF 2: INTERLEAVE-GROUP with factor 8 at %v0, ir<%in0> | ||
| ; AVX512: Cost of 40 for VF 4: INTERLEAVE-GROUP with factor 8 at %v0, ir<%in0> |
There was a problem hiding this comment.
should be added back and filter updated, thanks
| ; AVX512: LV: Found an estimated cost of 10 for VF 8 For instruction: store double %v6, ptr %out6, align 8 | ||
| ; AVX512: LV: Found an estimated cost of 10 for VF 8 For instruction: store double %v7, ptr %out7, align 8 | ||
| ; AVX512: Cost of 10 for VF 8: WIDEN store ir<%out0>, ir<%v0> | ||
| ; AVX512: Cost of 10 for VF 8: WIDEN store ir<%out1>, ir<%v1> |
There was a problem hiding this comment.
Do we need to test every store for each VF? Perhaps one is enough?
There was a problem hiding this comment.
Not necessarily, but it would be difficult I think to include them from the filter. I left if as-is for now
| ; AVX512: LV: Found an estimated cost of 10 for VF 8 For instruction: store i64 %v6, ptr %out6, align 8 | ||
| ; AVX512: LV: Found an estimated cost of 10 for VF 8 For instruction: store i64 %v7, ptr %out7, align 8 | ||
| ; AVX512: Cost of 10 for VF 8: WIDEN store ir<%out0>, ir<%v> | ||
| ; AVX512: Cost of 10 for VF 8: WIDEN store ir<%out1>, ir<%v1> |
There was a problem hiding this comment.
Do we need to test all these stores?
There was a problem hiding this comment.
Not sure, but I think it would be diffuclt to update the filter to keep only some of them.
There was a problem hiding this comment.
Not necessarily, but it would be difficult I think to include them from the filter. I left if as-is for now
| ; AVX512: LV: Found an estimated cost of 0 for VF 4 For instruction: store i64 %v3, ptr %out3, align 8 | ||
| ; AVX512: LV: Found an estimated cost of 0 for VF 4 For instruction: store i64 %v4, ptr %out4, align 8 | ||
| ; AVX512: LV: Found an estimated cost of 0 for VF 4 For instruction: store i64 %v5, ptr %out5, align 8 | ||
| ; AVX512: LV: Found an estimated cost of 0 for VF 4 For instruction: store i64 %v6, ptr %out6, align 8 |
There was a problem hiding this comment.
I think the VFs of 4 and 2 have been dropped.
There was a problem hiding this comment.
Should be fixed, thanks
There was a problem hiding this comment.
should be added back and filter updated, thanks
| ; SSE42: LV: Found an estimated cost of 4 for VF 4 For instruction: store i64 %valB, ptr %out, align 8 | ||
| ; SSE42: LV: Found an estimated cost of 8 for VF 8 For instruction: store i64 %valB, ptr %out, align 8 | ||
| ; SSE42: LV: Found an estimated cost of 16 for VF 16 For instruction: store i64 %valB, ptr %out, align 8 | ||
| ; SSE42: Cost of 2 for VF 2: profitable to scalarize store i64 %valB, ptr %out, align 8 |
There was a problem hiding this comment.
This is still the cost from the legacy cost model - it comes from CM.InstsToScalarize in LoopVectorizationCostModel, which is what I'm trying to fix in #187056
There was a problem hiding this comment.
Yep, this is just to remove the reliance on the output of legacy computeCost
fhahn
left a comment
There was a problem hiding this comment.
The latest version adjust filters in tests matching interleave groups to fully match the members of the interleave groups as well, which is closer to the original check lines which also matched all load/stores in a group (just with the first member having the cost assigned, the other members having cost 0)
| ; CHECK-SCALAR-NEXT: LV(REG): Found max usage: 1 item | ||
| ; CHECK-SCALAR-NEXT: LV(REG): RegisterClass: RISCV::GPRRC, 3 registers | ||
| ; CHECK-LMUL1: LV(REG): VF = vscale x 2 | ||
| ; CHECK-LMUL1-LABEL: goo |
There was a problem hiding this comment.
Yep, I think the check patterns before were not really checking what they were supposed to and would break when the legacy cost model output would be removed
| ; CHECK: LV: Selecting VF: 1 | ||
| ; CHECK: Cost of 6 for VF 2: REPLICATE ir<%10> = load ir<%9> | ||
| ; CHECK: Cost of 6 for VF 2: REPLICATE ir<%12> = load ir<%11> | ||
| ; CHECK: Cost for VF 2: 61 (Estimated cost per lane: 30.5) |
There was a problem hiding this comment.
No, should be re-added. thanks
| ; CHECK: LV: Found an estimated cost of 4 for VF 16 For instruction: %13 = mul i8 | ||
| ; CHECK: LV: Found an estimated cost of 6 for VF 16 For instruction: store i8 %19 | ||
| ; CHECK: LV: Vector loop of width 16 costs: 1. | ||
| ; CHECK: Cost of 6 for VF 2: REPLICATE ir<%10> = load ir<%9> |
There was a problem hiding this comment.
yep, the check lines. in the test are not really consistent throughout the test file, dropping them seemed a bit more consistent with other functions
| ; AVX512: LV: Found an estimated cost of 40 for VF 32 For instruction: %v4 = load double, ptr %in4, align 8 | ||
| ; AVX512: Cost of 9 for VF 2: INTERLEAVE-GROUP with factor 5 at %v0, ir<%in0> | ||
| ; AVX512: Cost of 18 for VF 4: INTERLEAVE-GROUP with factor 5 at %v0, ir<%in0> | ||
| ; AVX512: Cost of 35 for VF 8: INTERLEAVE-GROUP with factor 5 at %v0, ir<%in0> |
There was a problem hiding this comment.
should be restored, thanks
| ; AVX512: Cost of 11 for VF 2: INTERLEAVE-GROUP with factor 6 at %v0, ir<%in0> | ||
| ; AVX512: Cost of 21 for VF 4: INTERLEAVE-GROUP with factor 6 at %v0, ir<%in0> | ||
| ; AVX512: Cost of 51 for VF 8: INTERLEAVE-GROUP with factor 6 at %v0, ir<%in0> | ||
| ; |
There was a problem hiding this comment.
should be restored thanks
| ; AVX512: LV: Found an estimated cost of 20 for VF 16 For instruction: %v6 = load i64, ptr %in6, align 8 | ||
| ; AVX512: LV: Found an estimated cost of 20 for VF 16 For instruction: %v7 = load i64, ptr %in7, align 8 | ||
| ; AVX512: Cost of 14 for VF 2: INTERLEAVE-GROUP with factor 8 at %v0, ir<%in0> | ||
| ; AVX512: Cost of 40 for VF 4: INTERLEAVE-GROUP with factor 8 at %v0, ir<%in0> |
There was a problem hiding this comment.
should be added back and filter updated, thanks
| ; AVX512: LV: Found an estimated cost of 10 for VF 8 For instruction: store double %v6, ptr %out6, align 8 | ||
| ; AVX512: LV: Found an estimated cost of 10 for VF 8 For instruction: store double %v7, ptr %out7, align 8 | ||
| ; AVX512: Cost of 10 for VF 8: WIDEN store ir<%out0>, ir<%v0> | ||
| ; AVX512: Cost of 10 for VF 8: WIDEN store ir<%out1>, ir<%v1> |
There was a problem hiding this comment.
Not necessarily, but it would be difficult I think to include them from the filter. I left if as-is for now
| ; AVX512: LV: Found an estimated cost of 10 for VF 8 For instruction: store i64 %v6, ptr %out6, align 8 | ||
| ; AVX512: LV: Found an estimated cost of 10 for VF 8 For instruction: store i64 %v7, ptr %out7, align 8 | ||
| ; AVX512: Cost of 10 for VF 8: WIDEN store ir<%out0>, ir<%v> | ||
| ; AVX512: Cost of 10 for VF 8: WIDEN store ir<%out1>, ir<%v1> |
There was a problem hiding this comment.
Not necessarily, but it would be difficult I think to include them from the filter. I left if as-is for now
| ; AVX512: LV: Found an estimated cost of 20 for VF 16 For instruction: %v6 = load i64, ptr %in6, align 8 | ||
| ; AVX512: Cost of 12 for VF 2: INTERLEAVE-GROUP with factor 7 at %v0, ir<%in0> | ||
| ; AVX512: Cost of 35 for VF 4: INTERLEAVE-GROUP with factor 7 at %v0, ir<%in0> | ||
| ; AVX512: Cost of 70 for VF 8: INTERLEAVE-GROUP with factor 7 at %v0, ir<%in0> |
There was a problem hiding this comment.
should be added back for all VFs and filter updated, thanks
| ; AVX512: LV: Found an estimated cost of 0 for VF 4 For instruction: store i64 %v3, ptr %out3, align 8 | ||
| ; AVX512: LV: Found an estimated cost of 0 for VF 4 For instruction: store i64 %v4, ptr %out4, align 8 | ||
| ; AVX512: LV: Found an estimated cost of 0 for VF 4 For instruction: store i64 %v5, ptr %out5, align 8 | ||
| ; AVX512: LV: Found an estimated cost of 0 for VF 4 For instruction: store i64 %v6, ptr %out6, align 8 |
There was a problem hiding this comment.
should be added back and filter updated, thanks
…. (#190038) Move remaining tests checking legacy cost output to check the VPlan's cost model output. In some cases, checks become much more compact (checking a single interleave group cost vs checking the individual members which all have the group's cost). In some cases, auto-generation consistently checks all relevant VFs. PR: llvm/llvm-project#190038
Move remaining tests checking legacy cost output to check the VPlan's cost model output.
In some cases, checks become much more compact (checking a single interleave group cost vs checking the individual members which all have the group's cost). In some cases, auto-generation consistently checks all relevant VFs.