[LICM][MustExec] Make must-exec logic for IV condition commutative (llvm#93150)

MustExec has special logic to determine whether the first loop iteration
will always be executed, by simplifying the IV comparison with the start
value. Currently, this code assumes that the IV is on the LHS of the
comparison, but this is not guaranteed. Make sure it handles the
commuted variant as well.
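
The commuted handling can be illustrated outside of LLVM with a minimal sketch. Everything below is a hypothetical stand-in (the `Pred` enum, `swapped`, `eval`, and `firstIterationCond` are not LLVM APIs); it only shows why swapping the predicate lets the same "substitute the IV start value" check work when the IV is the second operand.

```cpp
#include <cstdint>

// Hypothetical stand-in for a few unsigned icmp predicates (not LLVM's enum).
enum class Pred { EQ, NE, ULT, ULE, UGT, UGE };

// Returns P' such that (a P b) == (b P' a) for all a and b.
constexpr Pred swapped(Pred P) {
  switch (P) {
  case Pred::ULT: return Pred::UGT;
  case Pred::ULE: return Pred::UGE;
  case Pred::UGT: return Pred::ULT;
  case Pred::UGE: return Pred::ULE;
  default:        return P; // EQ and NE are symmetric.
  }
}

// Evaluates "A P B" on concrete 32-bit values.
constexpr bool eval(Pred P, std::uint32_t A, std::uint32_t B) {
  switch (P) {
  case Pred::EQ:  return A == B;
  case Pred::NE:  return A != B;
  case Pred::ULT: return A < B;
  case Pred::ULE: return A <= B;
  case Pred::UGT: return A > B;
  case Pred::UGE: return A >= B;
  }
  return false;
}

// Value of the loop condition on the first iteration. The comparison is
// normalized so that the IV is always on the left: if the IV was operand 1,
// the predicate is swapped before the start value is substituted.
constexpr bool firstIterationCond(Pred P, bool IVIsOperand0,
                                  std::uint32_t IVStart, std::uint32_t Other) {
  return eval(IVIsOperand0 ? P : swapped(P), IVStart, Other);
}

// Shaped like "icmp uge len, iv" with iv starting at 0: after swapping this is
// "iv ule len", which holds for the start value.
static_assert(firstIterationCond(Pred::UGE, /*IVIsOperand0=*/false, 0u, 1000u),
              "commuted compare is true on the first iteration");
// Normalizing must not change the result of the original comparison.
static_assert(firstIterationCond(Pred::UGE, false, 0u, 1000u) ==
                  eval(Pred::UGE, 1000u, 0u),
              "swap preserves the comparison result");
```

The sketch evaluates concrete values, whereas the real code asks the simplifier whether the comparison folds to a constant; the normalization step is the part the two have in common.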

The changed PhaseOrdering test previously performed peeling to make the
loads dereferenceable -- as a side effect, this also reduced the exit
count by one, avoiding the awkward <= MAX case.

Now we know up-front that the loads are dereferenceable and can be simply
hoisted. As such, we retain the original exit count and now have to
handle it by widening the exit count calculation to i128. This is a
regression, but at least it preserves the vectorization, which was the
original goal. I'm not sure what else can be done about that test.
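
As a rough illustration of why a "<= MAX" exit count is awkward, the sketch below assumes the problem is the usual one: the trip count is the backedge-taken count plus one, which wraps if computed at the same width. It uses 32-bit and 64-bit integers as a smaller analogue of widening to i128; the function names are hypothetical and not taken from the test or from LLVM.

```cpp
#include <cstdint>
#include <limits>

// Trip count computed at the same width as the backedge-taken count.
// Unsigned arithmetic wraps, so a count of MAX yields 0 instead of MAX + 1.
constexpr std::uint32_t tripCountNarrow(std::uint32_t BackedgeTakenCount) {
  return BackedgeTakenCount + 1u;
}

// Trip count computed after widening: the addition can no longer wrap.
constexpr std::uint64_t tripCountWide(std::uint32_t BackedgeTakenCount) {
  return static_cast<std::uint64_t>(BackedgeTakenCount) + 1u;
}

constexpr std::uint32_t Max32 = std::numeric_limits<std::uint32_t>::max();

static_assert(tripCountNarrow(Max32) == 0u, "narrow computation wraps to 0");
static_assert(tripCountWide(Max32) == 4294967296ull,
              "widened computation is exact");
static_assert(tripCountNarrow(9u) == tripCountWide(9u),
              "both agree away from the MAX boundary");
```

Peeling one iteration, as the test did before this change, lowers the count by one and so sidesteps the boundary case without needing the wider type.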
nikic committed Aug 8, 2024
1 parent fdf8e3e commit 37a94b7
Showing 3 changed files with 148 additions and 161 deletions.
19 changes: 12 additions & 7 deletions llvm/lib/Analysis/MustExecute.cpp
@@ -135,16 +135,21 @@ static bool CanProveNotTakenFirstIteration(const BasicBlock *ExitBlock,
   // todo: this would be a lot more powerful if we used scev, but all the
   // plumbing is currently missing to pass a pointer in from the pass
   // Check for cmp (phi [x, preheader] ...), y where (pred x, y is known
+  ICmpInst::Predicate Pred = Cond->getPredicate();
   auto *LHS = dyn_cast<PHINode>(Cond->getOperand(0));
   auto *RHS = Cond->getOperand(1);
-  if (!LHS || LHS->getParent() != CurLoop->getHeader())
-    return false;
-  auto DL = ExitBlock->getDataLayout();
+  if (!LHS || LHS->getParent() != CurLoop->getHeader()) {
+    Pred = Cond->getSwappedPredicate();
+    LHS = dyn_cast<PHINode>(Cond->getOperand(1));
+    RHS = Cond->getOperand(0);
+    if (!LHS || LHS->getParent() != CurLoop->getHeader())
+      return false;
+  }
+
+  auto DL = ExitBlock->getModule()->getDataLayout();
   auto *IVStart = LHS->getIncomingValueForBlock(CurLoop->getLoopPreheader());
-  auto *SimpleValOrNull = simplifyCmpInst(Cond->getPredicate(),
-                                          IVStart, RHS,
-                                          {DL, /*TLI*/ nullptr,
-                                              DT, /*AC*/ nullptr, BI});
+  auto *SimpleValOrNull = simplifyCmpInst(
+      Pred, IVStart, RHS, {DL, /*TLI*/ nullptr, DT, /*AC*/ nullptr, BI});
   auto *SimpleCst = dyn_cast_or_null<Constant>(SimpleValOrNull);
   if (!SimpleCst)
     return false;
3 changes: 1 addition & 2 deletions llvm/test/Transforms/LICM/hoist-mustexec.ll
@@ -218,7 +218,6 @@ fail:
 }

 ; Same as previous case, with commuted icmp.
-; FIXME: The load should get hoisted here as well.
 define i32 @test3_commuted(ptr noalias nocapture readonly %a) nounwind uwtable {
 ; CHECK-LABEL: define i32 @test3_commuted(
 ; CHECK-SAME: ptr noalias nocapture readonly [[A:%.*]]) #[[ATTR1]] {
@@ -227,14 +226,14 @@ define i32 @test3_commuted(ptr noalias nocapture readonly %a) nounwind uwtable {
 ; CHECK-NEXT: [[IS_ZERO:%.*]] = icmp eq i32 [[LEN]], 0
 ; CHECK-NEXT: br i1 [[IS_ZERO]], label [[FAIL:%.*]], label [[PREHEADER:%.*]]
 ; CHECK: preheader:
+; CHECK-NEXT: [[I1:%.*]] = load i32, ptr [[A]], align 4
 ; CHECK-NEXT: br label [[FOR_BODY:%.*]]
 ; CHECK: for.body:
 ; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[PREHEADER]] ], [ [[INC:%.*]], [[CONTINUE:%.*]] ]
 ; CHECK-NEXT: [[ACC:%.*]] = phi i32 [ 0, [[PREHEADER]] ], [ [[ADD:%.*]], [[CONTINUE]] ]
 ; CHECK-NEXT: [[R_CHK:%.*]] = icmp uge i32 [[LEN]], [[IV]]
 ; CHECK-NEXT: br i1 [[R_CHK]], label [[CONTINUE]], label [[FAIL_LOOPEXIT:%.*]]
 ; CHECK: continue:
-; CHECK-NEXT: [[I1:%.*]] = load i32, ptr [[A]], align 4
 ; CHECK-NEXT: [[ADD]] = add nsw i32 [[I1]], [[ACC]]
 ; CHECK-NEXT: [[INC]] = add nuw nsw i32 [[IV]], 1
 ; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i32 [[INC]], 1000