Skip to content

[Bitcode] Fix nondeterministic phi instruction order #151006

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

HackAttack
Copy link

@HackAttack HackAttack commented Jul 28, 2025

Currently there can be nondeterminism of the form

- <INST_PHI op0=307 op1=2 op2=22 op3=67 op4=29 op5=111 op6=57 ...>
+ <INST_PHI op0=307 op1=2 op2=22 op3=67 op4=29 op5=157 op6=71 ...>

in generated bitcode. Stably sorting the use-list entries solves that.

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Copy link
Member

llvmbot commented Jul 28, 2025

@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Michael Hackner (HackAttack)

Changes

Currently there can be nondeterminism of the form

- &lt;INST_PHI op0=307 op1=2 op2=22 op3=67 op4=29 op5=111 op6=57 ...&gt;
+ &lt;INST_PHI op0=307 op1=2 op2=22 op3=67 op4=29 op5=157 op6=71 ...&gt;

in generated bitcode. Sorting predecessors here solves that.


Full diff: https://github.com/llvm/llvm-project/pull/151006.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/Utils/SSAUpdater.cpp (+8-1)
diff --git a/llvm/lib/Transforms/Utils/SSAUpdater.cpp b/llvm/lib/Transforms/Utils/SSAUpdater.cpp
index 49d0d9584347e..ed03066a50dca 100644
--- a/llvm/lib/Transforms/Utils/SSAUpdater.cpp
+++ b/llvm/lib/Transforms/Utils/SSAUpdater.cpp
@@ -122,7 +122,14 @@ Value *SSAUpdater::GetValueInMiddleOfBlock(BasicBlock *BB) {
     }
   } else {
     bool isFirstPred = true;
-    for (BasicBlock *PredBB : predecessors(BB)) {
+
+    // Sort predecessors to get deterministic PHI operand ordering.
+    SmallVector<BasicBlock *, 8> SortedPreds(predecessors(BB));
+    llvm::sort(SortedPreds, [](BasicBlock *A, BasicBlock *B) {
+      return A->getNumber() < B->getNumber();
+    });
+
+    for (BasicBlock *PredBB : SortedPreds) {
       Value *PredVal = GetValueAtEndOfBlock(PredBB);
       PredValues.push_back(std::make_pair(PredBB, PredVal));
 

@HackAttack

This comment was marked as outdated.

@HackAttack

This comment was marked as resolved.

@HackAttack HackAttack force-pushed the phi-instruction-determinism branch from 142f3db to 6848eca Compare July 28, 2025 21:34
@llvmbot llvmbot added backend:AArch64 backend:AMDGPU llvm:analysis Includes value tracking, cost tables and constant folding labels Jul 28, 2025
@HackAttack HackAttack changed the title [SSAUpdater] Fix nondeterministic phi instruction order in bitcode [Bitcode] Fix nondeterministic phi instruction order Jul 29, 2025
@HackAttack HackAttack force-pushed the phi-instruction-determinism branch from a7079c0 to 2a4a81e Compare July 29, 2025 04:07
Copy link

github-actions bot commented Jul 29, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Currently there can be nondeterminism of the form

```diff
- <INST_PHI op0=307 op1=2 op2=22 op3=67 op4=29 op5=111 op6=57 ...>
+ <INST_PHI op0=307 op1=2 op2=22 op3=67 op4=29 op5=157 op6=71 ...>
```

in generated bitcode. Stably sorting the use-list entries solves that.
@HackAttack HackAttack force-pushed the phi-instruction-determinism branch from 2a4a81e to 97cbd1f Compare July 29, 2025 17:12
@efriedma-quic efriedma-quic requested a review from nikic July 29, 2025 17:27
Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to add a test for this? (Write some IR to bitcode, read it back, make sure the use order remains consistent, or something like that.)

I'm a little concerned that the comments suggest that the OM.lookup( calls are supposed to succeed... but they must be failing if two entries are comparing equal. So maybe this algorithm isn't working the way it's supposed to. @nikic, do you remember how this works?

@HackAttack
Copy link
Author

Is it possible to add a test for this? (Write some IR to bitcode, read it back, make sure the use order remains consistent, or something like that.)

I’m not sure how viable that is since any such test would itself be nondeterministic/flaky—sometimes it will detect nondeterminism, sometimes not. My manual testing involves running the same build repeatedly until I get a different output. Usually the nondeterminism, when it exists, shows up after 3–10 runs; and when I test a build that gets to 50 or so without differing outputs, I call that fixed…but it’s not guaranteed.

@efriedma-quic
Copy link
Collaborator

Obviously you can't write a test that shows a non-deterministic issue completely reliably. The point is to show what this is fixing, and hopefully triggers on someone's machine if a similar issue shows up again. So a test that has non-deterministic output without this patch is sufficient. (With the patch, the test should produce deterministic output.)

@HackAttack
Copy link
Author

I’m out of my depth here; I tried to use Cursor to generate an appropriate test but haven’t gotten the test to fail without this patch, so it’s probably not correct. I can continue iterating with some guidance, but I don’t really have an understanding of what I’m looking at.

AI-generated test (not sensitive enough)
; Test for deterministic phi instruction operand order in bitcode output.
; 
; This test addresses a nondeterminism bug where phi instruction operands
; could have different orderings in bitcode between runs, like:
; - <INST_PHI op0=307 op1=2 op2=22 op3=67 op4=29 op5=111 op6=57 ...>
; + <INST_PHI op0=307 op1=2 op2=22 op3=67 op4=29 op5=157 op6=71 ...>
;
; The fix was changing llvm::sort() to llvm::stable_sort() in 
; predictValueUseListOrderImpl() to ensure consistent ordering when 
; operands compare equal in the sorting function.
;
; This test creates conditions where multiple uses have the same operand number,
; which triggers the sorting instability that the stable_sort fix addresses.
;
; RUN: llvm-as < %s | llvm-bcanalyzer -dump | FileCheck %s
; RUN: llvm-as < %s | llvm-dis | llvm-as | llvm-bcanalyzer -dump | FileCheck %s

define i32 @test_phi_deterministic_order(i32 %x, i32 %y, i32 %z) {
entry:
  %cmp1 = icmp sgt i32 %x, 0
  br i1 %cmp1, label %if1, label %else1

if1:
  %add1 = add i32 %x, 1
  br label %merge1

else1:
  %sub1 = sub i32 %x, 1  
  br label %merge1

merge1:
  %phi1 = phi i32 [ %add1, %if1 ], [ %sub1, %else1 ]
  %cmp2 = icmp sgt i32 %y, 0
  br i1 %cmp2, label %if2, label %else2

if2:
  %add2 = add i32 %phi1, %y
  br label %merge2

else2:
  %sub2 = sub i32 %phi1, %y
  br label %merge2

merge2:
  %phi2 = phi i32 [ %add2, %if2 ], [ %sub2, %else2 ]
  %cmp3 = icmp sgt i32 %z, 0
  br i1 %cmp3, label %if3, label %else3

if3:
  %mul1 = mul i32 %phi2, %z
  br label %merge3

else3:
  %div1 = sdiv i32 %phi2, %z
  br label %merge3

merge3:
  %phi3 = phi i32 [ %mul1, %if3 ], [ %div1, %else3 ]
  %cmp4 = icmp sgt i32 %phi3, 100
  br i1 %cmp4, label %final1, label %final2

final1:
  ; Create multiple uses of phi1 with the same operand number (0)
  ; This should trigger the sorting instability
  %val1 = add i32 %phi1, %phi1  ; phi1 used twice in operand 0
  br label %exit

final2:
  ; Create multiple uses of phi2 with the same operand number (0)  
  %val2 = sub i32 %phi2, %phi2  ; phi2 used twice in operand 0
  br label %exit

exit:
  %result = phi i32 [ %val1, %final1 ], [ %val2, %final2 ]
  ret i32 %result
}

; Verify that phi instructions have deterministic operand encoding
; The stable_sort fix ensures consistent ordering across runs
; CHECK: <INST_PHI op0=1 op1=4 op2=1 op3=2 op4=2/>
; CHECK: <INST_PHI op0=1 op1=4 op2=4 op3=2 op4=5/>
; CHECK: <INST_PHI op0=1 op1=4 op2=7 op3=2 op4=8/>
; CHECK: <INST_PHI op0=1 op1=4 op2=10 op3=2 op4=11/>
; CHECK: <INST_PHI op0=1 op1=4 op2=1 op3=2 op4=2/>

; Function that creates more complex use patterns with same operand numbers
define i32 @test_multiple_uses_same_operand(i32 %a, i32 %b) {
entry:
  %cond = icmp eq i32 %a, %b
  br i1 %cond, label %then, label %else

then:
  %val1 = add i32 %a, 10
  br label %merge

else:
  %val2 = sub i32 %a, 10
  br label %merge

merge:
  %phi = phi i32 [ %val1, %then ], [ %val2, %else ]
  
  ; Create multiple instructions that use phi multiple times with same operand numbers
  %use1 = add i32 %phi, %phi      ; phi used twice in operand 0
  %use2 = mul i32 %use1, %phi     ; phi used in operand 0 again
  %use3 = and i32 %phi, %phi      ; phi used twice in operand 0
  %use4 = or i32 %use3, %phi      ; phi used in operand 0 again
  
  ; Combine all uses
  %result = add i32 %use2, %use4
  ret i32 %result
} 

@efriedma-quic
Copy link
Collaborator

I assume you have some IR somewhere that does reproduce the issue... can you reduce that IR using llvm-reduce?

@nikic
Copy link
Contributor

nikic commented Jul 30, 2025

I'm a little concerned that the comments suggest that the OM.lookup( calls are supposed to succeed... but they must be failing if two entries are comparing equal. So maybe this algorithm isn't working the way it's supposed to. @nikic, do you remember how this works?

The preceding loop only puts entries into List which have an OrderMap entry. So these lookups are indeed not supposed to fail.

TBH it's not immediately obvious under which circumstances the stable sort would make a difference here.

@HackAttack
Copy link
Author

I assume you have some IR somewhere that does reproduce the issue... can you reduce that IR using llvm-reduce?

The IR is nondeterministic in the same way.

-  %250 = phi { ptr, i32 } [ %248, %247 ], [ %295, %284 ], [ %375, %378 ], [ %375, %374 ], [ %440, %443 ], [ %440, %439 ], [ %339, %338 ], [ %326, %325 ], [ %295, %543 ], [ %548, %546 ] ; [#uses=1 type={ ptr, i32 }]
+  %250 = phi { ptr, i32 } [ %248, %247 ], [ %295, %284 ], [ %339, %338 ], [ %326, %325 ], [ %375, %378 ], [ %375, %374 ], [ %440, %443 ], [ %440, %439 ], [ %295, %543 ], [ %548, %546 ] ; [#uses=1 type={ ptr, i32 }]

llvm-reduce seems to need a test case to begin with, so I’m not sure how I would use it to create one.

Here’s a complete repro (requires Rust toolchain):

git clone https://github.com/apache/datafusion.git --depth 1 -b 36.0.0
cd datafusion
cargo update --package chrono --precise 0.4.39
cargo build --release -p datafusion-expr
cp target/release/libdatafusion_expr.rlib 1.rlib
cp 1.rlib 2.rlib
# Will terminate
while diff -q 1.rlib 2.rlib; do rm -rf target/release/deps/libdatafusion_* target/release/libdatafusion_expr.* && cargo build --release -p datafusion-expr && cp target/release/libdatafusion_expr.rlib 2.rlib; done

And to show this fix works:

git clone https://github.com/rust-lang/rust --depth 1 -b 1.88.0 ../rust
cd ../rust
git submodule update --init --depth 1 src/llvm-project
curl -fsSL https://github.com/llvm/llvm-project/pull/151006.patch | git -C src/llvm-project apply
./x build library --set llvm.download-ci-llvm=false
cd ../datafusion
export RUSTC=../rust/build/host/stage1/bin/rustc
cargo build --release -p datafusion-expr
cp target/release/libdatafusion_expr.rlib 1.rlib
cp 1.rlib 2.rlib
# Will run forever
while diff -q 1.rlib 2.rlib; do rm -rf target/release/deps/libdatafusion_* target/release/libdatafusion_expr.* && cargo build --release -p datafusion-expr && cp target/release/libdatafusion_expr.rlib 2.rlib; done

@efriedma-quic
Copy link
Collaborator

Does that LLVM build have assertions enabled? Can you try enabling assertions to see if anything comes up?

@efriedma-quic
Copy link
Collaborator

The loop before checks the User in the OrderMap before the loop, so LID and RID can't be zero. LU != RU, so we're dealing with two different Uses. Since these changes are having an effect, LID == RID, and LU->getOperandNo() == RU->getOperandNo(). The OrderMap should, by construction not contain any duplicate entries, so LID == RID implies LU->getUser() == RU->getUser(). This is a contradiction: if two Use pointers have the same user, and the same operand number, they must be the same use.

In other words, the following assertion should never trigger:

diff --git a/llvm/lib/Bitcode/Writer/ValueEnumerator.cpp b/llvm/lib/Bitcode/Writer/ValueEnumerator.cpp
index e133abe..ef99cd3 100644
--- a/llvm/lib/Bitcode/Writer/ValueEnumerator.cpp
+++ b/llvm/lib/Bitcode/Writer/ValueEnumerator.cpp
@@ -209,6 +209,8 @@ static void predictValueUseListOrderImpl(const Value *V, const Function *F,
       return true;
     }

+    assert(LU->getUser() == RU->getUser());
+
     // LID and RID are equal, so we have different operands of the same user.
     // Assume operands are added in order for all instructions.
     if (LID <= ID)

@HackAttack
Copy link
Author

Ok, I discovered something interesting/embarrassing: the nondeterminism seems to go away when running rustc against a custom-built LLVM regardless of whether this patch is applied! So there’s currently not enough evidence to conclude that the bug is with LLVM. I’ll close this and link a bug against rust-lang/rust. Thank you for your time!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 backend:AMDGPU llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants