Skip to content

Conversation

jakobbotsch
Copy link
Member

On SysV it is possible to have mixed SIMD/integer return registers for calls. This leads to a case where the first return register is a SIMD register, while the second return register is an integer register. In that particular case, the emitter would get confused and expect GC information for the second return register to refer to rdx, when in reality it refers to rax.

Fix the problem by detecting the case where the second register is rax, and communicating the right information back to the emitter. Also add a test case that reliably segfaults under DOTNET_GCStress=C without the fix.

Fix #115815

…rating calls

On SysV it is possible to have mixed SIMD/integer return registers for
calls. This leads to a case where the first return register is a SIMD
register, while the second return register is an integer register. In
that particular case, the emitter would get confused and expect GC
information for the second return register to refer to `rdx`, when in
reality it refers to `rax`.

Fix the problem by detecting the case where the second register is
`rax`, and communicating the right information back to the emitter. Also
add a test case that reliably segfaults under `DOTNET_GCStress=C`
without the fix.
@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 21, 2025
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@jakobbotsch
Copy link
Member Author

@shushanhf @tomeksowi You may want to check whether LA64/RISCV64 suffer from the same possibility, I am not familiar enough with the ABI there to know.

@jakobbotsch
Copy link
Member Author

/azp run runtime-coreclr gcstress0x3-gcstress0xc

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@tomeksowi
Copy link
Member

@shushanhf @tomeksowi You may want to check whether LA64/RISCV64 suffer from the same possibility, I am not familiar enough with the ABI there to know.

Thanks for the heads-up. We have the same problem IMO, the integer field of an int-float struct is passed/returned in a0 regardless whether it comes first or second.

But it doesn't manifest in this testcase because the classification doesn't handle reordered fields well, it bails out here:

FieldDesc* fields = pMT->GetApproxFieldDescListRaw();
int elementTypeIndex = typeIndex;
for (int i = 0; i < nFields; ++i)
{
if (i > 0 && fields[i-1].GetOffset() + fields[i-1].GetSize() > fields[i].GetOffset())
{
LOG((LF_JIT, LL_EVERYTHING, "FpStructInRegistersInfo:%*s "
" * fields %s [%i..%i) and %s [%i..%i) overlap, treat as union\n",
nestingLevel * 4, "",
fields[i-1].GetDebugName(), fields[i-1].GetOffset(), fields[i-1].GetOffset() + fields[i-1].GetSize(),
fields[i].GetDebugName(), fields[i].GetOffset(), fields[i].GetOffset() + fields[i].GetSize()));
return false;
}

I'll try to fix it so KeyValuePair<Container, double> gets lowered.

@jakobbotsch jakobbotsch marked this pull request as ready for review May 21, 2025 22:22
@Copilot Copilot AI review requested due to automatic review settings May 21, 2025 22:22
@jakobbotsch
Copy link
Member Author

cc @dotnet/jit-contrib PTAL @kunalspathak

@jakobbotsch jakobbotsch requested a review from kunalspathak May 21, 2025 22:23
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a mismatch in GC info for mixed SIMD/integer return registers on SysV when emitting calls, and adds a repro test that would segfault under GC stress without the fix.

  • Adjusts genCallInstruction to swap retSize when the second ABI return register is rax
  • Introduces a new JIT test (Runtime_115815) to catch the issue under DOTNET_GCStress=C
  • Adds a project file for the new test

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
src/tests/JIT/Regression/JitBlue/Runtime_115815/Runtime_115815.csproj Adds project file for the new GC-stress repro test
src/tests/JIT/Regression/JitBlue/Runtime_115815/Runtime_115815.cs Implements the test that exposes the return-reg mismatch
src/coreclr/jit/codegenxarch.cpp Swaps params.retSize and params.secondRetSize when second return register is REG_INTRET

Copy link
Contributor

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jakobbotsch jakobbotsch merged commit 113198b into dotnet:main May 22, 2025
126 of 130 checks passed
@jakobbotsch jakobbotsch deleted the fix-115815 branch May 22, 2025 08:27
@LuckyXu-HF
Copy link
Contributor

@shushanhf @tomeksowi You may want to check whether LA64/RISCV64 suffer from the same possibility, I am not familiar enough with the ABI there to know.

Thanks for the heads-up. We have the same problem IMO, the integer field of an int-float struct is passed/returned in a0 regardless whether it comes first or second.

But it doesn't manifest in this testcase because the classification doesn't handle reordered fields well, it bails out here:

FieldDesc* fields = pMT->GetApproxFieldDescListRaw();
int elementTypeIndex = typeIndex;
for (int i = 0; i < nFields; ++i)
{
if (i > 0 && fields[i-1].GetOffset() + fields[i-1].GetSize() > fields[i].GetOffset())
{
LOG((LF_JIT, LL_EVERYTHING, "FpStructInRegistersInfo:%*s "
" * fields %s [%i..%i) and %s [%i..%i) overlap, treat as union\n",
nestingLevel * 4, "",
fields[i-1].GetDebugName(), fields[i-1].GetOffset(), fields[i-1].GetOffset() + fields[i-1].GetSize(),
fields[i].GetDebugName(), fields[i].GetOffset(), fields[i].GetOffset() + fields[i].GetSize()));
return false;
}

I'll try to fix it so KeyValuePair<Container, double> gets lowered.

Thank you, LA64 has the same result with RV64.
Besides, LA64 will get stuck of this new testcase Runtime_115815 under DOTNET_GCStress=1/3, and linux-x64 will also stuck (at least 2h with cpu 100%) under DOTNET_GCStress=1/3, do you encounter the same situation?

@jakobbotsch
Copy link
Member Author

Thank you, LA64 has the same result with RV64.
Besides, LA64 will get stuck of this new testcase Runtime_115815 under DOTNET_GCStress=1/3, and linux-x64 will also stuck (at least 2h with cpu 100%) under DOTNET_GCStress=1/3, do you encounter the same situation?

Yeah, looks like 1000000 allocations is too much for GCStress=1. Thanks for letting me know. Let me open a PR to decrease the number of allocations.

SimaTian pushed a commit that referenced this pull request May 27, 2025
…rating calls (#115827)

On SysV it is possible to have mixed SIMD/integer return registers for
calls. This leads to a case where the first return register is a SIMD
register, while the second return register is an integer register. In
that particular case, the emitter would get confused and expect GC
information for the second return register to refer to `rdx`, when in
reality it refers to `rax`.

Fix the problem by detecting the case where the second register is
`rax`, and communicating the right information back to the emitter. Also
add a test case that reliably segfaults under `DOTNET_GCStress=C`
without the fix.
@github-actions github-actions bot locked and limited conversation to collaborators Jun 22, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reproducible segfaults during GC on .NET 8/9 Linux x64
4 participants