Skip to content

Conversation

VSadov
Copy link
Member

@VSadov VSadov commented Sep 6, 2025

Mostly to see what happens and how far we can get with this.

@dotnet-policy-service dotnet-policy-service bot added the linkable-framework Issues associated with delivering a linker friendly framework label Sep 6, 2025
Copy link
Contributor

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

@VSadov
Copy link
Member Author

VSadov commented Sep 6, 2025

CC: @jakobbotsch @agocke

@VSadov VSadov added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Sep 6, 2025
@VSadov
Copy link
Member Author

VSadov commented Sep 6, 2025

A lot of tests actually run and pass when I run this locally. More than I expected. :-)

@VSadov
Copy link
Member Author

VSadov commented Sep 6, 2025

Some JIT asserts:
(could be Roslyn NYIs, could be JIT issues)

  Assert failure(PID 49480 [0x0000c148], Thread: 48504 [0xbd78]): Assertion failed 'lastBlockILEndOffset < beginOffs' in 'System.Net.Tests.HttpListenerWebSocketTests:ReceiveAsync_ReadBuffer_WithWindowsAuthScheme_Success():this' during 'Generate code' (IL size 302; hash 0xef07e726; Tier0)

      File: E:\dotnet004\runtime\src\coreclr\jit\scopeinfo.cpp:1566
      Image: E:\dotnet004\runtime\artifacts\bin\testhost\net10.0-windows-Release-x64\dotnet.exe
  Assert failure(PID 62300 [0x0000f35c], Thread: 12816 [0x3210]): Assertion failed '!varTypeIsStruct(node) && !varTypeIsStruct(type)' in 'System.IO.Stream:ReadAsync(System.Memory`1[byte],System.Threading.CancellationToken):int:this' during 'Importation' (IL size 40; hash 0xaffe5a9c; FullOpts)

      File: E:\dotnet004\runtime\src\coreclr\jit\gentree.h:4710
      Image: E:\dotnet004\runtime\artifacts\bin\testhost\net10.0-windows-Release-x64\dotnet.exe

@VSadov
Copy link
Member Author

VSadov commented Sep 6, 2025

Something odd is going on with sync context in a few cases:

System.Net.Http.Functional.Tests.PlatformHandler_HttpClientHandler_Asynchrony_Test.ResponseHeadersRead_SynchronizationContextNotUsedByHandler(responseHeadersRead: False, contentMode: BytePerChunk) [FAIL]
        Sync Ctx used:    at System.Environment.get_StackTrace() in E:\dotnet004\runtime\src\libraries\System.Private.CoreLib\src\System\Environment.cs:line 239
           at System.Threading.Tests.TrackingSynchronizationContext.Post(SendOrPostCallback d, Object state)
           at System.Threading.Tasks.AwaitTaskContinuation.RunCallback(ContextCallback callback, Object state, Task& currentTask) in E:\dotnet004\runtime\src\libraries\System.Private.CoreLib\src\System\Threading\Tasks\TaskContinuation.cs:line 697
           at System.Threading.Tasks.Task.RunContinuations(Object continuationObject) in E:\dotnet004\runtime\src\libraries\System.Private.CoreLib\src\System\Threading\Tasks\Task.cs:line 3486
           at System.Threading.Tasks.Task`1.TrySetResult(TResult result) in E:\dotnet004\runtime\src\libraries\System.Private.CoreLib\src\System\Threading\Tasks\Future.cs:line 385
           at System.Threading.Tasks.TaskCompletionSource`1.TrySetResult(TResult result) in E:\dotnet004\runtime\src\libraries\System.Private.CoreLib\src\System\Threading\Tasks\TaskCompletionSource_T.cs:line 218
           at System.Net.Http.WinHttpHandler.StartRequestAsync(WinHttpRequestState state) in E:\dotnet004\runtime\src\libraries\System.Net.Http.WinHttpHandler\src\System\Net\Http\WinHttpHandler.cs:line 1007
           at System.Runtime.CompilerServices.AsyncHelpers.ThunkTaskCore.MoveNext[T,TOps](T task) in E:\dotnet004\runtime\src\coreclr\System.Private.CoreLib\src\System\Runtime\CompilerServices\AsyncHelpers.CoreCLR.cs:line 354
           at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state) in E:\dotnet004\runtime\src\libraries\System.Private.CoreLib\src\System\Threading\ExecutionContext.cs:line 264
           at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread) in E:\dotnet004\runtime\src\libraries\System.Private.CoreLib\src\System\Threading\Tasks\Task.cs:line 2346
           at System.Threading.ThreadPoolWorkQueue.Dispatch() in E:\dotnet004\runtime\src\libraries\System.Private.CoreLib\src\System\Threading\ThreadPoolWorkQueue.cs:line 1154
           at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart() in E:\dotnet004\runtime\src\libraries\System.Private.CoreLib\src\System\Threading\PortableThreadPool.WorkerThread.cs:line 118
           at System.Threading.Thread.StartCallback() in E:\dotnet004\runtime\src\coreclr\System.Private.CoreLib\src\System\Threading\Thread.CoreCLR.cs:line 104

@VSadov
Copy link
Member Author

VSadov commented Sep 9, 2025

I have a fix for the following assert:

  Assert failure(PID 7324 [0x00001c9c], Thread: 29684 [0x73f4]): Assertion failed 'callInfo.SaveAndRestoreSynchronizationContextField && (callInfo.SynchronizationContextLclNum != BAD_VAR_NUM)' in 'System.IO.Compression.Tests.ZipFile_Extract_Stream:ExtractToDirectoryRoundTrip(bool):this' during 'Transform async' (IL size 90; hash 0x550642b2; Tier0)

      File: E:\dotnet004\runtime\src\coreclr\jit\async.cpp:1510
      Image: E:\dotnet004\runtime\artifacts\bin\testhost\net10.0-windows-Release-x64\dotnet.exe

looking at others.

Comment on lines +962 to +968
// TODO: (async) we do not need to save ByRef-containing locals
// as by the spec an await turns them into zero-inited state.
// For now just store/restore as if there are no gc refs.
// This is mostly to handle the "fake" live-across-await byrefs
// in min-opts, since C#-compiled code by itself does not let
// byrefs be live across awaits.
unsigned objCount = layout->HasGCByRef() ? 0 : layout->GetGCPtrCount();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to figure out what is creating these LIR edges that are live across the await and stop doing that. This fix will replace assert with bad codegen instead.

Copy link
Member Author

@VSadov VSadov Sep 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I typically see the assert when compiling Debug code for Task-returning methods with ref-like parameters.
Ex:

        public static Task WhenAll(params ReadOnlySpan<Task> tasks)
        {

async
This method is not async, but it's thunk will be async and will also have a byref-like parameter.
The part that the edge lives across the await is likely a result of Debug not tracking liveness precisely.

Once we have zeroing of byrefs that live across await, we may not need this.
This is not a fix. It is a workaround. - my goal is to get all Libraries tests pass or find something truly blocking.

Copy link
Member Author

@VSadov VSadov Sep 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming that it is a result of Debug emit, perhaps forcing thunks to always compile optimized might be an alternative workaround, if there is an easy way to force.

I just thought of this workaround first and it seems working well enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The part that the edge lives across the await is likely a result of Debug not tracking liveness precisely.

There are two sources of live state across async calls in JIT IR:

  1. Locals. These behave like IL locals and can be multiply defined and multiply used. They are what liveness analysis treat. They are added by AsyncLiveness::GetLiveLocals.
  2. LIR edges. These are single-def single-use (SDSU) values that are defined in one place in a basic block and consumed later in the same basic block. When they overlap an async call (defined before, used after) they must also be preserved on heap. Liveness analysis does not treat these; they are known to be live when they overlap the call. They are added by AsyncTransformation::LiftLIREdges.

The liveness imprecision only comes into play for (1). However, we already ignore byref locals for these. It happens here:

if ((dsc->TypeIs(TYP_BYREF) && !dsc->IsImplicitByRef()) ||
(dsc->TypeIs(TYP_STRUCT) && dsc->GetLayout()->HasGCByRef()))
{
// Even if these are address exposed we expect them to be dead at
// suspension points. TODO: It would be good to somehow verify these
// aren't obviously live, if the JIT creates live ranges that span a
// suspension point then this makes it quite hard to diagnose that.
return false;
}

That just leaves (2). But since these are known to be live, it is (almost?) always going to be a bug that resulted in these.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I will log an issue on this - to follow up with real fix.

@VSadov
Copy link
Member Author

VSadov commented Sep 11, 2025

I see only 26 failed tests now in Chk/Ret configuration on win-x64.

There are 23 cases of Assertion failed 'lastBlockILEndOffset < beginOffs'
I will look at this next.

There are also 8 cases of Sync Ctx used: . . .
That could be harder to figure. I am not sure it is always deterministic.

Also fixing one thing sometimes exposes another, but so far with every fix/workaround I see more tests passing.

@VSadov
Copy link
Member Author

VSadov commented Sep 17, 2025

The AVs in FinalizeTaskReturningThunk appear to be cased by a rare case where GS cookie check uses the same register as continuation.

Logged an issue on that - #119796 and pushed a possible fix here.

Now my local run is down to:

  • 11 failures in -rc checked -lc debug config
  • 15 failures in -rc checked -lc release config

Comment on lines 105 to 109
// a return register or contain 'this' pointer (keep alive this), or
// a continuation register, since we are generating GS cookie check
// after a GT_RETURN block.
// ARG_1 is EDX or RSI, depending on platform. Either way it fits our reqirements
regGSCheck = REG_ARG_1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to PR this separately.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make a PR for this and couple other small fixes that seem uncontroversial.

@VSadov
Copy link
Member Author

VSadov commented Sep 18, 2025

/azp run runtime-libraries-coreclr outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@VSadov
Copy link
Member Author

VSadov commented Sep 18, 2025

rebased onto recent main to pick the fix in #119806

@VSadov VSadov force-pushed the asyncLibs branch 2 times, most recently from 7c33637 to ca576eb Compare September 21, 2025 17:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-VM-coreclr linkable-framework Issues associated with delivering a linker friendly framework NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) runtime-async
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants