New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Fix async lock #4485

Merged

mconnew merged 2 commits into dotnet:main from mconnew:FixAsyncLock

Sep 20, 2021

Member

mconnew commented Jan 8, 2021

Fixed a SocketConnection memory leak where we were pinning the SocketConnection and buffer in memory after the socket is closed due to not cancelling timers. NetFx doesn't actually support synchronous completion of Socket.SendAsync or Socket.ReceiveAsync so this bug wasn't discovered on NetFx. There was also a bug with not disposing the System.Threading.Timer instances at close. Switched back to IOThreadTimer that NetFx uses which is cheap to cancel.
Fixed IOThreadTimer issue where the first WaitableTimer wasn't being reset so could have caused what would effectively be an asynchronous while(true) loop.
Fixed AsyncLock to correctly use a static AsyncLocal instead of an instance. This was causing a memory leak. Adjusted logic to account for multiple AsyncLock's on the same ExecutionContext. This was the cause of #4435.
Fixes #4435
Other issues were found while investigating #4435

Member Author

mconnew commented Jan 8, 2021

@stephentoub, could you look at my AsyncLock implementation and check I haven't missed anything else obvious?

HongGit approved these changes

View reviewed changes

stephentoub reviewed

View reviewed changes

src/System.Private.ServiceModel/src/Internals/System/Runtime/AsyncLock.cs Outdated Show resolved Hide resolved

src/System.Private.ServiceModel/src/Internals/System/Runtime/AsyncLock.cs Outdated Show resolved Hide resolved

src/System.Private.ServiceModel/src/Internals/System/Runtime/AsyncLock.cs Outdated Show resolved Hide resolved

Base automatically changed from master to main

February 6, 2021 00:42

mconnew force-pushed the FixAsyncLock branch from a75fc62 to 8f7c301 Compare

August 18, 2021 20:22

Member Author

mconnew commented Aug 18, 2021

This PR also fixes a couple of minor issues discovered while debugging the AsyncLock where I found the ExecutionContext was being held on to by the Timer based IOTimer (which IOThreadTimer doesn't do).

Member Author

mconnew commented Aug 19, 2021

@stephentoub, can you take a look at my new version?

mconnew requested review from HongGit and stephentoub

August 19, 2021 17:40

stephentoub reviewed

View reviewed changes

src/System.Private.ServiceModel/src/Internals/System/Runtime/AsyncLock.cs Outdated Show resolved Hide resolved

src/System.Private.ServiceModel/src/Internals/System/Runtime/AsyncLock.cs Show resolved Hide resolved

src/System.Private.ServiceModel/src/Internals/System/Runtime/AsyncLock.cs Outdated

-                          _lockTakenCallStackString = _lockTakenCallStack.ToString();
-              #endif
-                          return _semaphoreRelease;
+                          await currentSemaphore.WaitAsync(cancellationToken);

Member

stephentoub Aug 26, 2021

Is it by design that the semaphore which was rented from the pool doesn't get returned if this wait is canceled?

Member Author

mconnew Aug 27, 2021

It wasn't. But I just realized, am I going to have a problem here? After the await, if I put the currentSemaphore back as _currentSemaphore.Value, is that going to work? The await would have caused the EC to be captured in this method and putting the previous SemaphoreSlim back is going to cause the EC to be copied and the new copy modified. When the caller completes awaiting TakeLockAsync, it's original EC is going to be restored which has the new SemaphoreSlim set in _currentSemaphore.Value as that was done in a Task returning sync implemented method before the call to the async method. It looks like I can't support cancellation with this implementation, have I understood that correctly? Not supporting cancellation would be fine as you can't cancel a regular lock statement.

src/System.Private.ServiceModel/src/Internals/System/Runtime/AsyncLock.cs Outdated Show resolved Hide resolved

src/System.Private.ServiceModel/src/Internals/System/Runtime/AsyncLock.cs Outdated

+                          // Ensure not in use
+                          await _topLevelSemaphore.WaitAsync();
+                          _topLevelSemaphore.Release();

Member

stephentoub Aug 26, 2021

This code suggests there's a concern the semaphore was potentially in use. Given that concern, what prevents it from being used by whatever that other code was after this Release call? Is this just a best effort thing, and the expectation is any use of _topLevelSemaphore after DisposeAsync has been invoked is erroneous?

Member Author

mconnew Aug 27, 2021

It's more than a best effort. Calls to TakeLock and TakeLockAsync are protected by an if(_disposed) so once DisposeAsync has been called (which sets _disposed), neither of those methods can be called. This means if DisposeAsync is able to acquire the semaphore, it's guaranteed no other thread is able to. This guarantees it's safe to return it to the pool without risk of a use after release type of problem. The idea is also that if some code tries to cleanup an AsyncLock (e.g. channel is being closed) and some method still has the lock acquired, the DisposeAsync will wait for that code to release the lock before completing. So it's to ensure correctness, and to allow disposing while something is still in flight safely.

src/System.Private.ServiceModel/src/Internals/System/Runtime/AsyncLock.cs Outdated Show resolved Hide resolved

src/System.Private.ServiceModel/src/Internals/System/Runtime/AsyncLock.cs Show resolved Hide resolved

ssa3512 reviewed

View reviewed changes

src/System.Private.ServiceModel/src/System.Private.ServiceModel.csproj

@@ @@ -54,6 +55,8 @@ @@
                   <PackageReference Include="System.Security.Principal.Windows" Version="$(SystemSecurityPrincipalWindowsPackageVersion)" />
                   <PackageReference Include="System.Security.Cryptography.Xml" Version="$(SystemSecurityCryptographyXmlPackageVersion)" />
                   <PackageReference Include="System.Numerics.Vectors" Version="$(SystemNumericsVectorsPackageVersion)" />
+                  <PackageReference Include="Microsoft.Bcl.AsyncInterfaces" Version="5.0.0" />
+                  <PackageReference Include="Microsoft.Extensions.ObjectPool" Version="5.0.9" />

ssa3512 Aug 27, 2021

Is is possible to depend on lower versions of these packages? Due to the way the Azure Functions host works for .NET core 3.1 (in process model) it overrides any user assemblies with the 3.x version, effectively creating an upper bound of 3.x for the Microsoft.Extensions packages. This in turn creates runtime errors with FileNotFoundException when any dependencies reference an extensions package over 3.x. Packages like EF Core has the same issue - you can't use anything over 3.x on the functions in-process host.

Member Author

mconnew Aug 27, 2021

To fill everyone in on our discussion on gitter in case someone else finds this via a search engine, the solution was to add the 3.x versions of these packages to the top level project. This then causes a build warning NU1605 because a package downgrade was forced. Apparently the .NET SDK treats that warning as a build error so you need to add <NoWarn>NU1605</NoWarn> to your project.

stephentoub approved these changes

View reviewed changes

src/System.Private.ServiceModel/src/Internals/System/Runtime/AsyncLock.cs Outdated Show resolved Hide resolved

src/System.Private.ServiceModel/src/Internals/System/Runtime/AsyncLock.cs

               namespace System.Runtime
               {
-                  internal class AsyncLock
+                  internal class AsyncLock : IAsyncDisposable

Member

stephentoub Sep 20, 2021

It feels like this deserves an extensive set of tests, and I don't see any in this PR. What you have here looks strictly more correct than what you previously had, though I'd feel better about it if there were tests validating all the scenarios required by its internal usage.

Member Author

mconnew Sep 20, 2021

The scenario required by it's internal usage is reliable sessions, and we have pretty extensive tests around that functionality going so far as to intercept the underlying HTTP requests to mess around with the timing and/or delivery of messages to trigger all the behaviors that ReliableSessions provides. The original implementation worked from a lock standpoint, but leaked memory.

mconnew added 2 commits

September 20, 2021 12:32


          Prevent EC capture causing AsyncLocal leak

4a6a67b


          Fix some problems, PR feedback

8821e1c

mconnew force-pushed the FixAsyncLock branch from c0e1069 to 8821e1c Compare

September 20, 2021 19:32

mconnew merged commit f2d0cbb into dotnet:main

mconnew deleted the FixAsyncLock branch

September 20, 2021 20:45

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet