Reduce Pipeline contention redux #36992


Closed
wants to merge 5 commits from the pipeline-contention-redux branch

Conversation

@benaadams (Member) commented May 25, 2020

@adamsitnik (Member)

@benaadams it gives a lot of socket errors:

RequestsPerSecond:           14,813
Max CPU (%):                 97
WorkingSet (MB):             385
Avg. Latency (ms):           12.67
Startup (ms):                298
First Request (ms):          59.07
Latency (ms):                0.88
Total Requests:              223,667
Duration: (ms)               15,100
Socket Errors:               223,455
Bad Responses:               0
Build Time (ms):             4,001
Published Size (KB):         102,140
SDK:                         5.0.100-preview.6.20272.2
Runtime:                     5.0.0-preview.6.20272.2
ASP.NET Core:                5.0.0-preview.5.20255.6

I've uploaded the trace file to traces/36992

@benaadams (Member Author)

The Kestrel sockets one might be busted :-/

@benaadams (Member Author)

Could you try it without the Kestrel socket change?

@benaadams (Member Author)

Yeah, Flush invalidates the memory; need to rethink the Kestrel one.
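
For reference, this is the PipeWriter contract in play (a minimal sketch, assuming a System.IO.Pipelines PipeWriter): memory handed out by GetMemory is only valid until the next Advance/Flush, so a segment cached across a FlushAsync can't be written to afterwards.

```csharp
using System;
using System.IO.Pipelines;
using System.Threading.Tasks;

static class PipeWriterContractSketch
{
    // Sketch only: shows why a Memory<byte> cached across a flush can't be reused.
    static async Task WriteTwiceAsync(PipeWriter writer)
    {
        Memory<byte> memory = writer.GetMemory(sizeHint: 512);
        memory.Span[0] = (byte)'a';
        writer.Advance(1);

        await writer.FlushAsync();

        // The old 'memory' must not be touched after the flush; ask for a fresh segment.
        memory = writer.GetMemory(sizeHint: 512);
        memory.Span[0] = (byte)'b';
        writer.Advance(1);
        await writer.FlushAsync();
    }
}
```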

@benaadams (Member Author) commented May 26, 2020

With updated Kestrel change dotnet/aspnetcore#22225:

Alas, it's a race to avoid the lock, but hopefully it will win more often.
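
This is not the actual Pipe code, just a minimal sketch of the shape of that race: read a cheap flag first and only take the lock when something looks like it needs waking. The flag can change between the read and the lock, so the waiting side re-checks for data before it parks; the writing side simply tries to win the fast path more often. All names below are invented for illustration.

```csharp
using System.Threading;

class LockAvoidanceSketch
{
    private readonly object _sync = new object();
    private int _unconsumedBytes; // bytes written but not yet consumed (consumption omitted)
    private int _readerWaiting;   // 1 while the reader is parked (or about to park)

    // Writer side: only take the lock when a reader might need waking.
    // Interlocked.Add is a full fence, so if the reader declared itself waiting
    // before it observed zero bytes, this Volatile.Read will see the flag set.
    public void OnDataWritten(int bytes)
    {
        Interlocked.Add(ref _unconsumedBytes, bytes);

        if (Volatile.Read(ref _readerWaiting) == 0)
        {
            return; // common case: nobody is parked, so the lock is skipped entirely
        }

        lock (_sync)
        {
            if (_readerWaiting == 1)
            {
                _readerWaiting = 0;
                Monitor.Pulse(_sync); // wake the parked reader
            }
        }
    }

    // Reader side: declare the intent to wait (full fence), then re-check for data
    // before actually parking, so a write racing with this method is never lost.
    public void WaitForData()
    {
        lock (_sync)
        {
            while (true)
            {
                Interlocked.Exchange(ref _readerWaiting, 1);
                if (Volatile.Read(ref _unconsumedBytes) > 0)
                {
                    _readerWaiting = 0;
                    return;
                }
                Monitor.Wait(_sync);
            }
        }
    }
}
```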

@benaadams (Member Author)

Updated the dlls (above), moved the checks earlier to try to win the data race, and added lots of comments.

@benaadams force-pushed the pipeline-contention-redux branch from 30f5e50 to d6b9038 on May 27, 2020 at 11:50
@adamsitnik (Member)

@benaadams the results:

| Machine | Benchmark | RPS before | RPS after | locks/s before | locks/s after |
|---|---|---:|---:|---:|---:|
| Citrine 28 cores | PlaintextPlatform | 9,368,072 | 9,299,033 | 367 | 303 |
| | JsonPlatform | 1,195,092 | 1,203,339 | 28 | 12 |
| | FortunesPlatform | 318,144 | 314,811 | 41 | 41 |
| | Fortunes Batching | 416,590 | 415,700 | 754 | 625 |
| Perf 12 cores | PlaintextPlatform | 5,942,494 | 5,675,971 | 128 | 57 |
| | JsonPlatform | 594,832 | 588,407 | 10 | 9 |
| | FortunesPlatform | 130,080 | 130,236 | 79 | 56 |
| ARM 32 cores | PlaintextPlatform | 6,017,733 | 5,878,071 | 349 | 238 |
| | JsonPlatform | 543,213 | 522,777 | 170 | 94 |
| | FortunesPlatform | 94,598 | 95,610 | 17 | 20 |
| Mono 56 cores | PlaintextPlatform | 6,952,624 | 6,943,621 | 6,035 | 1,527 |
| | JsonPlatform | 1,110,368 | 1,184,403 | 284 | 382 |
| AMD 46 cores | PlaintextPlatform | 7,453,850 | 6,908,422 | 482 | 126 |
| | JsonPlatform | 693,225 | 683,765 | 64 | 30 |
| | FortunesPlatform | 265,688 | 261,117 | 21 | 15 |

trace file uploaded to traces/36992

@benaadams (Member Author)

Those results are really strange: when the locks go down it does worse, and when they go up it does better. I wonder what bottleneck it hits when it goes faster; I will look at the traces.

@adamsitnik (Member)

I've used PerfView to diff the trace files with and without your changes (it's a hidden feature of PerfView):

Base is before, Test is after, and everything is filtered by time to after the warmup:

image

It looks like it's now spending more time in FlushAsync? This was for JSON; I am going to do the same for Plaintext now.

@adamsitnik (Member)

@benaadams Plaintext:

image

@benaadams (Member Author)

Hmm... Plaintext (which pipelines 16 requests) might be suffering from the Memory reservation being made too short, as I shortened it to 1024 bytes in dotnet/aspnetcore@9d98835.

For Json that should be more than enough for the single request/response.
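
An illustrative sketch (not the Kestrel code) of why a small reservation matters for the pipelined plaintext case: when the batch of responses outgrows the reserved segment, the writer has to go back to GetMemory mid-batch, and each extra call is another trip through the pipe's synchronization. The sizes below are made-up numbers.

```csharp
using System;
using System.IO.Pipelines;
using System.Threading.Tasks;

static class SizeHintSketch
{
    // Sketch of a writer that reserves a fixed-size segment up front and has to
    // request more memory whenever the batch outgrows it. With a 1024-byte
    // reservation and 16 pipelined responses, the batch can easily exceed the
    // first segment, so the extra GetMemory calls come back.
    static async Task WriteBatchAsync(PipeWriter writer, byte[] response, int pipelinedCount)
    {
        int reservation = Math.Max(1024, response.Length);
        Memory<byte> memory = writer.GetMemory(reservation);
        int used = 0;

        for (int i = 0; i < pipelinedCount; i++)
        {
            if (memory.Length - used < response.Length)
            {
                writer.Advance(used);                    // commit what fits so far
                memory = writer.GetMemory(reservation);  // the extra call in question
                used = 0;
            }

            response.CopyTo(memory.Slice(used));
            used += response.Length;
        }

        writer.Advance(used);
        await writer.FlushAsync();
    }
}
```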

@benaadams (Member Author)

OK, for Json it might be the Memory block being returned after the callback is triggered, causing contention on the MemoryPool; will tweak.
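
A minimal sketch of the ordering in question, with MemoryPool<byte> standing in for the pipe's pool and invented method names; this is not the actual Pipe internals, just the idea that returning the block before scheduling the reader keeps the two threads from hitting the pool at the same time.

```csharp
using System;
using System.Buffers;
using System.Threading;

static class FlushOrderingSketch
{
    // Order A: schedule the reader continuation first and return the block afterwards.
    // If the reader immediately rents a new block on another thread, both threads hit
    // the MemoryPool at the same moment and can contend on it.
    static void CompleteFlushReaderFirst(IMemoryOwner<byte> block, Action readerCallback)
    {
        ThreadPool.UnsafeQueueUserWorkItem(cb => cb(), readerCallback, preferLocal: false);
        block.Dispose(); // returns the block to the pool while the reader may already be renting
    }

    // Order B: return the block first, then schedule the reader continuation,
    // so the pool work is finished before the reader can start renting.
    static void CompleteFlushPoolFirst(IMemoryOwner<byte> block, Action readerCallback)
    {
        block.Dispose();
        ThreadPool.UnsafeQueueUserWorkItem(cb => cb(), readerCallback, preferLocal: false);
    }
}
```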

@benaadams (Member Author)

New dlls in light of the PerfView comparisons.

@adamsitnik (Member)

The RPS has dropped for Plaintext to 9,070k, but the trace I have only shows me which things got better (I need to get more traces):

Compared to previous commits to this PR:

image

Compared to master:

image

@adamsitnik (Member)

JSON:

New new dlls vs new dlls (almost no changes):

image

@benaadams (Member Author)

So I'm fairly at a loss to explain this: all the trace data/metrics improve, but the performance degrades.

The main issue I can find is the one highlighted by @adamsitnik in #36992 (comment); it's related to FlushAsync, specifically the call in SocketConnection.ProcessReceives.

Even though overall lock acquire time falls, its lock acquire time jumps, which would throttle the throughput of the input data, adding latency back to offset all the gains:

Before:

image

After:

image

Not sure why yet...
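
For context, a socket-transport receive loop has roughly this shape (an illustrative sketch, not the actual Kestrel SocketConnection.ProcessReceives source): every received chunk ends with a FlushAsync on the input pipe, so any extra time spent inside that flush, such as waiting on the pipe's sync lock, directly throttles how fast input reaches the application.

```csharp
using System;
using System.IO.Pipelines;
using System.Net.Sockets;
using System.Threading.Tasks;

static class ReceiveLoopSketch
{
    static async Task ProcessReceivesAsync(Socket socket, PipeWriter input)
    {
        while (true)
        {
            Memory<byte> buffer = input.GetMemory(sizeHint: 512);
            int bytesReceived = await socket.ReceiveAsync(buffer, SocketFlags.None);
            if (bytesReceived == 0)
            {
                break; // remote side closed the connection
            }

            input.Advance(bytesReceived);

            FlushResult result = await input.FlushAsync(); // the call under discussion
            if (result.IsCompleted || result.IsCanceled)
            {
                break;
            }
        }

        await input.CompleteAsync();
    }
}
```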

@benaadams (Member Author)

Added FlushAsync(bool isMoreData, CancellationToken cancellationToken = default); it would need API review, but we can test the effect:
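
A hypothetical caller, assuming the proposed overload existed on the PipeWriter used by the transport (it does not exist in the shipped API; the exact semantics would be up to review): the flag hints that another flush is coming right behind this one, so the pipe could defer the expensive parts of the flush, such as waking the reader and returning memory, until the last flush of the batch.

```csharp
using System.IO.Pipelines;
using System.Net.Sockets;
using System.Threading.Tasks;

static class IsMoreDataFlushSketch
{
    // Hypothetical: PipeWriter has no FlushAsync(bool) overload today; this assumes
    // the overload proposed in this PR. How it is used here is a guess.
    static async Task<FlushResult> FlushReceivedAsync(Socket socket, PipeWriter input, int bytesReceived)
    {
        input.Advance(bytesReceived);

        // If the socket already has more bytes queued, hint that another flush is
        // imminent so this one can stay on its cheap path.
        bool isMoreData = socket.Available > 0;
        return await input.FlushAsync(isMoreData);
    }
}
```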

@benaadams (Member Author) commented May 28, 2020

Cuts calls to .GetMemory() to 1.8% of the previous count:

image

@adamsitnik (Member)

image

The plaintext result from "before" is 8M while it should be more than 9M; most probably there was some other work executing on the server or client machine.
I am going to re-run them with BenchmarkDriver2, which promises to avoid these problems, and share new results.

@adamsitnik (Member)

@benaadams I've run the benchmarks two more times for you, here are the results:

image

image

I'll now try to run the JSON benchmark with bombardier and collect traces that would allow for an apples-to-apples comparison (run x requests, not run for y seconds).

@benaadams (Member Author)

I'd be interested in a trace of the Mono Json (after) with those high locks; I assume it's run on .NET Core despite the name?

@benaadams (Member Author)

Also the plaintext, with the crazy high locks

@adamsitnik (Member)

> I'd be interested in a trace of the Mono Json (after) with those high locks; I assume it's run on .NET Core despite the name?

Yes, Mono is just the machine name.

@adamsitnik (Member)

@sebastienros something is wrong with the mono machine. It was working fine all day for me and now it fails without a clear reason. Do you have any idea why?

PS C:\Projects\aspnet_benchmarks\src\BenchmarksDriver> dotnet run -- --display-output --server $mono --client $monoload --connections 512 --jobs ..\BenchmarksApps\Kestrel\PlatformBenchmarks\benchmarks.plaintext.json --scenario PlaintextPlatform --sdk 5.0.100-preview.6.20272.2 --runtime  5.0.0-preview.6.20272.2 --aspnetcoreversion 5.0.0-preview.5.20255.6  --collect-counters --framework netcoreapp5.0 --collect-trace
[05:26:11.967] WARNING: '--self-contained' has been set implicitly as custom local files are used.
[05:26:11.982] Using worker Wrk
[05:26:12.055] Running session '7469a407e9dd4831908b3382a0cf74f8' with description ''
[05:26:12.058] Starting scenario PlaintextPlatform on benchmark server...
[05:26:12.203] Fetching job: $mono/jobs/8
[05:26:13.523] Job has been selected by the server ...
[05:26:14.193] Job is now building ...
[05:26:35.807] Job failed on benchmark server, stopping...
Microsoft (R) Build Engine version 16.7.0-preview-20270-03+bee129d1b for .NET
Copyright (C) Microsoft Corporation. All rights reserved.

  Determining projects to restore...
  Restored /tmp/benchmarks-agent/benchmarks-server-6/rrcz5cy5.ybt/Benchmarks/src/BenchmarksApps/Kestrel/PlatformBenchmarks/PlatformBenchmarks.csproj (in 757 ms).
  You are using a preview version of .NET. See: https://aka.ms/dotnet-core-preview
  PlatformBenchmarks -> /tmp/benchmarks-agent/benchmarks-server-6/rrcz5cy5.ybt/Benchmarks/src/BenchmarksApps/Kestrel/PlatformBenchmarks/bin/Release/netcoreapp5.0/linux-x64/PlatformBenchmarks.dll
  PlatformBenchmarks -> /tmp/benchmarks-agent/benchmarks-server-6/rrcz5cy5.ybt/Benchmarks/src/BenchmarksApps/Kestrel/PlatformBenchmarks/published/


Command dotnet publish PlatformBenchmarks.csproj -c Release -o /tmp/benchmarks-agent/benchmarks-server-6/rrcz5cy5.ybt/Benchmarks/src/BenchmarksApps/Kestrel/PlatformBenchmarks/published /p:BenchmarksAspNetCoreVersion=5.0.0-preview.5.20255.6 /p:MicrosoftAspNetCoreAllPackageVersion=5.0.0-preview.5.20255.6 /p:MicrosoftAspNetCoreAppPackageVersion=5.0.0-preview.5.20255.6 /p:BenchmarksNETStandardImplicitPackageVersion=5.0.0-preview.5.20255.6 /p:BenchmarksNETCoreAppImplicitPackageVersion=5.0.0-preview.5.20255.6 /p:BenchmarksRuntimeFrameworkVersion=5.0.0-preview.6.20272.2 /p:BenchmarksTargetFramework=netcoreapp5.0 /p:MicrosoftNETCoreAppPackageVersion=5.0.0-preview.6.20272.2 /p:MicrosoftWindowsDesktopAppPackageVersion=5.0.0-preview.7.20302.1 /p:NETCoreAppMaximumVersion=99.9 /p:MicrosoftNETCoreApp50PackageVersion=5.0.0-preview.6.20272.2 /p:GenerateErrorForMissingTargetingPacks=false /p:MicrosoftNETPlatformLibrary=Microsoft.NETCore.App /p:RestoreNoCache=true --framework netcoreapp5.0 --self-contained -r linux-x64  returned exit code 0

[05:36:34.163] Deleting scenario 'PlaintextPlatform' on benchmark server...

@sebastienros (Member) commented Jun 2, 2020

http://$mono/jobs/8/output

[STDERR] Unhandled exception. System.IO.IOException: Failed to bind to address http://192.168.1.1:8080: address already in use.

Looks like the port is still in use. Maybe it didn't get cleaned up correctly.

@fanyang-mono Do you think there could be other docker containers still running? In my scripts I forcibly remove any container starting with benchmarks_ every night, just in case one went rogue.

@fanyang-mono (Member)

> http://$mono/jobs/8/output
>
> [STDERR] Unhandled exception. System.IO.IOException: Failed to bind to address http://192.168.1.1:8080: address already in use.
>
> Looks like the port is still in use. Maybe it didn't get cleaned up correctly.
>
> @fanyang-mono Do you think there could be other docker containers still running? In my scripts I forcibly remove any container starting with benchmarks_ every night, just in case one went rogue.

I restarted the server and it is working now.

@adamsitnik (Member)

@fanyang-mono @sebastienros thank you!

@benaadams I've captured and uploaded the traces:

image

@benaadams (Member Author) commented Jun 3, 2020

Json "after" on the Mono machine is spending 15% of its time in ThreadPool Dequeue:

image

Bunch of spinning also

image

Might be a NUMA issue? (or a lots-of-cores issue)

FYI @stephentoub @kouvel

@benaadams (Member Author)

Yeah, a large amount of the Mono machine's time is spent in the synchronisation primitives.

image

@benaadams (Member Author)

@adamsitnik maybe the Citrine traces would be better; the Mono traces are just showing things I can't do much about (with this PR), alas.

@adamsitnik (Member)

> Might be a NUMA issue?

Most probably these threads are waiting for more work.

> maybe the Citrine traces would be better

Here you go :D

image

@benaadams (Member Author)

Added API suggestion #37472

@benaadams (Member Author)

Revisit post 5.0.

@benaadams closed this Aug 2, 2020
@ghost locked as resolved and limited conversation to collaborators Dec 9, 2020