
Write barrier optimizations for ARM64 Windows #22003

Merged

Conversation

adityamandaleeka
Member

This change is a step towards unifying the ARM64 write barrier logic between Windows and Unix. It brings over some of the changes that were done for Unix in #12334, such as using a literal pool to hold the heap location/geometry information used in the barriers.
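
For context, the literal pool is a small data block emitted adjacent to the barrier code so the heap constants can be loaded PC-relative and re-patched whenever the GC's heap ranges change. A minimal sketch in armasm syntax follows; the label names are modeled on the Unix version from #12334 and are illustrative, not the exact layout from this change.

; Literal pool next to the barrier code; values are patched at runtime
; whenever the GC updates its heap ranges. (Label names are illustrative.)
wbs_begin
wbs_card_table      DCQ 0
wbs_ephemeral_low   DCQ 0
wbs_ephemeral_high  DCQ 0
wbs_lowest_address  DCQ 0
wbs_highest_address DCQ 0

; A barrier then loads the current values PC-relative:
    adr     x17, wbs_card_table
    ldr     x17, [x17]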

Parts of the code have been tweaked in pursuit of performance gains.

  • The cmp+ccmp pair has been replaced with two discrete cmp/branches to detangle it and allow for a faster exit if the lower bound check fails.
  • The ephemeral bounds are now being loaded simultaneously with an ldp rather than separately prior to each compare.
  • The shift for indexing into the card table has been separated out into its own instruction. (A sketch combining these tweaks follows the list.)
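
Roughly, the resulting hot path looks like the sketch below. This is illustrative only, not the exact code from this change; the register convention (x14 holds the destination address, x15 the reference being stored) is borrowed from the existing barriers, and the wbs_* labels are assumed from the Unix literal pool.

    ; one ldp loads both ephemeral bounds instead of two separate loads
    adr     x17, wbs_ephemeral_low
    ldp     x12, x13, [x17]
    ; two discrete cmp/branch pairs instead of cmp+ccmp
    cmp     x15, x12
    blo     Exit                    ; below ephemeral_low: exit immediately
    cmp     x15, x13
    bhs     Exit                    ; at/above ephemeral_high: exit
    ; card table update, with the shift split into its own instruction
    adr     x17, wbs_card_table
    ldr     x17, [x17]
    lsr     x12, x14, #11           ; card index = dest >> 11 (shift assumed
                                    ; from the Unix barrier)
    ldrb    w16, [x17, x12]
    cmp     w16, #0xFF
    beq     Exit                    ; card already dirty
    mov     w16, #0xFF
    strb    w16, [x17, x12]
Exit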

Sampling a write-barrier-heavy test after these changes shows a ~7-12% decrease in the time spent in the barrier, relative to the current optimized version in use on Unix today.

Once this is in, I plan to port the deltas over to Unix so that the barriers will be in sync. I left the CLR write watch and manually managed card bundle code alone on Windows for now, since it's not enabled yet, but I'll likely experiment with those in the near future and check in the remaining pieces after doing so.

Member

@davidwrighton davidwrighton left a comment


This looks correct to me. Nice perf wins discovered too.

@AndyAyersMS
Member

The cmp+ccmp pair has been replaced with two discrete cmp/branches to detangle it and allow for a faster exit if the lower bound check fails.

It looks like you changed from two compare/branches to two compares and one branch, not the other way round...

@adityamandaleeka
Member Author

@AndyAyersMS Sorry for the confusion. My comment was in reference to the ways my change diverges from the optimizations done on ARM64 Unix. With that in mind, the cmp+ccmp in JIT_WriteBarrier has been replaced with the two discrete cmps/branches.

The checked write barrier, where the cmp+ccmp pattern was also used, has not been modified in this way. It will be interesting to get data on whether the address tends to be within the bounds of the heap, but until then I'll leave it alone.
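
For reference, the retained cmp+ccmp form in the checked barrier folds the two heap-bounds compares into a single branch, which is a win when the address is usually in range. A sketch of that shape (labels and registers illustrative, with x14 as the destination address):

    ; is the destination inside the GC heap at all?
    adr     x17, wbs_lowest_address
    ldp     x12, x13, [x17]         ; heap lowest/highest addresses
    cmp     x14, x12
    ccmp    x14, x13, #0x2, hs      ; if dest >= lowest, compare to highest;
                                    ; otherwise force C set so "hs" holds
    bhs     NotInHeap               ; out of range either way: plain store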


@sdmaclea sdmaclea left a comment


LGTM. Thanks!

@adityamandaleeka
Member Author

@dotnet-bot test Windows_NT arm64 Cross Checked Innerloop Build and Test

Member

@janvorli janvorli left a comment


LGTM, thank you!

@@ -366,57 +466,58 @@ NotInHeap
 ; if ([x14] == x15) goto end
 ldr x13, [x14]
 cmp x13, x15
-beq shadowupdateend
+beq ShadowUpdateEnd
Member


A nit: can you please fix the alignment of the label?

@adityamandaleeka
Member Author

@dotnet-bot test Windows_NT arm64 Cross Checked Innerloop Build and Test

@adityamandaleeka adityamandaleeka merged commit 9fb7676 into dotnet:master Jan 24, 2019
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…rrier_updates_arm64

Write barrier optimizations for ARM64 Windows

Commit migrated from dotnet/coreclr@9fb7676