Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Record hits without thread local variables #291

Merged
merged 1 commit into from
Jan 15, 2019

Conversation

sharwell
Copy link
Contributor

@sharwell sharwell commented Jan 13, 2019

On my machine (2990wx), this change appears to be good for a 25% performance increase. I measured performance by setting the number of iterations in TestPerformance to 1,000,000 and repeating the runs a few times using the instructions in the README file. The times before this change were 22.5 seconds, and after this change were 15.5 seconds.

Scenario Time (seconds, approx.)
No coverage 7.0
Empty RecordHit 8.9
Interlocked.Increment 15.5
Thread static 22.5

@codecov
Copy link

codecov bot commented Jan 13, 2019

Codecov Report

Merging #291 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #291   +/-   ##
=======================================
  Coverage   88.85%   88.85%           
=======================================
  Files          16       16           
  Lines        1875     1875           
=======================================
  Hits         1666     1666           
  Misses        209      209

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d4c51c3...6ae1a89. Read the comment docs.

@codemzs
Copy link

codemzs commented Jan 13, 2019

Looks great! @petli would be nice if we can also get your PR #276 in very soon as this is a major blocker for us. @tonerdo , can you please look at these two PRs soon and issue a new nuget?

@petli
Copy link
Collaborator

petli commented Jan 13, 2019 via email

@sharwell
Copy link
Contributor Author

I'm really interested to see what that issue was. Typically if something is going to have a problem during parallel execution, it will show up when running on my machine (64 logical processors).

@sharwell
Copy link
Contributor Author

I checked the history back to the creation of the template class, and didn't see anything related. I'll try to look back further today.

@sharwell
Copy link
Contributor Author

@petli I looked back in the source control history and determined that this feature was never implemented in the manner proposed by this pull request. The big change came in #181, but it went directly from using strings to using thread statics. If Interlocked.Increment was evaluated, it was not part of the obvious source control history working back from the current state.

@petli
Copy link
Collaborator

petli commented Jan 13, 2019

@sharwell I've been able to dig through the PRs now, the relevant comment is this one:
#241 (comment)

It is worthwhile to check if that is still true for CoreFx, if you can.

@sharwell
Copy link
Contributor Author

sharwell commented Jan 14, 2019

@petli From the comments in that thread, it sounds like interlocked wasn't a problem after all, and the original analysis was a false reading from combining the use of Interlocked.Increment with memory locations mapped to a file. I'm not sure how to perform specific tests outside the one mentioned above.

@petli
Copy link
Collaborator

petli commented Jan 14, 2019

@sharwell The thread as a whole was about memory maps, but that comment was the best source I could find for the discussion about interlocked in general. It refers to #181 where interlocked was first attempted (without any memory maps). It was replaced in 58b89a5 with the thread-local arrays.

@tonerdo
Copy link
Collaborator

tonerdo commented Jan 14, 2019

@codemzs I typically default to @petli for all multi-threading related functionality as he has much better experience with it. If he says this is good to go then I'm good

@petli
Copy link
Collaborator

petli commented Jan 14, 2019

I had a go at testing this PR with CoreFx, and outcome is that it improves some tests by the 25% seen on the basic performance test, and doesn't affect others much. Test execution time in seconds:

Project 1.4.0 PR
System.Collections.Concurrent.Tests 13 12
System.Collections.Tests 24 20
System.Reflection.Metadata.Tests 72 50
System.Xml.Linq.Events.Tests 160 158
System.Runtime.Serialization.Formatters.Tests 588 420

@pjanotti, do you think I was lucky in the selection of the tests I've run? Could the different ways of calling Interlocked (by generating IL code as in your first commit, vs calling it from RecordHit like here) make a difference?

@tonerdo @sharwell With these numbers I'm comfortable merging this, but keeping an eye on the CoreFx projects once this is released to ensure there's no degradation outside the small sample here.

@petli
Copy link
Collaborator

petli commented Jan 14, 2019

Btw, described how to run these tests in #292

@sharwell
Copy link
Contributor Author

@petli I would expect this change to show improvement primarily relative to the baseline (No coverage). In other words, for cases above where the difference between 1.4.0 and this PR was smaller, I would expect those same tests to show a smaller difference between the No coverage and coverage with 1.4.0 cases.

@ViktorHofer
Copy link
Contributor

Note that System.Private.CoreLib coverage isn't working right now and hits are not recorded. So make sure to choose projects that don't depend on it --> only test projects that don't have the TestRuntime property set to true.

System.Reflection.Metadata.Tests | 72 | 50 (correct)
System.Xml.Linq.Events.Tests | 160 | 158 (correct)

That said I think the PR clearly shows an improvement.

@sharwell
Copy link
Contributor Author

Rebased without significant changes to this PR to address the merge conflict from #286

@tonerdo
Copy link
Collaborator

tonerdo commented Jan 15, 2019

Note that System.Private.CoreLib coverage isn't working right now and hits are not recorded.

@ViktorHofer any particular reason why this is happening?

@tonerdo tonerdo merged commit e1b63ee into coverlet-coverage:master Jan 15, 2019
@sharwell sharwell deleted the interlocked-record branch January 15, 2019 15:45
private static int[] t_threadHits;

private static List<int[]> _threads;
private static bool s_isTracking;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Late to the party, but is making this non-thread static safe?

Copy link
Contributor Author

@sharwell sharwell Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. There are only two cases where this flag is used:

  1. The static constructor, which the CLR guarantees will complete on exactly one thread.
  2. The UnloadModule handler, after which point the value will always be true.

By design, there are no more thread static accesses on the RecordHit hot path.

@ViktorHofer
Copy link
Contributor

@tonerdo The instrumented System.Private.CoreLib.dll throws a StackOverflowException. This very painful for us in corefx but unfortunately I couldn't find the issue in the instrumentation yet. If you want to offer help I can quickly write up steps to reproduce the issue. I would be thrilled.

@ViktorHofer
Copy link
Contributor

Failed to request MethodData, not in JIT code range
MethodDesc:   00007ffe8f19b9c0
Method Name:          Coverlet.Core.Instrumentation.Tracker.System.Private.CoreLib_c64a6f40-c11a-4e9c-b1eb-86506d4b4c35.RecordHit(Int32)
Class:                00007ffe8f1a2980
MethodTable:          00007ffe8f19b9f0
mdToken:              00000000060052c9
Module:               00007ffe8eff6020
IsJitted:             yes
Current CodeAddr:     00007ffe8f116510
Code Version History:
  CodeAddr:           00007ffe8f116510  (Tier 0)
  NativeCodeVersion:  0000000000000000

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants