-
Notifications
You must be signed in to change notification settings - Fork 388
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Record hits without thread local variables #291
Conversation
Codecov Report
@@ Coverage Diff @@
## master #291 +/- ##
=======================================
Coverage 88.85% 88.85%
=======================================
Files 16 16
Lines 1875 1875
=======================================
Hits 1666 1666
Misses 209 209 Continue to review full report at Codecov.
|
There was a problem using interlocking that slowed down bigger tests that was the reason to use thread local variables. It especially affected CoreLib, if I remember correctly. Look at the git blame log and it should show up the PR and discussion. (I can't find that right now as I'm just replying from my phone).
… On 13 Jan 2019, at 04:36, Zeeshan Siddiqui ***@***.***> wrote:
Looks great! @petli would be nice if we can also get your PR #276 in very soon as this is a major blocker for us. @tonerdo , can you please look at these two PRs soon and issue a new nuget?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
I'm really interested to see what that issue was. Typically if something is going to have a problem during parallel execution, it will show up when running on my machine (64 logical processors). |
I checked the history back to the creation of the template class, and didn't see anything related. I'll try to look back further today. |
@petli I looked back in the source control history and determined that this feature was never implemented in the manner proposed by this pull request. The big change came in #181, but it went directly from using strings to using thread statics. If |
@sharwell I've been able to dig through the PRs now, the relevant comment is this one: It is worthwhile to check if that is still true for CoreFx, if you can. |
@petli From the comments in that thread, it sounds like interlocked wasn't a problem after all, and the original analysis was a false reading from combining the use of Interlocked.Increment with memory locations mapped to a file. I'm not sure how to perform specific tests outside the one mentioned above. |
I had a go at testing this PR with CoreFx, and outcome is that it improves some tests by the 25% seen on the basic performance test, and doesn't affect others much. Test execution time in seconds:
@pjanotti, do you think I was lucky in the selection of the tests I've run? Could the different ways of calling Interlocked (by generating IL code as in your first commit, vs calling it from RecordHit like here) make a difference? @tonerdo @sharwell With these numbers I'm comfortable merging this, but keeping an eye on the CoreFx projects once this is released to ensure there's no degradation outside the small sample here. |
Btw, described how to run these tests in #292 |
@petli I would expect this change to show improvement primarily relative to the baseline (No coverage). In other words, for cases above where the difference between 1.4.0 and this PR was smaller, I would expect those same tests to show a smaller difference between the No coverage and coverage with 1.4.0 cases. |
Note that System.Private.CoreLib coverage isn't working right now and hits are not recorded. So make sure to choose projects that don't depend on it --> only test projects that don't have the System.Reflection.Metadata.Tests | 72 | 50 (correct) That said I think the PR clearly shows an improvement. |
0326e36
to
6ae1a89
Compare
Rebased without significant changes to this PR to address the merge conflict from #286 |
@ViktorHofer any particular reason why this is happening? |
private static int[] t_threadHits; | ||
|
||
private static List<int[]> _threads; | ||
private static bool s_isTracking; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Late to the party, but is making this non-thread static safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. There are only two cases where this flag is used:
- The static constructor, which the CLR guarantees will complete on exactly one thread.
- The
UnloadModule
handler, after which point the value will always be true.
By design, there are no more thread static accesses on the RecordHit
hot path.
@tonerdo The instrumented System.Private.CoreLib.dll throws a StackOverflowException. This very painful for us in corefx but unfortunately I couldn't find the issue in the instrumentation yet. If you want to offer help I can quickly write up steps to reproduce the issue. I would be thrilled. |
|
On my machine (2990wx), this change appears to be good for a 25% performance increase. I measured performance by setting the number of iterations in
TestPerformance
to 1,000,000 and repeating the runs a few times using the instructions in the README file. The times before this change were 22.5 seconds, and after this change were 15.5 seconds.RecordHit
Interlocked.Increment