Throughput tests for RDC and for CUDA 11.5 #332
Merged
valassi merged 18 commits into madgraph5:master on Jan 17, 2022
…erformance costs (or benefits?), issue madgraph5#51
…s fewer registers (170 instead of 172)
… 5% with fewer gpublocks and gputhreads)
… ggttgg - cuda slower by around 15%
… runtime with "invalid device symbol"
…fails at runtime with "invalid device symbol"" This reverts commit 76a7749.
…ails at runtimne with a separate HelAmps.o
Revert "[rdc] test that using no rdc option is equivalent to "-rdc false" - fails at runtimne with a separate HelAmps.o" This reverts commit 17d90a3.
Revert "[rdc] test compiling HelAmps.cc as a separate object with rdc=true in ggttgg - cuda slower by around 15%" This reverts commit 3951a6e.
Revert "[rdc] with '-rdc true' ggttgg is a bit slower in CUDA (1% at the top, 5% with fewer gpublocks and gputhreads)" This reverts commit 5ef2bf2.
Revert "[rdc] with '-rdc true' ggtt has the same performance in cuda, but uses fewer registers (170 instead of 172)" This reverts commit 7617ad0.
Revert "[rdc] with '-rdc true' eemumu is 20% faster in cuda! (128 registers instead of 130)" This reverts commit da08e56.
Revert "[rdc] add '-rdc true' to CUFLAGS in eemumu, ggtt, ggttgg to reasses performance costs (or benefits?), issue madgraph5#51" This reverts commit e02c19a.
…but loses around 2% speed in cuda
Revert "[rdc] check that indeed removing __noinline__ gets back 2-5% cuda speed" This reverts commit 7f8dafe2297493319fcac8bda1615714d5ecceac.
Revert "[rdc] check that using an explicit __noinline__ builds fast in nvcc, but loses around 2% speed in cuda" This reverts commit cf65e05089b1023e77d0bda037c2fd73b9633a33.
…ain 2-5% slower in CUDA
…d FFV_0 - recover around 1% in cuda (same in c++) HOWEVER the inline=1 build fails (FFVs are multiply defined in testxxx.cc)
Revert "[rdc] mixing inline and noinline builds with warnings in inline=1 - not a good idea" This reverts commit dbd1c9a.
Revert "[rdc] test noinline for internal FFV, with optional inline for XXX and FFV_0 - recover around 1% in cuda (same in c++)" This reverts commit 8287210.
Revert "[rdc] test using __noinline__ only for FFV functions but not XXX - again 2-5% slower in CUDA" This reverts commit ad7a376.
…s a 30% regression penalty vs 11.1 (madgraph5#282)
Revert "[rdc] test that with present master (before apiwf) cuda11.5 still gets a 30% regression penalty vs 11.1 (madgraph5#282)" This reverts commit 2285086.
Self merging - all files are identical to the previous master
This was referenced Jan 17, 2022
This is a zero-net-change PR with some tests for RDC and CUDA 11.5. I just want some commits to be in the history for the doc.
This was started as a helper to PR #328, in the middle of it. It includes