Throughput tests for RDC and for CUDA 11.5 #332

Merged
valassi merged 18 commits into madgraph5:master from valassi:rdc
Jan 17, 2022

@valassi valassi commented Jan 17, 2022

This is a zero-net-change PR with some tests for RDC (relocatable device code) and CUDA 11.5. I just want these commits to be in the history for documentation.

This was started in the middle of PR #328, as a helper to it. It includes:

…fails at runtime with "invalid device symbol""

This reverts commit 76a7749.

Revert "[rdc] test that using no rdc option is equivalent to "-rdc false" - fails at runtime with a separate HelAmps.o"
This reverts commit 17d90a3.

Revert "[rdc] test compiling HelAmps.cc as a separate object with rdc=true in ggttgg - cuda slower by around 15%"
This reverts commit 3951a6e.

Revert "[rdc] with '-rdc true' ggttgg is a bit slower in CUDA (1% at the top, 5% with fewer gpublocks and gputhreads)"
This reverts commit 5ef2bf2.

Revert "[rdc] with '-rdc true' ggtt has the same performance in cuda, but uses fewer registers (170 instead of 172)"
This reverts commit 7617ad0.

Revert "[rdc] with '-rdc true' eemumu is 20% faster in cuda! (128 registers instead of 130)"
This reverts commit da08e56.

Revert "[rdc] add '-rdc true' to CUFLAGS in eemumu, ggtt, ggttgg to reassess performance costs (or benefits?), issue madgraph5#51"
This reverts commit e02c19a.

Revert "[rdc] check that indeed removing __noinline__ gets back 2-5% cuda speed"
This reverts commit 7f8dafe2297493319fcac8bda1615714d5ecceac.

Revert "[rdc] check that using an explicit __noinline__ builds fast in nvcc, but loses around 2% speed in cuda"
This reverts commit cf65e05089b1023e77d0bda037c2fd73b9633a33.

…d FFV_0 - recover around 1% in cuda (same in c++)

HOWEVER the inline=1 build fails (FFVs are multiply defined in testxxx.cc)

Revert "[rdc] mixing inline and noinline builds with warnings in inline=1 - not a good idea"
This reverts commit dbd1c9a.

Revert "[rdc] test noinline for internal FFV, with optional inline for XXX and FFV_0 - recover around 1% in cuda (same in c++)"
This reverts commit 8287210.

Revert "[rdc] test using __noinline__ only for FFV functions but not XXX - again 2-5% slower in CUDA"
This reverts commit ad7a376.

Revert "[rdc] test that with present master (before apiwf) cuda11.5 still gets a 30% regression penalty vs 11.1 (madgraph5#282)"
This reverts commit 2285086.
@valassi valassi self-assigned this Jan 17, 2022

valassi commented Jan 17, 2022

Self merging - all files are identical to the previous master
