Remove helicity filtering from cudacpp ME timers in (c/g)madevent#463

Merged
valassi merged 6 commits into madgraph5:master from valassi:fvsc on May 22, 2022

valassi commented May 22, 2022

This PR addresses #461: I observed a difference between the cudacpp ME throughputs computed in standalone mode (even in bridge mode) and those computed within cmadevent/gmadevent.

The largest part of the difference (a factor 2, because I only do one iteration in these tests) was due to the fact that I was not removing helicity filtering from the cudacpp timers. As a result, the Fortran ME timer included only one ME calculation, while the cudacpp ME timer included two (the first one for helicity filtering, the second one for the real ME calculation).

This is now fixed in this PR.
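
To illustrate the factor 2, here is a minimal C++ sketch of a throughput timer that can either include or exclude the helicity filtering pass. It is not the code in this PR: the class `METhroughputTimer`, the helper `timeSeconds` and the flag names are hypothetical, and it assumes that the filtering pass costs roughly as much as one real ME calculation and does not add to the event count.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical timer (not the cudacpp implementation): accumulates the time
// spent in ME calculations and the number of events they produced.
class METhroughputTimer
{
public:
  // If excludeHelicityFiltering is true, filtering passes are not timed at all
  explicit METhroughputTimer( bool excludeHelicityFiltering )
    : m_exclude( excludeHelicityFiltering ) {}

  // Record one ME call that took 'seconds' for 'nevt' events.
  // Assumption: a helicity filtering pass never adds to the event count.
  void record( double seconds, std::size_t nevt, bool isHelicityFiltering )
  {
    assert( seconds >= 0. );
    if( isHelicityFiltering )
    {
      if( !m_exclude ) m_seconds += seconds; // time counted, events not
      return;
    }
    m_seconds += seconds;
    m_events += nevt;
  }

  // Events per second (0 if no time has been recorded yet)
  double throughput() const
  {
    return m_seconds > 0. ? static_cast<double>( m_events ) / m_seconds : 0.;
  }

private:
  bool m_exclude;
  double m_seconds = 0.;
  std::size_t m_events = 0;
};

// Hypothetical helper: wall-clock time of one call to f, in seconds
template<typename F>
double timeSeconds( F&& f )
{
  const auto t0 = std::chrono::steady_clock::now();
  f();
  const auto t1 = std::chrono::steady_clock::now();
  return std::chrono::duration<double>( t1 - t0 ).count();
}
```

With a single iteration of 1000 events taking 1 s per ME pass, a timer built with `excludeHelicityFiltering=false` sees 2 s for 1000 events, while one built with `true` sees 1 s for the same events, which is the factor 2 described above.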

valassi added 6 commits May 22, 2022 13:58
…esses after removing helicity filtering (madgraph5#461) - much better

Difference between cudacpp ME performance in madevent and in SA/bridge is within 10% for complex processes.

For eemumu and ggtt, instead, the difference is still quite large.
This is now understood: the madevent throughput is lower because its timers include data copies/transposing
(in the SA "bridge" mode the timers really only cover the ME calculation).
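
As a rough illustration of the kind of data transposition mentioned above, the following C++ sketch converts a momenta buffer from an event-major to an event-minor layout. Both layouts and the function name `transposeMomenta` are assumptions made for this example, not the actual bridge code. A timer wrapped around this step plus the ME calculation measures more than a timer wrapped around the ME calculation alone, and the relative overhead is largest for simple processes where the ME itself is cheap.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical layouts (assumptions for illustration only):
//   input,  event-major: in[ ( ievt * npar + ipar ) * 4 + ip4 ]
//   output, event-minor: out[ ( ipar * 4 + ip4 ) * nevt + ievt ]
// where ievt is the event index, ipar the particle index and ip4 the
// four-momentum component (0..3).
std::vector<double> transposeMomenta( const std::vector<double>& in,
                                      std::size_t nevt,
                                      std::size_t npar )
{
  std::vector<double> out( in.size() );
  for( std::size_t ievt = 0; ievt < nevt; ++ievt )
    for( std::size_t ipar = 0; ipar < npar; ++ipar )
      for( std::size_t ip4 = 0; ip4 < 4; ++ip4 )
        out[( ipar * 4 + ip4 ) * nevt + ievt] = in[( ievt * npar + ipar ) * 4 + ip4];
  return out;
}
```

For two events with one particle each, the input `{0,1,2,3, 10,11,12,13}` becomes `{0,10, 1,11, 2,12, 3,13}`: the same component of consecutive events ends up adjacent in memory.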

valassi commented May 22, 2022

All tests have passed, so I am self-merging.

I closed issue #461. Note that I opened #464 to further improve timers in SA bridge mode, but I consider that really low priority.

valassi merged commit 1b7bb8f into madgraph5:master on May 22, 2022

Development

Successfully merging this pull request may close these issues.

Understand performance difference in cudacpp between SA and madevent
