Improved handling of hardcoded parameters using the cxsmpl class#359

Merged
valassi merged 23 commits into madgraph5:master from valassi:hrdcod
Jan 27, 2022

@valassi valassi commented Jan 26, 2022

This is a WIP PR implementing #353 (improved handling of hardcoded parameters using the cxsmpl class).

The rationale is that hardcoding parameters can give a significant runtime boost for CUDA and C++.

This is a followup to PR #306. There I had done a first proof of concept, but the full automatic generation of hardcoded parameters was not possible, because some complex parameters are computed in the header, and this requires constexpr complex arithmetic, which is not available in std::complex.

This is now possible because

There is now a full proof of concept for all parameters hardcoded in Parameters.h. This is WIP because I still need to move this to code generation (and try it out on different models, including EFT for instance).
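As an illustration of why constexpr complex arithmetic matters here, the following is a minimal sketch and not the actual cxsmpl class: the name `simple_complex`, the parameter names `demo_g` and `demo_coupling`, and the numerical values are all assumptions made for this example. It shows a complex type whose `+` and `*` can be evaluated at compile time, so that a derived complex parameter can be computed directly in a header.

```cpp
// Hypothetical minimal constexpr complex type (illustration only, NOT the real cxsmpl class)
template<typename FP>
class simple_complex
{
public:
  constexpr simple_complex( const FP& re = FP(), const FP& im = FP() )
    : m_re( re ), m_im( im ) {}
  constexpr FP real() const { return m_re; }
  constexpr FP imag() const { return m_im; }
private:
  FP m_re, m_im;
};

// Complex addition, usable in constant expressions
template<typename FP>
constexpr simple_complex<FP> operator+( const simple_complex<FP>& a, const simple_complex<FP>& b )
{
  return simple_complex<FP>( a.real() + b.real(), a.imag() + b.imag() );
}

// Complex multiplication, usable in constant expressions
template<typename FP>
constexpr simple_complex<FP> operator*( const simple_complex<FP>& a, const simple_complex<FP>& b )
{
  return simple_complex<FP>( a.real() * b.real() - a.imag() * b.imag(),
                             a.real() * b.imag() + a.imag() * b.real() );
}

// Hypothetical parameters (names and values are assumptions for this sketch)
constexpr double demo_g = 2.;
// Derived complex parameter i * g, computed entirely at compile time
constexpr simple_complex<double> demo_coupling =
  simple_complex<double>( 0., 1. ) * simple_complex<double>( demo_g, 0. );

static_assert( demo_coupling.real() == 0. && demo_coupling.imag() == 2.,
               "derived parameter is evaluated by the compiler" );
```

Because `demo_coupling` is a constant expression, the compiler can fold it directly into the code that uses it instead of loading it from memory at runtime, which is the effect that hardcoding aims for.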

@valassi valassi marked this pull request as draft January 26, 2022 18:51
@valassi valassi self-assigned this Jan 26, 2022

valassi commented Jan 26, 2022

PS The performance of all the code (in both the hardcoded and non-hardcoded builds) must also be carefully reevaluated. There are quite a few more templated functions now, but they were probably inlined before. And some arguments that were passed by reference are now passed by value. To be assessed.

…RDCOD=1 - a bit better, but not that different
…=1 - a bit better, but not that different

(Note: building from scratch 'make -j' and running tests took 1h30... essentially ~30+ minutes for each gCPPProcess.o)
STARTED AT Thu Jan 27 12:47:54 CET 2022
ENDED   AT Thu Jan 27 14:00:03 CET 2022
…lts: not very different, keep hrd0 as default
@valassi valassi changed the title WIP - Improved handling of hardcoded parameters using the cxsmpl class Improved handling of hardcoded parameters using the cxsmpl class Jan 27, 2022
@valassi valassi marked this pull request as ready for review January 27, 2022 15:16

valassi commented Jan 27, 2022

This is now complete.

It is hard to say whether there is a performance advantage at all. (For the less complex processes, it seems that not hardcoding is faster...) Maybe the more complex the process, the more interesting it is to have hardcoded parameters. This is a summary table:
https://github.com/madgraph5/madgraph4gpu/blob/b71ac10c84f1f92e4208d7f7ac65360b2824826c/epochX/cudacpp/tput/summaryTable_hrdcod.txt

*** FPTYPE=d ******************************************************************

Revision 2938acb [nvcc 11.6.55 (gcc 10.2.0)] 
HELINL=0 HRDCOD=0
            eemumu      ggtt        ggttg       ggttgg      ggttggg     
CUD/none    1.33e+09    1.43e+08    1.44e+07    5.20e+05    1.18e+04    
CPP/none    1.66e+06    2.01e+05    2.48e+04    1.80e+03    7.25e+01    
CPP/sse4    3.12e+06    3.18e+05    4.53e+04    3.38e+03    1.31e+02    
CPP/avx2    5.49e+06    5.70e+05    8.85e+04    6.84e+03    2.62e+02    
CPP/512y    5.86e+06    6.11e+05    9.82e+04    7.52e+03    2.85e+02    
CPP/512z    4.80e+06    3.77e+05    7.21e+04    6.55e+03    2.95e+02    

Revision 2938acb [nvcc 11.6.55 (gcc 10.2.0)] 
HELINL=0 HRDCOD=1
            eemumu      ggtt        ggttg       ggttgg      ggttggg     
CUD/none    1.34e+09    1.39e+08    1.46e+07    5.15e+05    1.20e+04    
CPP/none    1.67e+06    2.05e+05    2.65e+04    1.89e+03    7.54e+01    
CPP/sse4    3.12e+06    3.30e+05    4.75e+04    3.44e+03    1.36e+02    
CPP/avx2    5.14e+06    4.61e+05    8.37e+04    6.89e+03    2.72e+02    
CPP/512y    5.85e+06    4.68e+05    9.13e+04    7.67e+03    3.00e+02    
CPP/512z    1.04e+07    3.42e+05    7.03e+04    6.59e+03    3.02e+02   

In any case there is a slight reduction of register pressure in CUDA (not shown here, see the logfiles).

Keep the hrdcod=0 as default.
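To read the table more quantitatively, one can look at the ratio of the HRDCOD=1 value to the HRDCOD=0 value for each row. The sketch below copies two columns (ggtt and ggttggg) from the FPTYPE=d table above; the array names and the helper `hrdcod_ratio` are hypothetical. In these numbers, every ggttggg row is slightly higher with HRDCOD=1, while for ggtt the avx2, 512y and 512z rows are lower.

```cpp
#include <array>
#include <cstddef>

// Values copied from the FPTYPE=d summary table (HELINL=0).
// Row order: CUD/none, CPP/none, CPP/sse4, CPP/avx2, CPP/512y, CPP/512z
constexpr std::array<double, 6> ggtt_hrd0 = { 1.43e+08, 2.01e+05, 3.18e+05, 5.70e+05, 6.11e+05, 3.77e+05 };
constexpr std::array<double, 6> ggtt_hrd1 = { 1.39e+08, 2.05e+05, 3.30e+05, 4.61e+05, 4.68e+05, 3.42e+05 };
constexpr std::array<double, 6> ggttggg_hrd0 = { 1.18e+04, 7.25e+01, 1.31e+02, 2.62e+02, 2.85e+02, 2.95e+02 };
constexpr std::array<double, 6> ggttggg_hrd1 = { 1.20e+04, 7.54e+01, 1.36e+02, 2.72e+02, 3.00e+02, 3.02e+02 };

// Hypothetical helper: ratio HRDCOD=1 / HRDCOD=0 for one row of one column
// (a ratio above 1 means the hardcoded build has the larger value)
inline double hrdcod_ratio( const std::array<double, 6>& hrd1,
                            const std::array<double, 6>& hrd0,
                            std::size_t row )
{
  return hrd1[row] / hrd0[row];
}
```

For example, `hrdcod_ratio( ggttggg_hrd1, ggttggg_hrd0, 3 )` gives the avx2 ratio for ggttggg, and `hrdcod_ratio( ggtt_hrd1, ggtt_hrd0, 3 )` the same for ggtt.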

I am self merging this anyway, as it does no harm: the hardcoded parameters are just one option. (The only thing that changed is that the cxsmpl class is now always used for parameters, even if they are not hardcoded. This is not a problem IMO, but it can easily be changed if needed.)
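The idea of keeping hardcoding as a build option while using the same complex type in both modes could look roughly like the sketch below. This is an assumption-laden illustration, not the generated code: the macro `DEMO_HRDCOD`, the struct `demo_cplx`, the struct `DemoParams` and the function `demo_coupling` are all hypothetical names, and the parameter value is arbitrary.

```cpp
// Hypothetical build flag for this sketch (1 = hardcoded, 0 = runtime parameters)
#ifndef DEMO_HRDCOD
#define DEMO_HRDCOD 1
#endif

// Hypothetical stand-in for a constexpr-capable complex type, used in BOTH modes
struct demo_cplx
{
  double re, im;
};

// Complex multiplication, usable both at compile time and at runtime
constexpr demo_cplx demo_mul( demo_cplx a, demo_cplx b )
{
  return demo_cplx{ a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re };
}

#if DEMO_HRDCOD
// Hardcoded mode: parameters are compile-time constants defined in the header
namespace demo_params
{
  constexpr double mass = 3.; // arbitrary illustrative value
  constexpr demo_cplx coupling = demo_mul( demo_cplx{ 0., 1. }, demo_cplx{ mass, 0. } );
}
inline demo_cplx demo_coupling() { return demo_params::coupling; }
#else
// Non-hardcoded mode: same type and same formula, but evaluated at runtime
struct DemoParams
{
  double mass = 3.; // could instead be read from a parameter card at startup
  demo_cplx coupling() const { return demo_mul( demo_cplx{ 0., 1. }, demo_cplx{ mass, 0. } ); }
};
inline demo_cplx demo_coupling()
{
  static const DemoParams params;
  return params.coupling();
}
#endif
```

Callers use `demo_coupling()` in either mode, so switching the flag changes only where the value is computed, not the code that consumes it.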

valassi commented Jan 27, 2022

All tests pass, I am self merging
