Merged
98 commits
1f263f6
change python interface to handle two parameter related to vectorizat…
oliviermattelaer Sep 14, 2023
bc994a3
change submodule to support lhapdf
oliviermattelaer Sep 22, 2023
55b3e74
remove the patch for autodsig.f (need an upstream patch from wrap_int…
oliviermattelaer Oct 5, 2023
0164294
change name to get (the first) element of the list of channel
oliviermattelaer Oct 6, 2023
d1d25f4
merge with setup code of Andrea
oliviermattelaer Nov 8, 2023
b284dd2
change python interface to handle two parameter related to vectorizat…
oliviermattelaer Sep 14, 2023
ff9a3d5
update submodule
roiser Nov 8, 2023
977864b
change python interface to handle two parameter related to vectorizat…
roiser Nov 16, 2023
664875b
move submodule commit
roiser Nov 20, 2023
2ec5b01
Merge branch 'master' into new_interface_wrap
roiser Nov 20, 2023
b2c3e03
initial commit for changing g g > t t~ code
roiser Dec 6, 2023
580a742
regenerate code for g g > t t~ with generateAndCompare
roiser Dec 7, 2023
7b8c25e
regenerate g g > t t~ from madgraph prompt (not generateAndCompare)
roiser Dec 7, 2023
882d31e
first set of changes
roiser Dec 7, 2023
e6dd7ab
after first compile
roiser Dec 7, 2023
b13f382
it compiles, but still uses the fixed channelId, though its an array …
roiser Dec 7, 2023
218bd18
add memory access header for channel ids
roiser Jan 12, 2024
d5a5537
add unsigned long vector type
roiser Jan 12, 2024
d67efb1
move to unsigned int
roiser Jan 13, 2024
5af41d7
add channel id access
roiser Jan 13, 2024
244e34c
fix class name
roiser Jan 13, 2024
dc2324d
adapt channel ids access for unsigned int
roiser Jan 13, 2024
57a15da
change submodule ref
roiser Jan 13, 2024
b375fc0
change mask to fptype vector, does this work?
roiser Jan 13, 2024
68aadcf
introduce a iventAccessRecordConst overload in the MemoryHelper and s…
roiser Jan 13, 2024
846aee1
introduce a decodeRecordConst overload in the MemoryHelper and static…
roiser Jan 13, 2024
c9afb6f
introduce a ieventAccessField overload in the MemoryHelper and static…
roiser Jan 13, 2024
03d1b20
fix remaining overloads
roiser Jan 13, 2024
7918c97
add remaining channelIds
roiser Jan 13, 2024
a1e26ff
reactivate me calculations
roiser Jan 13, 2024
e7c3871
cleanup
roiser Jan 13, 2024
6ae356e
fix space
roiser Jan 15, 2024
819693a
get back original code, why was this lost?
roiser Jan 15, 2024
d6f4e1e
first changes for color choice
roiser Jan 15, 2024
622db1f
simplify simd handling
roiser Jan 26, 2024
9367fe8
move to unsigned int for now
roiser Feb 1, 2024
5684633
random color selection
roiser Feb 1, 2024
860af13
implement only CUDA for now
roiser Feb 5, 2024
1274850
fix vec/wrap size
roiser Feb 5, 2024
b317cbf
change to CUDA
roiser Feb 5, 2024
961748e
merge with latest master
oliviermattelaer Feb 6, 2024
9599788
first version of SIMD loop
roiser Feb 9, 2024
a02f1d4
merge with upstream
roiser Feb 9, 2024
cbe30f9
bump submodule
roiser Feb 13, 2024
3e79a97
change check of channelIds (good helicities iteration)
roiser Feb 15, 2024
78d9619
copy channelIds only before matrix elements, not needed before
roiser Feb 15, 2024
c0f7f61
merge master into new_interface_wrap
roiser Feb 19, 2024
df0fd9e
check channelIds in case of good hels
roiser Feb 20, 2024
e15d814
remove overloads and add template parameter for the helper class instead
roiser Feb 27, 2024
c4933c2
remove static_cast s which are not needed anymore
roiser Feb 27, 2024
dc96e71
fix formatting
roiser Feb 27, 2024
cd6195e
remove/reduce comment
roiser Mar 4, 2024
f9b464a
remove the patch of auto_dsig1.f, it is now in the code generator
roiser Mar 6, 2024
535f238
move the plugin hash
roiser Mar 6, 2024
a1d4b59
first version ch ids codegen
roiser Mar 4, 2024
c64e07e
some missing bits for the codegen
roiser Mar 15, 2024
9d2a8f9
pass nullptr for good helicities
roiser Mar 15, 2024
41303b5
pass nullptr for good helicities
roiser Mar 15, 2024
535b348
fix for final calculations, skip over it for first run
roiser Mar 19, 2024
d8376d0
change the comparison, it's more clear that this in array now
roiser Mar 21, 2024
8bbd4a1
change the comparison, it's more clear that this in array now
roiser Mar 21, 2024
549f000
make sure that for cuda the channelIds are set to 0, cudaMallocHost d…
roiser Mar 26, 2024
66b4bdf
modifications for clang format
roiser Mar 27, 2024
ed0aafc
modifications for formatting (madevent)
roiser Mar 27, 2024
1ef9a7f
fixing the point that madevent_gpu was not calling the correct finali…
oliviermattelaer Apr 8, 2024
5433267
allow back gtest
oliviermattelaer Apr 8, 2024
070687a
fix CPU scalar version
roiser May 3, 2024
c6836bb
trigger ci
roiser May 4, 2024
db21029
Merge branch 'master' into new_interface_wrap
roiser May 4, 2024
008d9f9
fix google tests
roiser May 4, 2024
ea2ba77
regenerate gg_tt.mad in repo, CI mad tests will fail
roiser May 5, 2024
b2192d4
check channelid values inside wavefunctions
roiser May 6, 2024
f717dc4
Revert "check channelid values inside wavefunctions"
roiser May 6, 2024
9e935fd
check array value against 0 instead of array address
roiser May 28, 2024
845b602
check array value
roiser May 28, 2024
21fa29e
merge master
roiser May 28, 2024
5ba6803
remove auto_dsig1.f
roiser May 28, 2024
9b92636
fix constructor arguments
roiser May 29, 2024
fe467b5
HACK, avoid division by 0
roiser May 29, 2024
19dd506
fix formatting
roiser May 29, 2024
49bcf53
change submodule branch name
roiser May 30, 2024
b21ce18
Merge remote-tracking branch 'origin/master' into new_interface_wrap
roiser Jun 3, 2024
cda35f1
trigger CI
roiser Jun 5, 2024
a440a7d
Merge remote-tracking branch 'upstream/master' into new_interface_wrap
roiser Jun 5, 2024
a791e8f
add one susy process
roiser Jun 6, 2024
1138257
trigger CI
roiser Jun 6, 2024
1478452
disable restore cache
roiser Jun 6, 2024
fe10835
trigger CI 2
roiser Jun 6, 2024
7ffd2a5
enable again caches in the CI
roiser Jun 6, 2024
ac89492
Merge remote-tracking branch 'upstream/master_june24' into new_interf…
roiser Jun 7, 2024
1430883
fix iconfigC
roiser Jun 7, 2024
5c0143f
move submodule
roiser Jun 7, 2024
66c2487
fix formatting
roiser Jun 7, 2024
e14efb1
fix formatting
roiser Jun 7, 2024
e5bc08f
trigger CI
roiser Jun 7, 2024
5c61b10
trigger CI
roiser Jun 7, 2024
c2b174b
add master_june24 as branch for CI
roiser Jun 7, 2024
78aad4a
add branch master_june24 for CI
roiser Jun 7, 2024
4 changes: 2 additions & 2 deletions .github/workflows/c-cpp.yml
@@ -2,9 +2,9 @@ name: C/C++ CI

on:
push:
- branches: [ master ]
+ branches: [ master, master_june24 ]
pull_request:
- branches: [ master ]
+ branches: [ master, master_june24 ]

jobs:
debug_builds:
2 changes: 1 addition & 1 deletion .github/workflows/testsuite_allprocesses.yml
@@ -16,7 +16,7 @@ on:

# Trigger the all-processes workflow for pull requests to master
pull_request:
- branches: [ master ]
+ branches: [ master, master_june24 ]

# Trigger the all-processes workflow when new changes to the workflow are pushed
# (NB: this is now disabled to avoid triggering two jobs when pushing to a branch for which a PR is opened)
2 changes: 1 addition & 1 deletion .github/workflows/testsuite_oneprocess.yml
@@ -35,7 +35,7 @@ on:
###type: string
type: choice
# FIXME? Can the list of supported processes be specified only once in oneprocess.yml or allprocesses.yml?
- options: [gg_tt.mad, gg_ttg.mad, gg_ttgg.mad, gg_ttggg.mad, ee_mumu.mad, nobm_pp_ttW.mad]
+ options: [gg_tt.mad, gg_ttg.mad, gg_ttgg.mad, gg_ttggg.mad, ee_mumu.mad, nobm_pp_ttW.mad, susy_gg_tt.mad]

#----------------------------------------------------------------------------------------------------------------------------------

2 changes: 1 addition & 1 deletion .gitmodules
@@ -1,4 +1,4 @@
[submodule "MG5aMC/mg5amcnlo"]
path = MG5aMC/mg5amcnlo
url = https://github.com/mg5amcnlo/mg5amcnlo
- branch = gpucpp
+ branch = gpucpp_wrap
2 changes: 1 addition & 1 deletion MG5aMC/mg5amcnlo
Submodule mg5amcnlo updated 1998 files
@@ -1,161 +1,3 @@
diff --git b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f a/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f
index 4fbb8e6ba..f9e2335de 100644
--- b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f
+++ a/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f
@@ -484,23 +484,140 @@ C
INTEGER VECSIZE_USED

INTEGER IVEC
-
-
+ INTEGER IEXT
+
+ INTEGER ISUM_HEL
+ LOGICAL MULTI_CHANNEL
+ COMMON/TO_MATRIX/ISUM_HEL, MULTI_CHANNEL
+
+ LOGICAL FIRST_CHID
+ SAVE FIRST_CHID
+ DATA FIRST_CHID/.TRUE./
+
+#ifdef MG5AMC_MEEXPORTER_CUDACPP
+ INCLUDE 'coupl.inc' ! for ALL_G
+ INCLUDE 'fbridge.inc'
+ INCLUDE 'fbridge_common.inc'
+ INCLUDE 'genps.inc'
+ INCLUDE 'run.inc'
+ DOUBLE PRECISION OUT2(VECSIZE_MEMMAX)
+ INTEGER SELECTED_HEL2(VECSIZE_MEMMAX)
+ INTEGER SELECTED_COL2(VECSIZE_MEMMAX)
+ DOUBLE PRECISION CBYF1
+ INTEGER*4 NGOODHEL, NTOTHEL
+
+ INTEGER*4 NWARNINGS
+ SAVE NWARNINGS
+ DATA NWARNINGS/0/
+
+ LOGICAL FIRST
+ SAVE FIRST
+ DATA FIRST/.TRUE./
+
+ IF( FBRIDGE_MODE .LE. 0 ) THEN ! (FortranOnly=0 or BothQuiet=-1 or BothDebug=-2)
+#endif
+ call counters_smatrix1multi_start( -1, VECSIZE_USED ) ! fortran=-1
!$OMP PARALLEL
!$OMP DO
- DO IVEC=1, VECSIZE_USED
- CALL SMATRIX1(P_MULTI(0,1,IVEC),
- & hel_rand(IVEC),
- & col_rand(IVEC),
- & channel,
- & IVEC,
- & out(IVEC),
- & selected_hel(IVEC),
- & selected_col(IVEC)
- & )
- ENDDO
+ DO IVEC=1, VECSIZE_USED
+ CALL SMATRIX1(P_MULTI(0,1,IVEC),
+ & hel_rand(IVEC),
+ & col_rand(IVEC),
+ & channel,
+ & IVEC,
+ & out(IVEC),
+ & selected_hel(IVEC),
+ & selected_col(IVEC)
+ & )
+ ENDDO
!$OMP END DO
!$OMP END PARALLEL
+ call counters_smatrix1multi_stop( -1 ) ! fortran=-1
+#ifdef MG5AMC_MEEXPORTER_CUDACPP
+ ENDIF
+
+ IF( FBRIDGE_MODE .EQ. 1 .OR. FBRIDGE_MODE .LT. 0 ) THEN ! (CppOnly=1 or BothQuiet=-1 or BothDebug=-2)
+ IF( LIMHEL.NE.0 ) THEN
+ WRITE(6,*) 'ERROR! The cudacpp bridge only supports LIMHEL=0'
+ STOP
+ ENDIF
+ IF ( FIRST ) THEN ! exclude first pass (helicity filtering) from timers (#461)
+ CALL FBRIDGESEQUENCE_NOMULTICHANNEL( FBRIDGE_PBRIDGE, ! multi channel disabled for helicity filtering
+ & P_MULTI, ALL_G, HEL_RAND, COL_RAND, OUT2,
+ & SELECTED_HEL2, SELECTED_COL2 )
+ FIRST = .FALSE.
+c ! This is a workaround for https://github.com/oliviermattelaer/mg5amc_test/issues/22 (see PR #486)
+ IF( FBRIDGE_MODE .EQ. 1 ) THEN ! (CppOnly=1 : SMATRIX1 is not called at all)
+ CALL RESET_CUMULATIVE_VARIABLE() ! mimic 'avoid bias of the initialization' within SMATRIX1
+ ENDIF
+ CALL FBRIDGEGETNGOODHEL(FBRIDGE_PBRIDGE,NGOODHEL,NTOTHEL)
+ IF( NTOTHEL .NE. NCOMB ) THEN
+ WRITE(6,*) 'ERROR! Cudacpp/Fortran mismatch',
+ & ' in total number of helicities', NTOTHEL, NCOMB
+ STOP
+ ENDIF
+ WRITE (6,*) 'NGOODHEL =', NGOODHEL
+ WRITE (6,*) 'NCOMB =', NCOMB
+ ENDIF
+ call counters_smatrix1multi_start( 0, VECSIZE_USED ) ! cudacpp=0
+ IF ( .NOT. MULTI_CHANNEL ) THEN
+ CALL FBRIDGESEQUENCE_NOMULTICHANNEL( FBRIDGE_PBRIDGE, ! multi channel disabled
+ & P_MULTI, ALL_G, HEL_RAND, COL_RAND, OUT2,
+ & SELECTED_HEL2, SELECTED_COL2 )
+ ELSE
+ IF( SDE_STRAT.NE.1 ) THEN
+ WRITE(6,*) 'ERROR! The cudacpp bridge requires SDE=1' ! multi channel single-diagram enhancement strategy
+ STOP
+ ENDIF
+ CALL FBRIDGESEQUENCE(FBRIDGE_PBRIDGE, P_MULTI, ALL_G,
+ & HEL_RAND, COL_RAND, CHANNEL, OUT2,
+ & SELECTED_HEL2, SELECTED_COL2 ) ! 1-N: multi channel enabled
+ ENDIF
+ call counters_smatrix1multi_stop( 0 ) ! cudacpp=0
+ ENDIF
+
+ IF( FBRIDGE_MODE .LT. 0 ) THEN ! (BothQuiet=-1 or BothDebug=-2)
+ DO IVEC=1, VECSIZE_USED
+ CBYF1 = OUT2(IVEC)/OUT(IVEC) - 1
+ FBRIDGE_NCBYF1 = FBRIDGE_NCBYF1 + 1
+ FBRIDGE_CBYF1SUM = FBRIDGE_CBYF1SUM + CBYF1
+ FBRIDGE_CBYF1SUM2 = FBRIDGE_CBYF1SUM2 + CBYF1 * CBYF1
+ IF( CBYF1 .GT. FBRIDGE_CBYF1MAX ) FBRIDGE_CBYF1MAX = CBYF1
+ IF( CBYF1 .LT. FBRIDGE_CBYF1MIN ) FBRIDGE_CBYF1MIN = CBYF1
+ IF( FBRIDGE_MODE .EQ. -2 ) THEN ! (BothDebug=-2)
+ WRITE (*,'(I4,2E16.8,F23.11,I3,I3,I4,I4)')
+ & IVEC, OUT(IVEC), OUT2(IVEC), 1+CBYF1,
+ & SELECTED_HEL(IVEC), SELECTED_HEL2(IVEC),
+ & SELECTED_COL(IVEC), SELECTED_COL2(IVEC)
+ ENDIF
+ IF( ABS(CBYF1).GT.5E-5 .AND. NWARNINGS.LT.20 ) THEN
+ NWARNINGS = NWARNINGS + 1
+ WRITE (*,'(A,I4,A,I4,2E16.8,F23.11)')
+ & 'WARNING! (', NWARNINGS, '/20) Deviation more than 5E-5',
+ & IVEC, OUT(IVEC), OUT2(IVEC), 1+CBYF1
+ ENDIF
+ END DO
+ ENDIF
+
+ IF( FBRIDGE_MODE .EQ. 1 .OR. FBRIDGE_MODE .LT. 0 ) THEN ! (CppOnly=1 or BothQuiet=-1 or BothDebug=-2)
+ DO IVEC=1, VECSIZE_USED
+ OUT(IVEC) = OUT2(IVEC) ! use the cudacpp ME instead of the fortran ME!
+ SELECTED_HEL(IVEC) = SELECTED_HEL2(IVEC) ! use the cudacpp helicity instead of the fortran helicity!
+ SELECTED_COL(IVEC) = SELECTED_COL2(IVEC) ! use the cudacpp color instead of the fortran color!
+ END DO
+ ENDIF
+#endif
+
+ IF ( FIRST_CHID ) THEN
+ IF ( MULTI_CHANNEL ) THEN
+ WRITE (*,*) 'MULTI_CHANNEL = TRUE'
+ ELSE
+ WRITE (*,*) 'MULTI_CHANNEL = FALSE'
+ ENDIF
+ WRITE (*,*) 'CHANNEL_ID =', CHANNEL
+ FIRST_CHID = .FALSE.
+ ENDIF
+
RETURN
END

diff --git b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f a/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f
index 71fbf2b25..0f1d199fc 100644
--- b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f
@@ -67,6 +67,8 @@ def reset_simd(self, old_value, new_value, name):
if name == "vector_size" and new_value <= int(old_value):
# code can handle the new size -> do not recompile
return

# ok need to force recompilation of the cpp part
Sourcedir = pjoin(os.path.dirname(os.path.dirname(self.path)), 'Source')
subprocess.call(['make', 'cleanavx'], cwd=Sourcedir, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

@@ -103,7 +105,7 @@ def default_setup(self):
def write_one_include_file(self, output_dir, incname, output_file=None):
"""write one include file at the time"""
if incname == "vector.inc":
- if 'vector_size' not in self.user_set: return
+ if 'vector_size' not in self.user_set and 'wrap_size' not in self.user_set: return
if output_file is None: vectorinc=pjoin(output_dir,incname)
else: vectorinc=output_file
with open(vectorinc+'.new','w') as fileout:
@@ -107,7 +107,7 @@ namespace mg5amcCpu
* @param gs the pointer to the input Gs (running QCD coupling constant alphas)
* @param rndhel the pointer to the input random numbers for helicity selection
* @param rndcol the pointer to the input random numbers for color selection
- * @param channelId the Feynman diagram to enhance in multi-channel mode if 1 to n (disable multi-channel if 0)
+ * @param channelIds the Feynman diagram to enhance in multi-channel mode if 1 to n
* @param mes the pointer to the output matrix elements
* @param goodHelOnly quit after computing good helicities?
* @param selhel the pointer to the output selected helicities
@@ -117,7 +117,7 @@ namespace mg5amcCpu
const FORTRANFPTYPE* gs,
const FORTRANFPTYPE* rndhel,
const FORTRANFPTYPE* rndcol,
- const unsigned int channelId,
+ const unsigned int* channelIds,
FORTRANFPTYPE* mes,
int* selhel,
int* selcol,
@@ -130,7 +130,7 @@ namespace mg5amcCpu
* @param gs the pointer to the input Gs (running QCD coupling constant alphas)
* @param rndhel the pointer to the input random numbers for helicity selection
* @param rndcol the pointer to the input random numbers for color selection
- * @param channelId the Feynman diagram to enhance in multi-channel mode if 1 to n (disable multi-channel if 0)
+ * @param channelIds the Feynman diagram to enhance in multi-channel mode if 1 to n
* @param mes the pointer to the output matrix elements
* @param selhel the pointer to the output selected helicities
* @param selcol the pointer to the output selected colors
@@ -140,7 +140,7 @@ namespace mg5amcCpu
const FORTRANFPTYPE* gs,
const FORTRANFPTYPE* rndhel,
const FORTRANFPTYPE* rndcol,
- const unsigned int channelId,
+ const unsigned int* channelIds,
FORTRANFPTYPE* mes,
int* selhel,
int* selcol,
@@ -168,12 +168,14 @@ namespace mg5amcCpu
DeviceBufferMatrixElements m_devMEs;
DeviceBufferSelectedHelicity m_devSelHel;
DeviceBufferSelectedColor m_devSelCol;
+ DeviceBufferChannelIds m_devChanIds;
PinnedHostBufferGs m_hstGs;
PinnedHostBufferRndNumHelicity m_hstRndHel;
PinnedHostBufferRndNumColor m_hstRndCol;
PinnedHostBufferMatrixElements m_hstMEs;
PinnedHostBufferSelectedHelicity m_hstSelHel;
PinnedHostBufferSelectedColor m_hstSelCol;
+ PinnedHostBufferChannelIds m_hstChanIds;
std::unique_ptr<MatrixElementKernelDevice> m_pmek;
//static constexpr int s_gputhreadsmin = 16; // minimum number of gpu threads (TEST VALUE FOR MADEVENT)
static constexpr int s_gputhreadsmin = 32; // minimum number of gpu threads (DEFAULT)
@@ -185,6 +187,7 @@ namespace mg5amcCpu
HostBufferMatrixElements m_hstMEs;
HostBufferSelectedHelicity m_hstSelHel;
HostBufferSelectedColor m_hstSelCol;
+ HostBufferChannelIds m_hstChanIds;
std::unique_ptr<MatrixElementKernelHost> m_pmek;
#endif
};
@@ -227,6 +230,7 @@ namespace mg5amcCpu
, m_devMEs( m_nevt )
, m_devSelHel( m_nevt )
, m_devSelCol( m_nevt )
+ , m_devChanIds( m_nevt )
#else
, m_hstMomentaC( m_nevt )
#endif
@@ -236,11 +240,15 @@ namespace mg5amcCpu
, m_hstMEs( m_nevt )
, m_hstSelHel( m_nevt )
, m_hstSelCol( m_nevt )
+ , m_hstChanIds( m_nevt )
, m_pmek( nullptr )
{
if( nparF != CPPProcess::npar ) throw std::runtime_error( "Bridge constructor: npar mismatch" );
if( np4F != CPPProcess::np4 ) throw std::runtime_error( "Bridge constructor: np4 mismatch" );
#ifdef MGONGPUCPP_GPUIMPL
+ // this memory is allocated with cuda/hipMallocHost. The documentation does not guarantee
+ // that it is properly default initialized but we rely on this later on in sigmaKin
+ std::fill_n( m_hstChanIds.data(), m_nevt, 0 );
if( ( m_nevt < s_gputhreadsmin ) || ( m_nevt % s_gputhreadsmin != 0 ) )
throw std::runtime_error( "Bridge constructor: nevt should be a multiple of " + std::to_string( s_gputhreadsmin ) );
while( m_nevt != m_gpublocks * m_gputhreads )
@@ -252,10 +260,10 @@ namespace mg5amcCpu
}
std::cout << "WARNING! Instantiate device Bridge (nevt=" << m_nevt << ", gpublocks=" << m_gpublocks << ", gputhreads=" << m_gputhreads
<< ", gpublocks*gputhreads=" << m_gpublocks * m_gputhreads << ")" << std::endl;
- m_pmek.reset( new MatrixElementKernelDevice( m_devMomentaC, m_devGs, m_devRndHel, m_devRndCol, m_devMEs, m_devSelHel, m_devSelCol, m_gpublocks, m_gputhreads ) );
+ m_pmek.reset( new MatrixElementKernelDevice( m_devMomentaC, m_devGs, m_devRndHel, m_devRndCol, m_devChanIds, m_devMEs, m_devSelHel, m_devSelCol, m_gpublocks, m_gputhreads ) );
#else
std::cout << "WARNING! Instantiate host Bridge (nevt=" << m_nevt << ")" << std::endl;
- m_pmek.reset( new MatrixElementKernelHost( m_hstMomentaC, m_hstGs, m_hstRndHel, m_hstRndCol, m_hstMEs, m_hstSelHel, m_hstSelCol, m_nevt ) );
+ m_pmek.reset( new MatrixElementKernelHost( m_hstMomentaC, m_hstGs, m_hstRndHel, m_hstRndCol, m_hstChanIds, m_hstMEs, m_hstSelHel, m_hstSelCol, m_nevt ) );
#endif // MGONGPUCPP_GPUIMPL
// Create a process object, read param card and set parameters
// FIXME: the process instance can happily go out of scope because it is only needed to read parameters?
@@ -297,7 +305,7 @@ namespace mg5amcCpu
const FORTRANFPTYPE* gs,
const FORTRANFPTYPE* rndhel,
const FORTRANFPTYPE* rndcol,
- const unsigned int channelId,
+ const unsigned int* channelIds,
FORTRANFPTYPE* mes,
int* selhel,
int* selcol,
@@ -327,6 +335,7 @@ namespace mg5amcCpu
std::copy( rndhel, rndhel + m_nevt, m_hstRndHel.data() );
std::copy( rndcol, rndcol + m_nevt, m_hstRndCol.data() );
}
+ if( channelIds ) memcpy( m_hstChanIds.data(), channelIds, m_nevt * sizeof( unsigned int ) );
copyDeviceFromHost( m_devGs, m_hstGs );
copyDeviceFromHost( m_devRndHel, m_hstRndHel );
copyDeviceFromHost( m_devRndCol, m_hstRndCol );
@@ -336,7 +345,8 @@ namespace mg5amcCpu
if( m_nGoodHel < 0 ) throw std::runtime_error( "Bridge gpu_sequence: computeGoodHelicities returned nGoodHel<0" );
}
if( goodHelOnly ) return;
- m_pmek->computeMatrixElements( channelId );
+ copyDeviceFromHost( m_devChanIds, m_hstChanIds );
+ m_pmek->computeMatrixElements();
copyHostFromDevice( m_hstMEs, m_devMEs );
flagAbnormalMEs( m_hstMEs.data(), m_nevt );
copyHostFromDevice( m_hstSelHel, m_devSelHel );
@@ -362,7 +372,7 @@ namespace mg5amcCpu
const FORTRANFPTYPE* gs,
const FORTRANFPTYPE* rndhel,
const FORTRANFPTYPE* rndcol,
- const unsigned int channelId,
+ const unsigned int* channelIds,
FORTRANFPTYPE* mes,
int* selhel,
int* selcol,
@@ -387,7 +397,8 @@ namespace mg5amcCpu
if( m_nGoodHel < 0 ) throw std::runtime_error( "Bridge cpu_sequence: computeGoodHelicities returned nGoodHel<0" );
}
if( goodHelOnly ) return;
- m_pmek->computeMatrixElements( channelId );
+ if( channelIds ) memcpy( m_hstChanIds.data(), channelIds, m_nevt * sizeof( unsigned int ) );
+ m_pmek->computeMatrixElements();
flagAbnormalMEs( m_hstMEs.data(), m_nevt );
if constexpr( std::is_same_v<FORTRANFPTYPE, fptype> )
{
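The Bridge hunks above replace the single `channelId` argument with a per-event `channelIds` array that is copied into a host buffer only when the pointer is non-null. The following is a minimal standalone sketch of that pattern, not the actual Bridge code: the class `ChannelIdBuffer` and its methods `update` and `ids` are hypothetical names introduced for illustration, and a `std::vector` stands in for the pinned host buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the host channel-ID buffer (NOT the real Bridge class)
class ChannelIdBuffer
{
public:
  explicit ChannelIdBuffer( std::size_t nevt )
    : m_ids( nevt )
  {
    assert( nevt > 0 );
    // std::vector already zero-initialises its elements; the explicit fill mirrors the
    // std::fill_n in the Bridge constructor, where pinned memory is not guaranteed to be zeroed
    std::fill_n( m_ids.data(), m_ids.size(), 0u );
  }

  // Copy one channel ID per event; a null pointer leaves the buffer unchanged
  void update( const unsigned int* channelIds )
  {
    if( channelIds ) std::memcpy( m_ids.data(), channelIds, m_ids.size() * sizeof( unsigned int ) );
  }

  // Read-only access to the stored channel IDs
  const std::vector<unsigned int>& ids() const { return m_ids; }

private:
  std::vector<unsigned int> m_ids; // one channel ID per event
};
```

In this sketch, a caller that passes `nullptr` on the first call sees all-zero channel IDs, while a later `nullptr` call keeps whatever was copied previously. Whether the real Bridge relies on that second behaviour is not stated in the diff.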