[WIP] Alpaka develop (pre-0.4.0) #2807


sbastrakov
Member

@sbastrakov sbastrakov commented Nov 14, 2018

As discussed with @ax3l, replaced alpaka 0.3.4 with the current alpaka develop branch and made the necessary changes in the picongpu code.

Alpaka renamed streams to queues. I've changed variable and type names accordingly, but for now kept names in cupla/types.hpp such as AccHostStream, as they seemed more global to me.

So far I have only tested that the standard LWFA example compiles on CPU and GPU and runs on GPU, hence WIP.

Alpaka Pre-0.4.0 Version

Update Alpaka from

Used Update Command

GIT_AUTHOR_NAME="Third Party" GIT_AUTHOR_EMAIL="picongpu@hzdr.de" \
  git subtree pull --prefix thirdParty/alpaka \
  https://github.com/ComputationalRadiationPhysics/alpaka.git develop --squash

Third Party and others added 3 commits November 14, 2018 17:17
@ax3l ax3l added the component: third party third party libraries that are shipped and/or linked label Nov 14, 2018
@ax3l ax3l changed the title [WIP] Alpaka develop [WIP] Alpaka develop (pre-0.4.0) Nov 14, 2018
@ax3l ax3l added backend: cuda CUDA backend backend: omp2b OpenMP2 backend labels Nov 14, 2018
@ax3l
Member

ax3l commented Nov 14, 2018

@sbastrakov uh dang, I just saw some updates need to go also into https://github.com/ComputationalRadiationPhysics/cupla

@sbastrakov
Member Author

It's actually almost exclusively cupla changes, with only a minor change to pmacc. Sorry, should have pointed that out myself. So what is the course of action then, after I try other examples and figure out the failing tests?

@ax3l
Member

ax3l commented Nov 14, 2018 via email

@sbastrakov
Member Author

sbastrakov commented Nov 15, 2018

I guess I'm blind, but from the failing test log I actually don't see what exactly goes wrong: for all the failing examples after 45% there are some warnings and then the log stops. What am I missing here?

On hypnos all standard examples compile fine.

@ax3l
Member

ax3l commented Nov 15, 2018 via email

@sbastrakov
Member Author

sbastrakov commented Nov 16, 2018

Now the situation looks weird to me. I've tried your command (with the destination directory added at the end) on hypnos, and everything except both configurations of FoilLCT builds there (while on the testing system all examples seem to fail). However, the error message looks cropped (strangely reminiscent of the testing system), so it is not really informative to either me or @psychocoderHPC. He suggested the error might be somehow related to overly long names. He also suggested I try removing some particle species from this example to see if it starts compiling. The same example, however, builds fine on Hemera.

@psychocoderHPC
Member

Discussed offline: @sbastrakov will also try to compile it on Hemera with CUDA 9.2.

@sbastrakov
Member Author

@psychocoderHPC I already did and it compiles fine. Now trying your other idea of removing some of the existing particle species.

@ax3l
Member

ax3l commented Nov 19, 2018

Just as a note, these are the software dependencies loaded on the compile suite:

module load gcc/4.9.4 boost/1.62.0 cmake/3.10.0 cuda/8.0.44 openmpi/1.10.4
module load libSplash/1.7.0 adios/1.13.1
module load pngwriter/0.7.0
module load libjpeg-turbo/1.5.1 icet/2.1.1 jansson/2.9 isaac/1.4.0

(Intentionally using the oldest supported versions of things.)

@ax3l ax3l added this to the 0.5.0 / 1.0.0: Next Stable milestone Nov 19, 2018
@sbastrakov
Member Author

@ax3l thanks for the info. Not related to the bug in question, but for future use: is there any easy way to see those versions from the logs alone? If not, it may be worth adding.

@ax3l
Member

ax3l commented Nov 19, 2018

Currently documented here and here, but usually visible in the CMake output, which I intentionally crop.
Nevertheless, the current proxy is a (long-term) hack until we have our in-house CI ready: [1] [2]
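As an illustration of the question above, here is a minimal sketch of how a build script could print tool versions at the top of its log, so they survive even when the CMake output is cropped. The helper name `log_tool_version` and the tool list are assumptions for illustration; this is not how the compile suite actually works.

```shell
#!/bin/sh
# Hypothetical helper (not part of the compile suite): print one line per tool.
log_tool_version() {
    tool="$1"
    if command -v "$tool" >/dev/null 2>&1; then
        # Take the first output line that contains something like "x.y";
        # for most tools this is the line carrying the release number.
        printf '%s: %s\n' "$tool" \
            "$("$tool" --version 2>&1 | grep '[0-9]\.[0-9]' | head -n 1)"
    else
        printf '%s: not found\n' "$tool"
    fi
}

# Assumed tool list; adjust to whatever the build actually depends on.
version_header=$(
    for tool in gcc cmake nvcc mpirun; do
        log_tool_version "$tool"
    done
)

# Emit the header before the build output starts.
printf '%s\n' "$version_header"
```

Because every tool produces exactly one line, the header stays short enough not to eat into a line-limited log.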

@sbastrakov
Member Author

sbastrakov commented Nov 19, 2018

Since I'm stuck again, here is a summary of the current status with the changes of this PR:

  • on the testing system all examples fail to build, however error messages are cropped
  • on hemera GPU all examples build fine
  • on hypnos k80 all examples except FoilLCT build fine

The problem in FoilLCT on hypnos k80 is caused by Thomas-Fermi ionization (build log). Modifying the example to not use this ionizer makes it build. Removing some particle species while keeping Thomas-Fermi does not help. @psychocoderHPC suggested it might be some kind of CUDA 8 bug (offline discussion). However, to me that does not really explain the difference between the testing system and hypnos k80.

@sbastrakov
Member Author

A little follow-up: with hypnos laser profile FoilLCT also builds fine.

@ax3l
Member

ax3l commented Nov 19, 2018

So many new warnings in Boost/Alpaka... I have now increased the limit of reported lines. (I crop them in the compile suite reports since the database storing them gets huge over time.)

Just push here again to trigger a new build and let's hope another 90 lines of warnings are enough to see the first error.
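When the full log is available locally, the first error can be located without relying on a line limit. The following is a minimal sketch on a made-up stand-in log; the file contents are invented for illustration and only mimic the shape of the real output.

```shell
#!/bin/sh
# Build a small stand-in log; a real compile-suite log would be far longer.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
main.cpp:10:5: warning: unused variable 'a'
main.cpp:11:5: warning: unused variable 'b'
main.cpp:12:5: warning: unused variable 'c'
stub.c:1259:8284: error: invalid application of sizeof to incomplete type
stub.c:1300:1: warning: unused variable 'd'
EOF

# Log line number of the first line that contains "error:".
first_error_line=$(grep -n 'error:' "$log_file" | head -n 1 | cut -d: -f1)

# Number of warnings that appear before that line.
warnings_before=$(head -n "$((first_error_line - 1))" "$log_file" | grep -c 'warning:')

echo "first error at log line $first_error_line after $warnings_before warnings"

# Print the error line with one line of context on each side.
sed -n "$((first_error_line - 1)),$((first_error_line + 1))p" "$log_file"

rm -f "$log_file"
```

On this stand-in log the script reports the first error at log line 4 after 3 warnings. Cropping a report around that line number, rather than from the top, would keep the stored output small while still showing the error.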

@sbastrakov
Copy link
Member Author

@ax3l I pushed yesterday evening and the testing is still pending. I wonder whether everything is alright with the testing system.

@ax3l
Member

ax3l commented Nov 20, 2018

Maybe one of the involved hooks failed or one of the participating services was briefly offline. Pls push again.

@ax3l
Member

ax3l commented Nov 20, 2018

The error seems to be too large to be reported to the proxy. Here it is, carved out manually:

In file included from tmpxft_00000f57_00000000-4_main.cudafe1.stub.c:1:0:
/tmp/tmpxft_00000f57_00000000-4_main.cudafe1.stub.c: In function 
void __device_stub__ZN6alpaka4exec4cuda6detail10cudaKernelISt17integral_constantImLm3EEjN5cupla11CuplaKernelIN8picongpu21KernelEnergyParticlesILj64EEEEEJN5pmacc12ParticlesBoxINSC_5FrameINSC_15ParticlesBufferINSC_19ParticleDescriptionINSC_11compileTime6StringIJLc101ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0EEEENSC_4math2CT6VectorIN4mpl_10integral_cIiLi8EEESP_NSN_2naEEEN5boost3mpl6v_itemINS8_24placeholder_definition2913momentumPrev1ENSU_INS8_24placeholder_definition2610particleIdENSU_INS8_24placeholder_definition309weightingENSU_INS8_24placeholder_definition288momentumENSU_INS8_24placeholder_definition258positionINS8_24placeholder_definition2712position_picENSC_24placeholder_definition2213pmacc_isAliasEEENST_7vector0ISQ_EELi0EEELi0EEELi0EEELi0EEELi0EEENSU_INS8_24placeholder_definition4421bremsstrahlungPhotonsINS8_9ParticlesINSI_IJLc112ELc104ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0EEEENSU_INS8_24placeholder_definition5111chargeRatioINS8_25placeholder_definition12718ChargeRatioPhotonsES18_EENSU_INS8_24placeholder_definition509massRatioINS8_25placeholder_definition12616MassRatioPhotonsES18_EENSU_INS8_24placeholder_definition4513interpolationINS8_28FieldToParticleInterpolationINS8_9particles6shapes3TSCENS8_30AssignedTrilinearInterpolationEEES18_EENSU_INS8_24placeholder_definition385shapeIS20_S18_EENSU_INS8_24placeholder_definition3914particlePusherINS1Y_6pusher6PhotonES18_EES1B_Li0EEELi0EEELi0EEELi0EEELi0EEES1G_EES18_EENSU_INS8_24placeholder_definition4318bremsstrahlungIonsINS1J_INSI_IJLc105ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0
ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0EEEENSU_INS8_24placeholder_definition4713atomicNumbersINS8_10ionization13atomicNumbers6Gold_tES18_EENSU_INS8_24placeholder_definition5212densityRatioINS8_25placeholder_definition13016DensityRatioIonsES18_EENSU_INS1M_INS8_25placeholder_definition12915ChargeRatioIonsES18_EENSU_INS1R_INS8_25placeholder_definition12813MassRatioIonsES18_EENSU_INS8_24placeholder_definition467currentINS8_13currentSolver9EsirkepovIS20_Lj2EEES18_EENSU_IS23_NSU_IS26_NSU_INS28_INS29_5BorisES18_EES1B_Li0EEELi0EEELi0EEELi0EEELi0EEELi0EEELi0EEELi0EEES1G_EES18_EENSU_INS2T_INS8_25placeholder_definition13321DensityRatioElectronsES18_EENSU_INS1M_INS8_25placeholder_definition13220ChargeRatioElectronsES18_EENSU_INS1R_INS8_25placeholder_definition13118MassRatioElectronsES18_EES3E_Li0EEELi0EEELi0EEELi0EEELi0EEENSC_17HandleGuardRegionINSC_9particles8policies17ExchangeParticlesENS1Y_8boundary29CallPluginsAndDeleteParticlesEEES1B_S1B_EESR_N8mallocMC9AllocatorINS47_16CreationPolicies7ScatterINS8_16DeviceHeapConfigENS49_11ScatterConf27DefaultScatterHashingParamsEEENS47_20DistributionPolicies4NoopENS47_11OOMPolicies10ReturnNullENS47_19ReservePoolPolicies16SimpleCudaMallocENS47_17AlignmentPolicies6ShrinkINS4L_12ShrinkConfig19DefaultShrinkConfigEEEEELj2EE29OperatorCreatePairStaticArrayILj64EEENSG_ISJ_SR_NSU_INSC_24placeholder_definition249multiMaskENSU_INSC_24placeholder_definition2312localCellIdxES1G_Li0EEELi0EEES3Y_S45_S1B_NSU_INSC_12NextFramePtrINSN_3argILi1EEEEENSU_INSC_16PreviousFramePtrIS52_EES1B_Li0EEELi0EEEEEEENS47_19AllocatorHandleImplIS4Q_EELj2EEENSC_7DataBoxINSC_10PitchedBoxIdLj1EEEEENSC_11AreaMappingILj3ENSC_18MappingDescriptionILj2ESR_EEEENSC_7functor9InterfaceINS1Y_6filter3AllELj1EbEEEEEvNS_3vec3VecIT_T0_EET1_DpT2_(const _ZN6alpaka3vec3VecISt17integral_constantImLm3EEjEE&, const 
_ZN5cupla11CuplaKernelIN8picongpu21KernelEnergyParticlesILj64EEEEE&, _ZN5pmacc13ParticlesBaseINS_19ParticleDescriptionINS_11compileTime6StringIJLc101ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0EEEENS_4math2CT6VectorIN4mpl_10integral_cIiLi8EEESA_NS8_2naEEEN5boost3mpl6v_itemIN8picongpu24placeholder_definition2913momentumPrev1ENSF_INSG_24placeholder_definition2610particleIdENSF_INSG_24placeholder_definition309weightingENSF_INSG_24placeholder_definition288momentumENSF_INSG_24placeholder_definition258positionINSG_24placeholder_definition2712position_picENS_24placeholder_definition2213pmacc_isAliasEEENSE_7vector0ISB_EELi0EEELi0EEELi0EEELi0EEELi0EEENSF_INSG_24placeholder_definition4421bremsstrahlungPhotonsINSG_9ParticlesINS3_IJLc112ELc104ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0EEEENSF_INSG_24placeholder_definition5111chargeRatioINSG_25placeholder_definition12718ChargeRatioPhotonsESU_EENSF_INSG_24placeholder_definition509massRatioINSG_25placeholder_definition12616MassRatioPhotonsESU_EENSF_INSG_24placeholder_definition4513interpolationINSG_28FieldToParticleInterpolationINSG_9particles6shapes3TSCENSG_30AssignedTrilinearInterpolationEEESU_EENSF_INSG_24placeholder_definition385shapeIS1M_SU_EENSF_INSG_24placeholder_definition3914particlePusherINS1K_6pusher6PhotonESU_EESX_Li0EEELi0EEELi0EEELi0EEELi0EEES12_EESU_EENSF_INSG_24placeholder_definition4318bremsstrahlungIonsINS15_INS3_IJLc105ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc
0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0EEEENSF_INSG_24placeholder_definition4713atomicNumbersINSG_10ionization13atomicNumbers6Gold_tESU_EENSF_INSG_24placeholder_definition5212densityRatioINSG_25placeholder_definition13016DensityRatioIonsESU_EENSF_INS18_INSG_25placeholder_definition12915ChargeRatioIonsESU_EENSF_INS1D_INSG_25placeholder_definition12813MassRatioIonsESU_EENSF_INSG_24placeholder_definition467currentINSG_13currentSolver9EsirkepovIS1M_Lj2EEESU_EENSF_IS1P_NSF_IS1S_NSF_INS1U_INS1V_5BorisESU_EESX_Li0EEELi0EEELi0EEELi0EEELi0EEELi0EEELi0EEELi0EEES12_EESU_EENSF_INS2F_INSG_25placeholder_definition13321DensityRatioElectronsESU_EENSF_INS18_INSG_25placeholder_definition13220ChargeRatioElectronsESU_EENSF_INS1D_INSG_25placeholder_definition13118MassRatioElectronsESU_EES30_Li0EEELi0EEELi0EEELi0EEELi0EEENS_17HandleGuardRegionINS_9particles8policies17ExchangeParticlesENS1K_8boundary29CallPluginsAndDeleteParticlesEEESX_SX_EENS_18MappingDescriptionILj2ESC_EEN8mallocMC9AllocatorINS3V_16CreationPolicies7ScatterINSG_16DeviceHeapConfigENS3X_11ScatterConf27DefaultScatterHashingParamsEEENS3V_20DistributionPolicies4NoopENS3V_11OOMPolicies10ReturnNullENS3V_19ReservePoolPolicies16SimpleCudaMallocENS3V_17AlignmentPolicies6ShrinkINS49_12ShrinkConfig19DefaultShrinkConfigEEEEEE16ParticlesBoxTypeE&, _ZN5pmacc7DataBoxINS_10PitchedBoxIdLj1EEEEE&, _ZN5pmacc11AreaMappingILj3ENS_18MappingDescriptionILj2ENS_4math2CT6VectorIN4mpl_10integral_cIiLi8EEES7_NS5_2naEEEEEEE&, _ZNSt3_MuISt12_PlaceholderILi1EELb0ELb1EE6resultIFS2_S1_St5tupleIJN5pmacc7functor9InterfaceIN8picongpu9particles6filter3AllELj1EbEEEEEE11__base_typeE&):
/tmp/tmpxft_00000f57_00000000-4_main.cudafe1.stub.c:1259:8284: error: invalid application of sizeof to incomplete type _ZNSt3_MuISt12_PlaceholderILi1EELb0ELb1EE6resultIFS2_S1_St5tupleIJN5pmacc7functor9InterfaceIN8picongpu9particles6filter3AllELj1EbEEEEEE11__base_typeE {aka _ZN5pmacc7functor9InterfaceIN8picongpu9particles6filter3AllELj1EbEE}
static void __device_stub__ZN6alpaka4exec4cuda6detail10cudaKernelISt17integral_constantImLm3EEjN5cupla11CuplaKernelIN8picongpu21KernelEnergyParticles ...

last.multiline.log.gz

Did you try building with CUDA 8 on Hypnos?
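The mangled names in the error above are hard to read as they stand. As a small sketch, assuming binutils' `c++filt` is installed, the type named after `aka` can be demangled like this (the script falls back to the raw name if the tool is missing):

```shell
#!/bin/sh
# Mangled name copied verbatim from the "{aka ...}" part of the error message.
mangled='_ZN5pmacc7functor9InterfaceIN8picongpu9particles6filter3AllELj1EbEE'

if command -v c++filt >/dev/null 2>&1; then
    # c++filt reads mangled symbols on stdin and prints them demangled.
    demangled=$(printf '%s\n' "$mangled" | c++filt)
else
    # Assumption: without c++filt we just keep the raw symbol.
    demangled="$mangled"
fi

printf '%s\n' "$demangled"
```

The symbol decodes to the `Interface` template in `pmacc::functor`, instantiated with `picongpu::particles::filter::All`, which at least narrows down which kernel argument the stub file fails on.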

@ax3l
Member

ax3l commented Nov 20, 2018

cc @BenjaminW3 just in case you haven't seen: it looks like we are struggling a little with CUDA/NVCC 8. @sbastrakov is currently on it.

@ax3l
Member

ax3l commented Nov 22, 2018

@sbastrakov please leave this CUDA 8.0 issue aside for now; @psychocoderHPC wants to take care of it later on and it's not super urgent. We might switch to CUDA/NVCC 9.0+ (C++14) at the beginning of next year.

Please open your PR with the new boundary conditions; that is of high priority.

@psychocoderHPC
Member

#2958 is the updated version of this PR.

Labels
backend: cuda CUDA backend backend: omp2b OpenMP2 backend component: third party third party libraries that are shipped and/or linked