GPU improved launch parameters, memory usage, etc. #1100

markusbattarbee · 2025-02-17T14:48:10Z

Differentiates between Ukko DGX (V100) nodes and A100 nodes, and updates to current module versions. DGX has much better queue times, but a run is still a bit too slow for CI (about 1 h 20 min). In addition to terrible queue times, A100 nodes suffer from terrible core/thread/GPU placement, which is awaiting fixes from IT4sci.

Initial GPU memory buffers now scale with the WID value, so WID=8 binaries no longer hog excessive amounts of memory. Also fixes a mis-indexing in GPU memory reporting. Some tests still report more GPU memory use than expected, pending investigation.
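As a rough illustration of why the scaling matters: a velocity block holds WID³ cells, so a buffer sized as a fixed number of blocks takes 8 times more memory at WID=8 than at WID=4. The sketch below is not the code from this PR. The helper names, the WID=4 reference, and the inverse scaling rule are all assumptions chosen to show one way of keeping the byte footprint roughly constant.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper, not taken from the PR: reduces the initial block count
// as WID grows so that the buffer's byte footprint stays roughly constant.
constexpr std::size_t initialBlockCount(std::size_t baseBlocksAtWid4, std::size_t wid) {
    const std::size_t refCells = 4 * 4 * 4;     // cells per block at WID=4 (assumed reference)
    const std::size_t cells = wid * wid * wid;  // cells per block at this WID
    // Scale inversely with cells per block, but never go below one block.
    return std::max<std::size_t>(1, baseBlocksAtWid4 * refCells / cells);
}

// Hypothetical helper: bytes needed for `blocks` blocks of WID^3 cells each.
constexpr std::size_t bufferBytes(std::size_t blocks, std::size_t wid, std::size_t bytesPerCell) {
    return blocks * wid * wid * wid * bytesPerCell;
}
```

With a base of 1024 blocks at WID=4, this yields 128 blocks at WID=8, and both buffers occupy the same number of bytes.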

Removes some broken timers from the GPU branch.

Improves readability of batch operation launch parameters (by using std::max).

Improves launch parameters for GPU kernels, especially when not using warp accessors (which is the default).
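For readers unfamiliar with the pattern, the following is a minimal sketch of how std::max can keep launch parameter calculations readable and safe. It is not the code from this PR: `LaunchParams`, `launchParamsFor`, and the one-thread-per-element mapping are hypothetical and only illustrate the clamping idea.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical container for a kernel launch configuration.
struct LaunchParams {
    std::size_t blocks;   // grid size (number of thread blocks)
    std::size_t threads;  // threads per block
};

// Hypothetical helper: one thread per element, rounded up to whole thread blocks.
constexpr LaunchParams launchParamsFor(std::size_t nElements, std::size_t maxThreads) {
    // Never request more threads per block than there are elements, nor fewer than one.
    const std::size_t threads = std::max<std::size_t>(1, std::min(nElements, maxThreads));
    // Ceiling division; std::max guards against a zero-sized grid when nElements == 0.
    const std::size_t blocks = std::max<std::size_t>(1, (nElements + threads - 1) / threads);
    return {blocks, threads};
}
```

The resulting values could then be passed to a kernel launch as the grid and block dimensions. Clamping to at least one block avoids an invalid launch configuration when the element count is zero.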

markusbattarbee · 2025-02-17T15:00:35Z

It would now be possible to go back to a single ukko CUDA makefile that compiles and links the binary to support both V100 and A100 cards (architectures 70 and 80), but compilation would take longer, and compiling for an older architecture should be safe. Thoughts?

ykempf

If it otherwise works... But apart from this one comment I have not looked in detail.

spatial_cells/velocity_block_container.h

ykempf · 2025-02-19T10:17:46Z

And here too the output file is truncated mid-Ionos

markusbattarbee added 11 commits February 6, 2025 21:34

Better launch parameters and adjustments to GPU kernels, especially in block adjustment.

5f8974e

interim commit - re-working memory to get WID8 TP to fit in memory. Looks like a memory leak.

84f0b1b

Cleanup. Some unreported memory usage still, but much better after hashinator fixes.

550bd24

Rename ukko gpu files to a100

de251c7

Ukko A100 and DGX differentiation

59b4f05

Merge branch 'gpu_improve_block_adjustment_launch_parameters' into ukko_dgx_CI

f3ba36c

Cleanup, timer removals, and something to make the single-block translation tests work again

e231482

purge blocksPerGpu serial looping after all

93d9a7d

Merge branch 'gpu_improve_block_adjustment_launch_parameters' into ukko_dgx_CI_launcparams

9e30843

reinstate gpu shrink to fit, fixes to ukko launch scripts. DGX is still a bit too for CI, A100 placements remain broken.

d3a577f

Fix typo in vlasiator_arch ukko dgx CI

3362c58

markusbattarbee added improved-memory-usage gpu profiling CI Continuous Integration labels Feb 17, 2025

markusbattarbee added 3 commits February 17, 2025 17:03

Fix mistake in CPU side looping for setNewCapacityShrink

4304c22

Fix apparent error in VBC reallocation calculations

e5e8ba1

Finally fix setNewCapacityShrink. Also fix mistake in sizeInBytes.

51d5628

markusbattarbee requested a review from ykempf February 18, 2025 19:54

ykempf approved these changes Feb 19, 2025

spatial_cells/velocity_block_container.h (outdated, resolved)

markusbattarbee added 8 commits February 19, 2025 14:20

Reduce output from CI to fit 64 kB limit. Some cleanup to testpackage files and runs.

2a7135d

Merge branch 'dev' into ukko_dgx_CI_launcparams

f3f08b3

Fix mistake in merge conflict resolution

7931cd5

Adjust awk exlusion syntax

0ed6401

Adjust awk calling syntax

84ae42d

Try switching to just grep instead of awk

2e5b132

Attempt different approach for vlsvdiff in CI bash script

f2959aa

vlsvdiff output via variable, should also work for timestamp checking now

596a3b8

markusbattarbee added 4 commits February 19, 2025 16:46

Only evalute timestamp difference from vg output files

e6d29f8

Squelch warnings about uninitialized variables in vlsvextract.cpp

26036cf

Merge branch 'dev' into ukko_dgx_CI_launcparams

bc8f015

Ensure up-to-date dccrg version

6c780b7

markusbattarbee mentioned this pull request Feb 20, 2025

Try building will all debug flags activated in CI to check the debug … #1087

Open
