Large data counts support for MPI Communication #1765

JuanPedroGHM · 2025-01-22T16:24:19Z

Due Diligence

General:
- title of the PR is suitable to appear in the Release Notes
Implementation:
- unit tests: all split configurations tested
- unit tests: multiple dtypes tested
- benchmarks: created for new functionality
- benchmarks: performance improved or maintained
- documentation updated where needed

Description

Some MPI implementations are limited to sending at most 2^31-1 elements at once. As far as I have tested, this also applies to OpenMPI 4.1 and 5.0, because large-count support has not been added to mpi4py (at least, it failed in my tests).

This small change uses the trick described here to pack contiguous data into an MPI Vector, raising the limit on the number of elements that can be sent.

This applies only to contiguous data, as non-contiguous data is already packed in recursive vector datatypes, reducing the need to apply this trick.
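
A minimal sketch of the counting arithmetic behind such a trick, not the code in this PR: the helper name `split_large_count` and its return layout are hypothetical, and it assumes the limit is the 32-bit signed maximum. The idea is to describe `count` elements as a number of equally sized blocks plus a remainder, so that every individual number handed to MPI stays below the limit.

```python
INT_MAX = 2**31 - 1  # assumed limit: largest value of a 32-bit signed int


def split_large_count(count: int, max_count: int = INT_MAX) -> tuple[int, int, int]:
    """Split `count` into (n_blocks, block_length, remainder).

    Hypothetical helper for illustration. The result satisfies
    n_blocks * block_length + remainder == count, and each of the
    three numbers is at most `max_count`.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if count <= max_count:
        # Small enough to be sent directly as a single block.
        return 1, count, 0
    n_blocks, remainder = divmod(count, max_count)
    if n_blocks > max_count:
        raise OverflowError("count is too large even for one level of blocking")
    return n_blocks, max_count, remainder


# With mpi4py, the blocks could be described by something like
# base_type.Create_vector(n_blocks, block_length, block_length);
# the remainder would still need to be handled separately.
n_blocks, block_length, remainder = split_large_count(2**32)
print(n_blocks, block_length, remainder)
```

For `2**32` elements this gives two blocks of `2**31 - 1` elements and a remainder of 2.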

Issue/s resolved: #

Changes proposed:

Use an MPI Vector to send more than 2^31-1 elements at once.

Type of change

Bug fix (non-breaking change which fixes an issue)

Does this change modify the behaviour of other functions? If so, which?

yes, probably a lot of them.

github-actions · 2025-01-22T16:31:28Z

Thank you for the PR!

codecov · 2025-01-22T17:06:49Z

Codecov Report

Attention: Patch coverage is 88.23529% with 2 lines in your changes missing coverage. Please review.

Project coverage is 92.25%. Comparing base (d66e404) to head (70f6432).

Files with missing lines	Patch %	Lines
heat/core/communication.py	88.23%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1765      +/-   ##
==========================================
- Coverage   92.26%   92.25%   -0.01%     
==========================================
  Files          84       84              
  Lines       12447    12463      +16     
==========================================
+ Hits        11484    11498      +14     
- Misses        963      965       +2

Flag	Coverage Δ
unit	`92.25% <88.23%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

github-actions · 2025-01-27T10:52:24Z

Thank you for the PR!

github-actions · 2025-01-27T10:52:30Z

Thank you for the PR!

mrfh92 · 2025-01-27T11:00:39Z

I have encountered the following problem:

import heat as ht
import torch

# 2**31 elements in total, one more than the 2^31-1 limit
shape = (2 ** 10, 2 ** 10, 2 ** 11)

data = torch.ones(shape, dtype=torch.float32) * ht.MPI_WORLD.rank
# in-place sum across all ranks
ht.MPI_WORLD.Allreduce(ht.MPI.IN_PLACE, data, ht.MPI.SUM)

results in the following error:

  File "/heat/heat/core/communication.py", line 915, in Allreduce
    ret, sbuf, rbuf, buf = self.__reduce_like(self.handle.Allreduce, sendbuf, recvbuf, op)
  File "/heat/heat/core/communication.py", line 895, in __reduce_like
    return func(sendbuf, recvbuf, *args, **kwargs), sbuf, rbuf, buf
  File "src/mpi4py/MPI.src/Comm.pyx", line 1115, in mpi4py.MPI.Comm.Allreduce
mpi4py.MPI.Exception: MPI_ERR_OP: invalid reduce operation

With 2 ** 10 in the last entry of shape, there is no problem, so it seems to be related to large counts.
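
A quick sanity check of the two element counts, as a sketch: `INT_MAX` is assumed to be the 2^31-1 limit mentioned in the PR description, and the variable names are only for illustration.

```python
import math

INT_MAX = 2**31 - 1  # assumed limit on the element count

failing_shape = (2**10, 2**10, 2**11)  # shape from the example above
working_shape = (2**10, 2**10, 2**10)  # same shape with 2 ** 10 as last entry

failing_count = math.prod(failing_shape)
working_count = math.prod(working_shape)

# The failing shape has exactly one element more than the limit.
print(failing_count, failing_count > INT_MAX)
print(working_count, working_count > INT_MAX)
```

The failing shape has 2^31 elements, one more than the limit, while the working shape has 2^30 and stays below it.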

JuanPedroGHM · 2025-01-27T11:03:46Z

Benchmarks results - Sponsored by perun

function	mpi_ranks	device	metric	value	ref_value	std	% change	type	alert	lower_quantile	upper_quantile
lanczos	4	CPU	RUNTIME	0.403408	0.248962	0.000240251	62.0357	jump-detection	True	nan	nan
concatenate	4	CPU	RUNTIME	0.157287	0.195205	0.0200234	-19.4248	jump-detection	True	nan	nan
apply_inplace_standard_scaler_and_inverse	4	CPU	RUNTIME	0.0126308	0.00830153	0.000247896	52.1499	jump-detection	True	nan	nan
apply_inplace_min_max_scaler_and_inverse	4	CPU	RUNTIME	0.00267678	0.0010297	1.43647e-05	159.958	jump-detection	True	nan	nan
apply_inplace_max_abs_scaler_and_inverse	4	CPU	RUNTIME	0.00133836	0.000508517	1.43051e-06	163.189	jump-detection	True	nan	nan
qr_split_0	4	CPU	RUNTIME	0.227343	0.23604	0.00645348	-3.68473	trend-deviation	True	0.231439	0.240179
lanczos	4	CPU	RUNTIME	0.403408	0.246995	0.000240251	63.3261	trend-deviation	True	0.242826	0.255296
kmeans	4	CPU	RUNTIME	0.330444	0.311884	0.00292479	5.95114	trend-deviation	True	0.306719	0.318458
concatenate	4	CPU	RUNTIME	0.157287	0.175194	0.0200234	-10.2217	trend-deviation	True	0.159962	0.200042
apply_inplace_standard_scaler_and_inverse	4	CPU	RUNTIME	0.0126308	0.0074345	0.000247896	69.8939	trend-deviation	True	0.00688219	0.00816596
apply_inplace_min_max_scaler_and_inverse	4	CPU	RUNTIME	0.00267678	0.00106236	1.43647e-05	151.965	trend-deviation	True	0.0010253	0.00117438
apply_inplace_max_abs_scaler_and_inverse	4	CPU	RUNTIME	0.00133836	0.000567622	1.43051e-06	135.784	trend-deviation	True	0.000510837	0.000658398

Grafana Dashboard
Last updated: 2025-02-03T14:44:41Z

mrfh92 · 2025-01-27T15:59:32Z

Could there be the problem that for all communication involving MPI-Operations like MPI.SUM etc. such an operation is not well-defined on the MPI-Vector construction chosen for the buffers?

github-actions · 2025-01-27T16:11:41Z

Thank you for the PR!

JuanPedroGHM · 2025-01-28T09:23:36Z

Could there be the problem that for all communication involving MPI-Operations like MPI.SUM etc. such an operation is not well-defined on the MPI-Vector construction chosen for the buffers?

Have you found a bug? I don't think it should be an issue, as the vector datatype just describes where the data is, where it needs to go, and in what order. As long as both send and recv buffers are well-defined by the datatype, there should not be an issue with MPI operations.

mrfh92 · 2025-01-28T12:09:37Z

The example with Allreduce I posted above caused an error for me.

trick to send large data

ed843a4

github-actions bot added backport stable bug Something isn't working core labels Jan 22, 2025

JuanPedroGHM added benchmark PR PR talk labels Jan 22, 2025

Hoppe and others added 2 commits January 27, 2025 11:46

added tests

738c634

Merge branch 'main' into fix/mpi_int_limit_trick

9234df9

github-actions bot added the testing Implementation of tests, or test-related issues label Jan 27, 2025

Merge branch 'main' into fix/mpi_int_limit_trick

70f6432

Fixes for allreduce

23c5de4

github-actions bot added the backport release label Feb 3, 2025

fixed large counts for allreduce, now trying to fix non-contiguous da…

15eeedb

…ta types

mrfh92 mentioned this pull request Feb 3, 2025

Perform testing with a large-count MPI-implementation #1737

Closed

Custom operations for allreduce

784a850
