Features/31 outer #596
Conversation
Codecov Report
@@ Coverage Diff @@
## master #596 +/- ##
==========================================
+ Coverage 97.41% 97.43% +0.01%
==========================================
Files 75 75
Lines 15030 15155 +125
==========================================
+ Hits 14642 14766 +124
- Misses 388 389 +1
Continue to review full report at Codecov.
heat/core/linalg/basics.py
Outdated
# blocking send and recv
if split == 0:
    b.comm.Send(t_b, dest_rank)
    t_b = torch.empty(lshape_map[actual_origin], dtype=t_outer_dtype, device=t_a.device)
Can we initialize this buffer once outside the loop and then reuse it? That would save some initialization time (and we could then slice it via the chunks below).
That's a good point. I'm now initializing buffers for the "traveling" data, t_b_run and t_a_run, and slicing them to the correct size afterwards.
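For reference, a minimal sketch of the buffer-reuse pattern discussed here, using plain torch with made-up local shapes (t_b_run is borrowed from the discussion; lshape_map values and the loop body are illustrative, not the actual heat code):

import torch

lshape_map = [(3,), (2,), (2,)]          # hypothetical local chunk shapes per rank
max_len = max(s[0] for s in lshape_map)  # largest chunk that will ever arrive

# allocate the "traveling" buffer once, outside the loop
t_b_run = torch.empty(max_len, dtype=torch.float64)

for origin, lshape in enumerate(lshape_map):
    # ...receive the visiting chunk into the front of the reusable buffer...
    t_b = t_b_run[: lshape[0]]  # correctly sized view, no new allocation
    # ...use t_b in the local computation...

Slicing returns a view of the pre-allocated buffer, so no per-iteration allocation is needed.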
heat/core/linalg/basics.py
Outdated
    t_outer_slice[1] = remote_slice[0]
elif split == 1:
    a.comm.Send(t_a, dest_rank)
    t_a = torch.empty(lshape_map[actual_origin], dtype=t_outer_dtype, device=t_a.device)
See the initialization comment above.
Overall very solid already. Please see the inline comments.
Misclick beforehand...
too bad...
Description
Introducing the outer product of two DNDarrays, as per numpy.outer(), i.e.:

ht.outer(a, b)

a and b must be 1-dimensional, or they will be flattened by default. In linear algebra, if a is a vector of n elements and b is a vector of m elements, the outer product outer(a, b) is a matrix of size (n, m) with elements out[i, j] = a[i] * b[j].

Implementation
Locally, the outer product is calculated via PyTorch's Einstein summation, as in:
torch.einsum("i,j->ij", a, b)
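As a quick sanity check (plain torch, independent of heat), the Einstein summation above is exactly the dense outer product:

import torch

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([4.0, 5.0])

res = torch.einsum("i,j->ij", a, b)   # shape (3, 2), res[i, j] == a[i] * b[j]
assert torch.equal(res, torch.outer(a, b))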
The memory-distributed implementation works under the assumption that the vectors / DNDarrays are dense (the sparse case will be addressed at a later point, see Implement sparse matrices #384). Procedure:
- a and b are distributed among the ranks.
- One of the two operands is sent from rank to rank until all a_i * b_j multiplications have taken place; a minimal sketch follows after this section.
- Each block of the outer product (calculated locally via torch.einsum) is written into the appropriate slice of out = ht.outer(a, b) and never needs to be communicated among ranks.
- out is by definition split along 0 if b is sent around and a stays put. Conversely, it is split along 1 if a is sent around and b stays put. As a consequence, the new kwarg split allows users to choose the split direction of the output matrix: split decides which DNDarray stays put and which one goes round the ranks. If split is None in the distributed case, ht.outer(a, b).split will be the same as a.split and b.split, i.e. 0.

Note that the distributed case requires that at least one among a.split, b.split and out.split be not None. Otherwise we fall back to the local computation.

Issue/s resolved: #31
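For illustration, here is a minimal, self-contained sketch of the ring exchange described above, written with mpi4py and torch rather than with heat's internals; names like t_b_run are borrowed from the discussion, the equal chunk sizes are an assumption, and this is not the actual implementation in basics.py. Run with e.g. mpirun -n 4 python outer_ring_sketch.py:

from mpi4py import MPI
import torch

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
chunk = 2  # local chunk length per rank (equal chunks, for simplicity)

# local slice of a (stays put) and of b (travels around the ring)
t_a = torch.arange(chunk, dtype=torch.float64) + chunk * rank
t_b = torch.arange(chunk, dtype=torch.float64) + chunk * rank

# each rank owns `chunk` rows of the (n, m) result, i.e. out is split along 0
out = torch.zeros(chunk, chunk * size, dtype=torch.float64)

t_b_run = t_b.clone()          # reusable buffer for the visiting chunk
origin = rank                  # rank the visiting chunk originally came from
dest, src = (rank + 1) % size, (rank - 1) % size

for _ in range(size):
    # local block: resident rows of a times the currently visiting chunk of b
    out[:, chunk * origin: chunk * (origin + 1)] = torch.einsum("i,j->ij", t_a, t_b_run)

    # pass the visiting chunk on; Sendrecv avoids the deadlock risk of two paired blocking Sends
    recv_buf = torch.empty_like(t_b_run)
    comm.Sendrecv(t_b_run.numpy(), dest=dest, recvbuf=recv_buf.numpy(), source=src)
    t_b_run = recv_buf
    origin = (origin - 1) % size

After size iterations every rank has filled all of its rows of the result, and no part of out is ever communicated.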
Changes proposed:
- Kwarg split to define the split dimension of the output matrix, e.g.: out = ht.outer(a, b, split=1)
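A hedged usage sketch (assuming heat with this branch installed; the array sizes and contents are arbitrary):

import heat as ht

a = ht.arange(4, split=0)        # 1-D DNDarray with n = 4, distributed along 0
b = ht.arange(3, split=0)        # 1-D DNDarray with m = 3
out = ht.outer(a, b, split=1)    # (4, 3) result, split along dimension 1
print(out.split)                 # 1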
Type of change
Due Diligence
Does this change modify the behaviour of other functions? If so, which?
no