Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Features/31 outer #596

Merged
merged 33 commits into from
Jun 24, 2020
Merged

Features/31 outer #596

merged 33 commits into from
Jun 24, 2020

Conversation

ClaudiaComito
Copy link
Contributor

@ClaudiaComito ClaudiaComito commented Jun 15, 2020

Description

Introducing the outer product of two DNDarrays as per numpy.outer(), i.e.:

ht.outer(a, b)

a and b must be 1-dimensional, or will be flattened by default.
In linear algebra, if a is a vector of n elements, and b is a vector of m elements, the outer product outer(a,b) is a matrix of size (n, m) and elements:

$\begin{pmatrix}  a_0\cdot b_0  & a_0\cdot b_1 & . & . &  a_0\cdot b_m \  a_1\cdot b_0 & a_1\cdot b_1 & . & . & a_1\cdot b_m \  . & . & . & . & .   \   a_n\cdot b_0 & a_n\cdot b_1 & . & . & a_n\cdot b_m \end{pmatrix} $

Implementation

  • Locally, the outer product is calculated via PyTorch's Einstein summation, as in:
    torch.einsum("i,j->ij", a, b)

  • The memory-distributed implementation works under the assumption that the vectors / DNDarrays are dense (implementation for sparse case will be addressed at a later point (Implement sparse matrices  #384). Procedure:

    • a and b are distributed
    • one of the DNDarrays always stays local, the other is passed around the ranks in ring communication until all a_i * b_j multiplications have taken place.
    • outer product (torch.einsum) is calculated locally, written into the appropriate slice of out = ht.outer(a,b), and never needs to be communicated among ranks
    • out is by definition split along 0 if b is sent around and a stays put. Conversely, it is split along 1 if a is sent around and b stays put. As a consequence...
    • ... split semantics: I have introduced a kwarg split to allow users to choose the split direction of the output matrix. split decides which DNDarray stays and which one goes round the ranks. If split is None in the distributed case, ht.outer(a,b).split will be the same as a.split and b.split, i.e. 0.

Note that the distributed case requires that at least one among a.split, b.split and out.split be not None. Otherwise we fall back to local.

Issue/s resolved: #31

Changes proposed:

  • with respect to np.outer, ht.outer has an extra kwarg split to define the split dimension of the output matrix, e.g.: out = ht.outer(a, b, split=1)

Type of change

  • New feature (non-breaking change which adds functionality)

Due Diligence

  • All split configurations tested
  • Multiple dtypes tested in relevant functions
  • Documentation updated (if needed)
  • Updated changelog.md under the title "Pending Additions"

Does this change modify the behaviour of other functions? If so, which?

no

@codecov
Copy link

codecov bot commented Jun 15, 2020

Codecov Report

Merging #596 into master will increase coverage by 0.01%.
The diff coverage is 99.27%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #596      +/-   ##
==========================================
+ Coverage   97.41%   97.43%   +0.01%     
==========================================
  Files          75       75              
  Lines       15030    15155     +125     
==========================================
+ Hits        14642    14766     +124     
- Misses        388      389       +1     
Impacted Files Coverage Δ
heat/core/linalg/basics.py 94.46% <98.78%> (+0.78%) ⬆️
heat/core/linalg/tests/test_basics.py 100.00% <100.00%> (ø)
heat/core/manipulations.py 99.51% <0.00%> (-0.01%) ⬇️
heat/core/tests/test_manipulations.py 99.89% <0.00%> (-0.01%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4e3cb00...93a54a5. Read the comment docs.

# blocking send and recv
if split == 0:
b.comm.Send(t_b, dest_rank)
t_b = torch.empty(lshape_map[actual_origin], dtype=t_outer_dtype, device=t_a.device)
Copy link
Member

@Markus-Goetz Markus-Goetz Jun 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we initialize this buffer once outside the loop and then reuse it? Would save up some initialization time (and then slice it via the chunks below)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I'm initializing a buffer for the "traveling" data now, t_b_run and t_a_run, and slicing them to the correct size afterwards.

t_outer_slice[1] = remote_slice[0]
elif split == 1:
a.comm.Send(t_a, dest_rank)
t_a = torch.empty(lshape_map[actual_origin], dtype=t_outer_dtype, device=t_a.device)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see intialization above

Copy link
Member

@Markus-Goetz Markus-Goetz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall very solid already. Please see the inline comments.

Copy link
Member

@Markus-Goetz Markus-Goetz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misclick beforehand...

@ClaudiaComito
Copy link
Contributor Author

Misclick beforehand...

too bad...

Markus-Goetz
Markus-Goetz previously approved these changes Jun 24, 2020
@Markus-Goetz Markus-Goetz merged commit d045143 into master Jun 24, 2020
@Markus-Goetz Markus-Goetz deleted the features/31-outer branch June 24, 2020 09:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement outer()
2 participants