Features/31 outer #596
Conversation
Codecov Report
@@ Coverage Diff @@
## master #596 +/- ##
==========================================
+ Coverage 97.41% 97.43% +0.01%
==========================================
Files 75 75
Lines 15030 15155 +125
==========================================
+ Hits 14642 14766 +124
- Misses 388 389 +1
Continue to review full report at Codecov.
heat/core/linalg/basics.py
Outdated
# blocking send and recv
if split == 0:
    b.comm.Send(t_b, dest_rank)
    t_b = torch.empty(lshape_map[actual_origin], dtype=t_outer_dtype, device=t_a.device)
Can we initialize this buffer once outside the loop and then reuse it? That would save some initialization time (and we could then slice it via the chunks below).
That's a good point. I'm now initializing buffers for the "traveling" data, t_b_run and t_a_run, and slicing them to the correct size afterwards.
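For reference, a minimal sketch of the buffer-reuse pattern discussed here, using plain torch with made-up local shapes (t_b_run is borrowed from the discussion; lshape_map values and the loop body are illustrative, not the actual heat code):

import torch

lshape_map = [(3,), (2,), (2,)]          # hypothetical local chunk shapes per rank
max_len = max(s[0] for s in lshape_map)  # largest chunk that will ever arrive

# allocate the "traveling" buffer once, outside the loop
t_b_run = torch.empty(max_len, dtype=torch.float64)

for origin, lshape in enumerate(lshape_map):
    # ...receive the visiting chunk into the front of the reusable buffer...
    t_b = t_b_run[: lshape[0]]  # correctly sized view, no new allocation
    # ...use t_b in the local computation...

Slicing returns a view of the pre-allocated buffer, so no per-iteration allocation is needed.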
heat/core/linalg/basics.py
Outdated
    t_outer_slice[1] = remote_slice[0]
elif split == 1:
    a.comm.Send(t_a, dest_rank)
    t_a = torch.empty(lshape_map[actual_origin], dtype=t_outer_dtype, device=t_a.device)
See the initialization comment above.
Overall very solid already. Please see the inline comments.
Misclick beforehand...
too bad...
Description
Introducing the outer product of two DNDarrays, as per numpy.outer(), i.e.:

ht.outer(a, b)

a and b must be 1-dimensional, or they will be flattened by default. In linear algebra, if a is a vector of n elements and b is a vector of m elements, the outer product outer(a, b) is a matrix of size (n, m) with elements out[i, j] = a[i] * b[j].

Implementation
Locally, the outer product is calculated via PyTorch's Einstein summation, as in:
torch.einsum("i,j->ij", a, b)
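As a quick sanity check (plain torch, independent of heat), the Einstein summation above is exactly the dense outer product:

import torch

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([4.0, 5.0])

res = torch.einsum("i,j->ij", a, b)   # shape (3, 2), res[i, j] == a[i] * b[j]
assert torch.equal(res, torch.outer(a, b))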
The memory-distributed implementation works under the assumption that the vectors / DNDarrays are dense (the sparse case will be addressed at a later point, see Implement sparse matrices #384). Procedure:
- a and b are distributed among the ranks.
- One of the two operands is sent from rank to rank until all a_i * b_j multiplications have taken place; a minimal sketch follows after this section.
- Each block of the outer product (calculated locally via torch.einsum) is written into the appropriate slice of out = ht.outer(a, b) and never needs to be communicated among ranks.
- out is by definition split along 0 if b is sent around and a stays put. Conversely, it is split along 1 if a is sent around and b stays put. As a consequence, the new kwarg split allows users to choose the split direction of the output matrix: split decides which DNDarray stays put and which one goes round the ranks. If split is None in the distributed case, ht.outer(a, b).split will be the same as a.split and b.split, i.e. 0.

Note that the distributed case requires that at least one among a.split, b.split and out.split be not None. Otherwise we fall back to the local computation.

Issue/s resolved: #31
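For illustration, here is a minimal, self-contained sketch of the ring exchange described above, written with mpi4py and torch rather than with heat's internals; names like t_b_run are borrowed from the discussion, the equal chunk sizes are an assumption, and this is not the actual implementation in basics.py. Run with e.g. mpirun -n 4 python outer_ring_sketch.py:

from mpi4py import MPI
import torch

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
chunk = 2  # local chunk length per rank (equal chunks, for simplicity)

# local slice of a (stays put) and of b (travels around the ring)
t_a = torch.arange(chunk, dtype=torch.float64) + chunk * rank
t_b = torch.arange(chunk, dtype=torch.float64) + chunk * rank

# each rank owns `chunk` rows of the (n, m) result, i.e. out is split along 0
out = torch.zeros(chunk, chunk * size, dtype=torch.float64)

t_b_run = t_b.clone()          # reusable buffer for the visiting chunk
origin = rank                  # rank the visiting chunk originally came from
dest, src = (rank + 1) % size, (rank - 1) % size

for _ in range(size):
    # local block: resident rows of a times the currently visiting chunk of b
    out[:, chunk * origin: chunk * (origin + 1)] = torch.einsum("i,j->ij", t_a, t_b_run)

    # pass the visiting chunk on; Sendrecv avoids the deadlock risk of two paired blocking Sends
    recv_buf = torch.empty_like(t_b_run)
    comm.Sendrecv(t_b_run.numpy(), dest=dest, recvbuf=recv_buf.numpy(), source=src)
    t_b_run = recv_buf
    origin = (origin - 1) % size

After size iterations every rank has filled all of its rows of the result, and no part of out is ever communicated.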
Changes proposed:
- Kwarg split to define the split dimension of the output matrix, e.g.: out = ht.outer(a, b, split=1)
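A hedged usage sketch (assuming heat with this branch installed; the array sizes and contents are arbitrary):

import heat as ht

a = ht.arange(4, split=0)        # 1-D DNDarray with n = 4, distributed along 0
b = ht.arange(3, split=0)        # 1-D DNDarray with m = 3
out = ht.outer(a, b, split=1)    # (4, 3) result, split along dimension 1
print(out.split)                 # 1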
Type of change
Due Diligence
Does this change modify the behaviour of other functions? If so, which?
no