
Batched matrix multiplication. #1261

Conversation

@FOsterfeld (Member) commented Nov 8, 2023

split dimension is a batch dimension

Due Diligence

  • General:
    • base branch must be main for new features, latest release branch (e.g. release/1.3.x) for bug fixes
    • title of the PR is suitable to appear in the Release Notes
  • Implementation:
    • unit tests: all split configurations tested
    • unit tests: multiple dtypes tested
    • documentation updated where needed

Description

Issue/s resolved: #890

Changes proposed:

Type of change

Memory requirements

Performance

Does this change modify the behaviour of other functions? If so, which?

yes / no
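For context on the intended semantics: in batched matrix multiplication, all leading dimensions are treated as batch dimensions and the last two as matrix (LA) dimensions. A minimal NumPy sketch of the non-distributed behaviour this PR mirrors (illustrative only, not Heat code):

```python
import numpy as np

# One batch dimension of size 4: a stack of 4 matrix pairs.
a = np.random.rand(4, 3, 5)  # 4 matrices of shape (3, 5)
b = np.random.rand(4, 5, 2)  # 4 matrices of shape (5, 2)

# matmul multiplies matrix-wise over the leading batch dimension.
c = np.matmul(a, b)
print(c.shape)  # (4, 3, 2)

# Equivalent explicit loop over the batch dimension.
c_loop = np.stack([a[i] @ b[i] for i in range(4)])
assert np.allclose(c, c_loop)
```

Heat's distributed version additionally has to decide whether the `split` axis of each operand falls into the batch or the LA dimensions; much of the conversation below revolves around that distinction.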

split dimension is a batch dimension
@mrfh92 mrfh92 marked this pull request as draft November 8, 2023 13:37

github-actions bot (Contributor) commented Nov 8, 2023

Thank you for the PR!

codecov bot commented Nov 8, 2023

Codecov Report

Attention: Patch coverage is 97.36842% with 8 lines in your changes missing coverage. Please review.

Project coverage is 92.13%. Comparing base (bac864d) to head (fc97280).
Report is 1 commit behind head on main.

Files with missing lines Patch % Lines
heat/core/linalg/basics.py 97.36% 8 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1261      +/-   ##
==========================================
+ Coverage   92.07%   92.13%   +0.05%     
==========================================
  Files          83       83              
  Lines       12196    12163      -33     
==========================================
- Hits        11230    11206      -24     
+ Misses        966      957       -9     
Flag Coverage Δ
unit 92.13% <97.36%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.


@@ -487,9 +487,12 @@ def matmul(a: DNDarray, b: DNDarray, allow_resplit: bool = False) -> DNDarray:
sanitation.sanitize_in(a)
sanitation.sanitize_in(b)

if a.gshape[-1] != b.gshape[0]:
batch_dim = max(a.ndim, b.ndim) - 2
batched = batch_dim > 2
@mrfh92 (Collaborator) commented Nov 10, 2023

Maybe > 0 instead of > 2, or did I misunderstand something?

@FOsterfeld (Member, Author) replied

Thanks, you're right, it should be 0.
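To spell out the fix discussed in this thread (a hypothetical standalone sketch, not the actual Heat source): for 3-D inputs there is exactly one batch dimension, so `batch_dim` is 1 and the batched path must be taken whenever `batch_dim > 0`; the original `> 2` check would only trigger for 5-D and higher inputs.

```python
def is_batched(a_ndim: int, b_ndim: int) -> bool:
    # Number of leading (batch) dimensions beyond the two matrix dimensions.
    batch_dim = max(a_ndim, b_ndim) - 2
    # The review fix: comparing with "> 2" would only detect 5-D and higher inputs.
    return batch_dim > 0

print(is_batched(2, 2))  # False: plain 2-D matmul, no batch dimensions
print(is_batched(3, 3))  # True: one batch dimension
print(is_batched(4, 2))  # True: a 2-D operand broadcast over batches
```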

@mrfh92 (Collaborator) commented Nov 13, 2023

Looks good so far 👍

Suggestions for further work:

  • If you click on Details next to "codecov" (red X) at the top and then on "View this Pull Request on Codecov", you can see which lines of your changes are not yet covered by tests. You can add the corresponding tests in heat/core/linalg/tests/test_basics.py.
    With that, batched matrix multiplication would be covered for the case where the LA dimensions are not split. This essentially corresponds to adding batch dimensions to the "2x split=None" case of the original code.
  • Next, we could tackle adding batch dimensions to the other 8 (?) existing cases; there, however, the batch dimension would not be split, but an LA dimension possibly would be. With some luck this can be built fairly easily from the existing code, since PyTorch already handles batches well.
  • The case where one operand is split along an LA dimension and the other along a batch dimension I would rule out, raising a corresponding error message.
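The last suggestion above (rejecting mixed splits) could look roughly like the following sketch; the function name and structure are invented for illustration and do not correspond to Heat's actual implementation:

```python
def check_split_compatibility(a_split, b_split, batch_dim):
    """Reject the case where one operand is split along a batch dimension
    and the other along a linear-algebra (matrix) dimension."""
    def kind(split):
        if split is None:
            return "none"
        return "batch" if split < batch_dim else "la"

    kinds = {kind(a_split), kind(b_split)}
    if {"batch", "la"} <= kinds:
        raise NotImplementedError(
            "mixing a batch-dimension split with an LA-dimension split "
            "is not supported in batched matmul"
        )

check_split_compatibility(0, None, batch_dim=1)  # ok: batch split + no split
try:
    check_split_compatibility(0, 2, batch_dim=1)  # batch split vs LA split
except NotImplementedError as e:
    print("rejected:", e)
```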

@LScheib LScheib assigned LScheib and FOsterfeld and unassigned LScheib Nov 13, 2023


@FOsterfeld FOsterfeld requested review from mtar and mrfh92 August 17, 2024 22:39

@mrfh92 (Collaborator) commented Aug 20, 2024

@FOsterfeld Something seems to have changed that lets the tests in QR fail. Can you reproduce this error on the workstation?


@mrfh92 mrfh92 added the PR talk label Sep 1, 2024
@ClaudiaComito ClaudiaComito added this to the 1.5.0 milestone Sep 2, 2024

@JuanPedroGHM (Member) commented
Benchmarks results - Sponsored by perun

function mpi_ranks device metric value ref_value std % change type alert lower_quantile upper_quantile
matmul_split_0 4 CPU RUNTIME 0.142147 0.186948 0.0350541 -23.9647 jump-detection True nan nan
matmul_split_1 4 CPU RUNTIME 0.111954 0.134253 0.0147728 -16.6099 jump-detection True nan nan
concatenate 4 CPU RUNTIME 0.193592 0.161892 0.0454197 19.581 jump-detection True nan nan
matmul_split_0 4 GPU RUNTIME 0.0429973 0.058403 0.034678 -26.3782 jump-detection True nan nan
kmedoids 4 CPU RUNTIME 0.688199 1.23864 0.00419769 -44.4392 trend-deviation True 0.697841 1.99332
apply_inplace_max_abs_scaler_and_inverse 4 CPU RUNTIME 0.000499201 0.00151836 1.56914e-05 -67.1223 trend-deviation True 0.000501199 0.00351122
apply_inplace_robust_scaler_and_inverse 4 CPU RUNTIME 2.38697 5.38711 0.0112881 -55.6911 trend-deviation True 2.40194 9.88492

Grafana Dashboard
Last updated: 2024-09-02T12:38:56Z

@mrfh92 mrfh92 removed the PR talk label Sep 2, 2024
@JuanPedroGHM (Member) left a review comment

👍

@mrfh92 (Collaborator) commented Sep 4, 2024

@mtar have your suggestions been addressed by the changes?

@ClaudiaComito ClaudiaComito added the enhancement New feature or request label Sep 5, 2024

@JuanPedroGHM (Member) commented
Benchmarks results - Sponsored by perun

function mpi_ranks device metric value ref_value std % change type alert lower_quantile upper_quantile
matmul_split_1 4 CPU RUNTIME 0.122906 0.109381 0.0313848 12.3656 jump-detection True nan nan
concatenate 4 CPU RUNTIME 0.167615 0.196439 0.0260386 -14.6728 jump-detection True nan nan
apply_inplace_normalizer 4 CPU RUNTIME 0.00100724 0.00947982 8.56803e-05 -89.375 jump-detection True nan nan
matmul_split_0 4 GPU RUNTIME 0.0483279 0.0545347 0.0310459 -11.3814 jump-detection True nan nan
matmul_split_1 4 GPU RUNTIME 0.0367425 0.0272262 0.0292762 34.9526 jump-detection True nan nan
qr_split_1 4 CPU RUNTIME 0.179386 0.451767 0.00646491 -60.2924 trend-deviation True 0.179536 0.796189
kmeans 4 CPU RUNTIME 0.309642 0.909418 0.00239133 -65.9517 trend-deviation True 0.312577 1.79242
reshape 4 CPU RUNTIME 0.155737 0.365847 0.00110721 -57.431 trend-deviation True 0.156546 0.731662
lanczos 4 GPU RUNTIME 0.600791 0.586755 0.00316215 2.39205 trend-deviation True 0.579288 0.597313
kmeans 4 GPU RUNTIME 0.657029 0.632586 0.004188 3.86392 trend-deviation True 0.617689 0.64691
kmedians 4 GPU RUNTIME 1.11586 1.01658 0.0125118 9.76642 trend-deviation True 0.994589 1.04602
kmedoids 4 GPU RUNTIME 1.2525 1.14701 0.00847031 9.19655 trend-deviation True 1.12248 1.18111
apply_inplace_robust_scaler_and_inverse 4 GPU RUNTIME 5.99905 5.60869 0.0446267 6.95995 trend-deviation True 5.47179 5.75939
apply_inplace_normalizer 4 GPU RUNTIME 0.00179942 0.00169872 4.94232e-05 5.92757 trend-deviation True 0.00166147 0.00175474

Grafana Dashboard
Last updated: 2024-09-06T06:12:05Z

@ClaudiaComito ClaudiaComito merged commit 914e5c3 into main Sep 9, 2024
44 checks passed
@mtar mtar deleted the features/1104-Implement_consistent_linear_algebra_for_arrays_with_dimension_2_in_particular_matmul branch October 21, 2024 13:33

Successfully merging this pull request may close these issues.

matmul on multidimensional arrays
6 participants