[SPARSE] Add support for sparse TRSV with MKLCPU backend by Rbiessy · Pull Request #407 · uxlfoundation/oneMath

Rbiessy · 2023-11-10T14:38:23Z

Description

Add support for sparse TRSV using MKLCPU.
Relies on #403
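
For readers unfamiliar with the operation: sparse TRSV solves a triangular system A * y = x where A is stored in a sparse format. The sketch below is a plain CPU reference for a non-transposed matrix in 0-based CSR. The function name `csr_trsv` and its signature are illustrative assumptions; this is neither the oneMath API nor the code added in this PR.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical reference solver (not the oneMath API): solves A * y = x for a
// triangular n x n matrix A stored in 0-based CSR. Entries in the opposite
// triangle are ignored. With unit_diag, the diagonal is taken to be 1 and any
// stored diagonal values are not read.
std::vector<double> csr_trsv(bool lower, bool unit_diag,
                             const std::vector<int>& row_ptr,
                             const std::vector<int>& col_ind,
                             const std::vector<double>& val,
                             const std::vector<double>& x) {
    const int n = static_cast<int>(x.size());
    std::vector<double> y(n, 0.0);
    for (int step = 0; step < n; ++step) {
        // Forward substitution for lower, backward substitution for upper.
        const int row = lower ? step : n - 1 - step;
        double sum = x[row];
        double diag = 1.0;
        bool found_diag = unit_diag;
        for (int k = row_ptr[row]; k < row_ptr[row + 1]; ++k) {
            const int col = col_ind[k];
            if (col == row) {
                if (!unit_diag) {
                    diag = val[k];
                    found_diag = true;
                }
            }
            else if (lower ? (col < row) : (col > row)) {
                sum -= val[k] * y[col]; // y[col] was computed in an earlier step
            }
        }
        if (!found_diag) {
            throw std::invalid_argument("non-unit triangular solve needs a diagonal entry");
        }
        y[row] = sum / diag;
    }
    return y;
}
```

In this sketch a stored diagonal value is required only in the non-unit case.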

All Submissions

Do all unit tests pass locally? Attach a log.
Have you formatted the code using clang-format?

New features

Have you provided motivation for adding a new feature?
Have you added relevant tests?

Rbiessy · 2023-11-10T14:39:54Z

Tests using a recent MKL and DPCPP build: sparse_trsv_log.txt

src/sparse_blas/backends/mkl_common/mkl_operations.cxx

gajanan-choudhary

TRSV parts look fine to me. Depending on how the optimize_gemm discussions go in #374, this PR may or may not need further changes.

Co-authored-by: Gajanan Choudhary <gajanan@utexas.edu>

tests/unit_tests/sparse_blas/source/sparse_trsv_buffer.cpp

tests/unit_tests/sparse_blas/include/sparse_reference.hpp

tests/unit_tests/sparse_blas/source/sparse_trsv_buffer.cpp

tests/unit_tests/sparse_blas/source/sparse_trsv_usm.cpp

gajanan-choudhary

All files other than the unit tests are fine. I have a few suggestions for the tests.

tests/unit_tests/sparse_blas/source/sparse_trsv_buffer.cpp

tests/unit_tests/sparse_blas/source/sparse_trsv_usm.cpp

Rbiessy · 2023-11-21T14:47:25Z

Note for the reviewers: I have made a change that also affects the other sparse_blas tests, introducing EXPECT_TRUE_OR_FUTURE_SKIP. I was concerned about the number of configurations that were never actually run because GTEST_SKIP interrupted a test at the first unsupported configuration.
Log running all the tests on CPU using the 2024.0 package: log_sparse_blas-cpu.txt
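
The idea behind deferring the skip can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual EXPECT_TRUE_OR_FUTURE_SKIP macro: `unimplemented_error`, `config_status`, `run_config`, and `run_all` are hypothetical names, and the exception type is a local stand-in for whatever the backend throws. Each configuration is run to completion, an unsupported one is recorded as skipped, and the loop carries on instead of stopping at the first skip.

```cpp
#include <functional>
#include <stdexcept>
#include <vector>

// Local stand-in for a backend's "not supported" exception (hypothetical type).
struct unimplemented_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

enum class config_status { passed, failed, skipped };

// Run one configuration. An unimplemented_error marks it as skipped, not failed.
config_status run_config(const std::function<bool()>& config) {
    try {
        return config() ? config_status::passed : config_status::failed;
    }
    catch (const unimplemented_error&) {
        return config_status::skipped;
    }
}

struct run_result {
    bool any_failed = false;
    bool any_skipped = false;
};

// Run every configuration, remembering skips and failures instead of
// returning at the first unsupported configuration.
run_result run_all(const std::vector<std::function<bool()>>& configs) {
    run_result result;
    for (const auto& config : configs) {
        switch (run_config(config)) {
            case config_status::failed: result.any_failed = true; break;
            case config_status::skipped: result.any_skipped = true; break;
            case config_status::passed: break;
        }
    }
    return result;
}
```

A test body could then report the skip once, after all configurations have run.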

gajanan-choudhary · 2023-11-22T07:24:56Z

> Note for the reviewers, I have made a change that affects the other sparse_blas tests to introduce EXPECT_TRUE_OR_FUTURE_SKIP. I was concerned about the number of tests that weren't properly run due to GTEST_SKIP that would interrupt the test on the first configuration that is not supported. Log running all the tests on CPU using the 2024.0 package: log_sparse_blas-cpu.txt

@Rbiessy, it looks from your log_sparse_blas-cpu.txt file that the only tests that ran were GEMV. All GEMM and TRSV tests appear to have been skipped. Shouldn't all of them be running and passing?

gajanan-choudhary

LGTM except for the EXPECT_TRUE_OR_FUTURE_SKIP related changes which make the test logs misleading because:

    bool skip = test_helper_transpose<fpType>(GetParam());
    if (skip) {
        // Mark that some tests were skipped
        GTEST_SKIP();
    }

If there are 10 input combinations that are run and 9 pass, 1 is skipped, then the entire test is marked as skipped. There's a way to also pass a string to GTEST_SKIP(). We should use that to clearly indicate how many tests passed, how many tests failed, and how many were skipped.

src/sparse_blas/backends/mkl_common/mkl_operations.cxx

gajanan-choudhary · 2023-11-22T08:04:05Z

> Note for the reviewers, I have made a change that affects the other sparse_blas tests to introduce EXPECT_TRUE_OR_FUTURE_SKIP. I was concerned about the number of tests that weren't properly run due to GTEST_SKIP that would interrupt the test on the first configuration that is not supported. Log running all the tests on CPU using the 2024.0 package: log_sparse_blas-cpu.txt

> @Rbiessy, it looks from your log_sparse_blas-cpu.txt file that the only tests that ran were GEMV. All GEMM and TRSV tests appear to have been skipped. Shouldn't all of them be running and passing?

I see what happened there. Multiple configurations are run, but some may throw an acceptable exception such as unimplemented and are therefore skipped; we then mark the entire set in the test as "Skipped" with a GTEST_SKIP() call at the end. This is misleading, a case in point being my own comment above and my confusion before reviewing the changes. Can we come up with a better long-term solution for this? Each gtest run could report how many subtests were run, how many passed, how many were skipped, and how many failed. Maybe three counters could be carried along and summarized at the end?

For that matter, I don't think throwing an acceptable exception such as unimplemented should be considered a skipped test; we should mark it as passed. Any opinions here, @spencerpatty?

gajanan-choudhary · 2023-11-22T08:09:59Z

> For that matter, I don't think throwing an acceptable exception such as unimplemented should be considered a skipped test; we should mark it as passed. Any opinions here, @spencerpatty?

On second thought, I take that back. I myself don't know what the appropriate thing to do here is. It is a skipped test, but if we mark that set as skipped, then it's misleading. If we mark it as passed, then it may appear to some as if no exception is being thrown and the input combination is supported. Maybe counting passed, skipped, and failed tests is the way to go? Is there a better way?

Rbiessy · 2023-11-22T16:22:51Z

@gajanan-choudhary regarding the issue that you mention, I agree that this is not ideal but this is the current way the tests are done for other domains as well.
The proper solution would be to split the tests so that each test runs only 1 configuration. I considered improving this but I didn't want to spend more time than needed.
I was able to confirm that none of the tests are skipped with a recent MKL build.

gajanan-choudhary · 2023-11-28T09:39:06Z

> I was able to confirm that none of the tests are skipped with a recent MKL build.

If none of the tests are skipped, then why do the log files such as log_sparse_blas-cpu.txt you posted earlier have "Skipped" written in them?

I understand that it is not that easy to have each test run just 1 configuration; that's fine, but can we at least have the logs print how many configurations were run and how many passed/failed/skipped? There's an option to pass a custom message to GTEST_SKIP(), for instance. We could cascade the counts of test statuses up to the final return point.
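
One possible shape for that counting, as a minimal sketch: the `config_tally` struct and its `summary()` text are hypothetical, not what was eventually committed. The only GoogleTest behaviour relied on is that GTEST_SKIP() accepts a streamed message, which is shown in a comment so the block stays free of a GoogleTest dependency.

```cpp
#include <sstream>
#include <string>

// Hypothetical tally of per-configuration outcomes, filled in by the test
// helpers and read once at the end of the test body.
struct config_tally {
    int passed = 0;
    int failed = 0;
    int skipped = 0;

    int total() const {
        return passed + failed + skipped;
    }

    // One-line summary suitable for a skip message or a log line.
    std::string summary() const {
        std::ostringstream os;
        os << total() << " configurations run: " << passed << " passed, "
           << failed << " failed, " << skipped << " skipped";
        return os.str();
    }
};

// In a GoogleTest test body the summary could be attached to the skip:
//     if (tally.skipped > 0) {
//         GTEST_SKIP() << tally.summary();
//     }
```

With 9 passing configurations and 1 skipped, the message would read "10 configurations run: 9 passed, 0 failed, 1 skipped", so a "Skipped" status in the log no longer hides what actually ran.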

tests/unit_tests/include/test_helper.hpp

Rbiessy · 2023-11-28T13:22:58Z

@gajanan-choudhary, the log I attached was produced with oneAPI 2024.0. That release skips configurations for gemm when transpose_B != nontrans and for trsv when the matrix is transposed, because the backend throws unsupported exceptions. If I use a more recent nightly build of MKL, none of the tests are skipped on CPU.
I've applied your suggestion but it's not working well with ctest. I don't know why the skipped messages are not printed even with the extra-verbose flag. The skipped messages are printed if we run a binary such as ./bin/test_main_sparse_blas_ct directly and use the --terse_output flag.
Here are the logs: one using ctest (log_cpu_2024_0.txt) and another using terse_output (log_cpu_2024_0_terse_output.txt).
I was not able to run this last change on a nightly build anymore due to a permissions issue. I don't think this is blocking this PR.

gajanan-choudhary

LGTM!

I do have a final minor suggestion: add a comment above the GTEST_SKIP() call indicating that the printing may not work well with ctest, in which case running the binary with the --terse-output flag should print things. Just so that this info doesn't get lost in history if someone tries digging in!

Thanks for going above and beyond in seeing through my change requests on this PR. I really appreciate your effort in making the product better! Great work on the PR!

tests/unit_tests/sparse_blas/source/sparse_gemm_buffer.cpp

gajanan-choudhary changed the title ~~[SPARSE] Add support for sparse trsv with MKLCPU backend~~ [SPARSE] Add support for sparse TRSV with MKLCPU backend Nov 13, 2023

gajanan-choudhary assigned Rbiessy Nov 13, 2023

gajanan-choudhary reviewed Nov 13, 2023

src/sparse_blas/backends/mkl_common/mkl_operations.cxx (outdated)

gajanan-choudhary reviewed Nov 13, 2023

Rbiessy and others added 5 commits November 15, 2023 10:25

[SPARSE] Add support for sparse trsv with MKLCPU backend

fea1f7c

Update comment

a660bef

Co-authored-by: Gajanan Choudhary <gajanan@utexas.edu>

Add tests with and without optimize_trsv

4f1ceb5

Remove TODO

2476c46

clang-format

cef21ce

Rbiessy mentioned this pull request Nov 17, 2023

[SPARSE] Add support for sparse MKLGPU backend #410

Merged

4 tasks

hjabird reviewed Nov 17, 2023

Rework includes

9f420d9

gajanan-choudhary reviewed Nov 20, 2023

tests/unit_tests/sparse_blas/source/sparse_trsv_buffer.cpp (outdated)

tests/unit_tests/sparse_blas/source/sparse_trsv_buffer.cpp (outdated)

tests/unit_tests/sparse_blas/source/sparse_trsv_usm.cpp (outdated)

gajanan-choudhary reviewed Nov 20, 2023

tests/unit_tests/sparse_blas/source/sparse_trsv_usm.cpp (outdated)

Rbiessy added 7 commits November 20, 2023 13:43

Shuffle data only if optimize_trsv is called

a762d79

Increase trsv test size

3d1f1e4

Test lower and upper optimize_trsv

a9e4a30

Introduce EXPECT_TRUE_OR_FUTURE_SKIP

7e3dbf3

Require diagonal values for nonunit only

c1fc824

Throw unimplemented for transposed trsv

aad6bd1

clang-format

34bf0e1

gajanan-choudhary reviewed Nov 22, 2023

src/sparse_blas/backends/mkl_common/mkl_operations.cxx

src/sparse_blas/backends/mkl_common/mkl_operations.cxx

src/sparse_blas/backends/mkl_common/mkl_operations.cxx

Add TODOs

e49f285

hjabird approved these changes Nov 23, 2023

gajanan-choudhary reviewed Nov 28, 2023

tests/unit_tests/include/test_helper.hpp (outdated)

Print number of configurations skipped

7954345

gajanan-choudhary approved these changes Nov 28, 2023

tests/unit_tests/sparse_blas/source/sparse_gemm_buffer.cpp

Add comment

c558637

gajanan-choudhary approved these changes Nov 29, 2023

Rbiessy merged commit d56a0f1 into uxlfoundation:develop Nov 30, 2023

Rbiessy deleted the romain/sparse_trsv branch November 30, 2023 11:41

normallytangent pushed a commit to normallytangent/oneMKL that referenced this pull request Aug 6, 2024

[SPARSE] Add support for sparse TRSV with MKLCPU backend (uxlfoundati…

546c056

…on#407) Co-authored-by: Gajanan Choudhary <gajanan@utexas.edu>
