Fix BLAS and LAPACK tests for RVV 1.0 target, update to 0.12.0 intrinsics #4456

Merged
merged 1 commit into from
Jan 26, 2024

@kseniyazaytseva commented Jan 24, 2024

  • Updated the intrinsics API to version 0.12.0 (Strided Segment Loads/Stores)
  • Fixed the nrm2, axpby, ncopy, zgemv and scal kernels
  • Added zero-size checks
  • Increased CGEMM_DEFAULT_UNROLL_MN (the previous value caused failures)
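
For readers unfamiliar with the zero-size checks mentioned above, here is a minimal scalar sketch of the idea. It is not the code from this PR: the function name `scal_ref`, its signature, and the exact guard condition are hypothetical, chosen only to show an early return before any element is touched.

```c
#include <stddef.h>

/*
 * Hypothetical scalar reference for a scal-style kernel:
 * x[i * inc_x] *= alpha for i in [0, n).
 * The name and signature are illustrative assumptions, not the PR's code.
 */
static int scal_ref(long n, float alpha, float *x, long inc_x)
{
    /* Zero-size check: return before touching memory when there is
     * nothing to process (n <= 0) or the stride is not positive. */
    if (n <= 0 || inc_x <= 0)
        return 0;

    for (long i = 0; i < n; i++)
        x[i * inc_x] *= alpha;

    return 0;
}
```

In a vectorized kernel, a guard of this kind would presumably sit before the first vector-length setup, so that an empty input never reaches the load/store loop.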

For testing, the x280 target was built without the --fast-math flag. This flag causes failures and hangs in the utest and LAPACK tests.

All BLAS tests passed.

x280 LAPACK tests:

SUMMARY             nb test run   numerical error     other error
================    ===========   =================   ================
REAL                1327023       0 (0.000%)          0 (0.000%)
DOUBLE PRECISION    1327845       0 (0.000%)          0 (0.000%)
COMPLEX             786775        0 (0.000%)          0 (0.000%)
COMPLEX16           787842        0 (0.000%)          0 (0.000%)

--> ALL PRECISIONS  4229485       0 (0.000%)          0 (0.000%)

RISCV64_ZVL128B LAPACK tests:

SUMMARY             nb test run   numerical error     other error
================    ===========   =================   ================
REAL                1327023       0 (0.000%)          0 (0.000%)
DOUBLE PRECISION    1326753       12 (0.001%)         0 (0.000%)
COMPLEX             786775        0 (0.000%)          0 (0.000%)
COMPLEX16           787842        0 (0.000%)          0 (0.000%)

--> ALL PRECISIONS  4228393       12 (0.000%)         0 (0.000%)

@martin-frbg martin-frbg merged commit 889c5d0 into OpenMathLib:risc-v Jan 26, 2024