OpenBLAS or BLIS instead of ATLAS? #15
Forum user arif-ali used OpenBLAS and got 13 Gflops on the Pi 4, which beats my result of 11-ish with ATLAS: https://forums.raspberrypi.com/viewtopic.php?p=1674010#p1674010
Scripts available here: https://github.com/arif-ali/raspberrypi-hpl/tree/master/scripts
More detail from aa3025: https://www.hydromag.eu/~aa3025/rpi/
ATLAS compile was taking absolutely forever on the big.LITTLE Orange Pi 5 RK3588s (I confirmed no throttling... not sure why it got hung up so long!). I decided to start working on this. With OpenBLAS:
That seems low, so I'm going to re-run my OpenBLAS setup on a Pi 4 model B to see how it compares to the ATLAS library there. Note that OpenBLAS seems to support the A72/A73 but doesn't have any optimizations for the A76. Since the RK3588s also has LITTLE (A55) cores, it may be picking up only those and using the A55 optimizations...
On the Pi 4, I'm seeing 11.679 Gflops using OpenBLAS (compared to 11.774 Gflops for ATLAS):
(Note that it is tuned for A72...)
Using BLIS... got 11.889 Gflops, nice!
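To put the three Pi 4 numbers side by side, here is a quick sketch that computes each library's difference relative to ATLAS. The Gflops values are the ones quoted in this thread; the dictionary and the helper name `percent_vs_baseline` are just illustrative, not part of any benchmark script.

```python
# Pi 4 HPL results quoted in this thread, in Gflops.
results = {"ATLAS": 11.774, "OpenBLAS": 11.679, "BLIS": 11.889}

# ATLAS is the existing setup, so use it as the baseline.
baseline = results["ATLAS"]


def percent_vs_baseline(gflops, base=baseline):
    """Relative difference from the baseline, in percent (hypothetical helper)."""
    return (gflops - base) / base * 100.0


# Print the libraries from fastest to slowest.
for name, gflops in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    delta = percent_vs_baseline(gflops)
    print(f"{name:9s} {gflops:7.3f} Gflops  {delta:+.2f}% vs ATLAS")
```

All three land within about 1% of each other on the Pi 4, so on that board the choice probably comes down to build time more than raw performance.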
On the Orange Pi 5, using BLIS:
After doing some more testing with Ampere's recommended HPL setup (with an Ampere-optimized BLIS library), I would like to investigate switching away from ATLAS.
The primary motivation is build speed. I've noticed some machines can compile ATLAS in an hour or two, but others take 2-3 days (especially slower systems like the Raspberry Pi 4...).
That's not especially fun, but in the past I've stuck with this method on the assumption that compiling ATLAS on each machine tunes it best for that specific processor. Supposedly. (Who understands all this math that well anyway?)
I would like to compare other options like OpenBLAS or BLIS to see: