[ML] Upgrade to gcc 10.3 for Linux compilation #2028
gcc 10.3 contains the fix for the bug that hinders compilation of PyTorch on aarch64. The binutils package is also upgraded, from version 2.34 to version 2.37, and patchelf is upgraded from version 0.10 to version 0.13.
LGTM
A retest now should work on linux-x86_64, and only fail on linux-aarch64.
retest
There are two test failures for
So it seems that gcc 10.3 has changed the way some maths calculations are done, but only on aarch64.
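As a general illustration of why a compiler upgrade can shift numeric results, here is a minimal sketch showing that floating-point addition is not associative. It does not reproduce the failing tests, and nothing in this thread confirms that operation reordering is what changed between gcc 9.3 and gcc 10.3. The function names and input values are hypothetical.

```python
import math


def sum_left_to_right(values):
    """Accumulate in the order given."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_right_to_left(values):
    """Accumulate in reverse order, as a reordering optimisation might."""
    total = 0.0
    for v in reversed(values):
        total += v
    return total


# Hypothetical inputs, chosen only because they expose rounding differences.
values = [0.1, 0.2, 0.3]
forward = sum_left_to_right(values)
backward = sum_right_to_left(values)

print(forward == backward)                             # False
print(math.isclose(forward, backward, rel_tol=1e-12))  # True
```

The two sums differ only in the last bit, but differences like this can accumulate over many operations into a visibly different final value.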
There is also another problem to be investigated, which is why we are accidentally shipping
Looking in more detail at:
The code actually prints out the relevant values immediately before the assertion, so we can see the values for all 4 platforms where we ran tests in the last PR build for this PR:
This shows there's a lot of variation in the clustered log value. linux-aarch64 now falls outside the tolerated limit. The last successful PR build (obviously not for this PR, but on the main branch) for linux-aarch64 using gcc 9.3 produced these values:
So linux-aarch64 always had the most negative value of all platforms for clustered log. It's just that upgrading to gcc 10.3 has finally pushed it past the point where the assertion fails. /cc @tveasey
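The effect described here, a value that sat near the limit and was then nudged past it, can be sketched as follows. Every number and name below is hypothetical: none of them are the values printed by the PR builds, and the real test's assertion may be structured differently.

```python
def within_tolerance(actual, expected, tolerance):
    """Return True when actual is no further than tolerance from expected."""
    return abs(actual - expected) <= tolerance


# Hypothetical reference value and tolerated limit.
expected = -100.0
tolerance = 0.5

# Hypothetical readings: one already close to the limit, one just past it.
readings = {
    "older compiler": -100.45,
    "newer compiler": -100.55,
}

results = {
    name: within_tolerance(value, expected, tolerance)
    for name, value in readings.items()
}

for name, ok in results.items():
    print(f"{name}: {'pass' if ok else 'FAIL'}")
```

A small shift is enough to flip the outcome when a platform was already the outlier. The options are to widen the tolerance or to investigate why that platform's value is so far from the others.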
…ing with same compiler
The latest CentOS 7 uses binutils 2.27, so the cross compiler should be bootstrapped with the same version to ensure the native and cross components match. (The original CentOS 7 used the older binutils version 2.25, which is why the Dockerfile originally used that.)
The base image needs to match what the source code expects, which is version 19 since elastic#2028 was merged.
gcc 10.3 contains the fix for the bug that hinders
compilation of PyTorch on aarch64.
The binutils package used for compiling the final
distribution is also upgraded, from version 2.34 to
version 2.37, and the version used for bootstrapping
the cross compiler from version 2.25 to version 2.27.
patchelf is upgraded from version 0.10 to version
0.13.