Update buildkite, manifests, github action workflows #444
GPU tests on the CI seem to be taking much longer (https://buildkite.com/clima/rrtmgp-ci/builds/573#018d9080-12e0-43b2-bdfb-2e03fe406ff7) compared to the latest
8c49cb6 to 5fb3b98
The Buildkite pipeline had several problems. I fixed them, and now most jobs are twice as fast. The GPU unit test seems to be the only one adversely affected. @sriharshakandala, do you want to have a look at this? https://buildkite.com/clima/rrtmgp-ci/builds/582#018d9acf-9053-433b-8a76-a0593b20f8d9
The changes look good to me overall, except for a couple of items in the project toml.
38f8790 to bd6d616
24519cd to 20fc32a
I consolidated the environments to only have
@charleskawczynski, do you have any idea what could be the reason for this increase in time (https://buildkite.com/clima/rrtmgp-ci/builds/592#018d9ed1-b8ac-4121-8118-2d3930baa764) compared to main? It happens only on Buildkite: @sriharshakandala ran the code on the cluster and found the same speed as main.
9e6c960 to 04499cb
df61ee9 to ceabf07
I spent 3 more hours on this and narrowed the problem down to the CUDA updates. I can reproduce it on the cluster on the P100 when I use CUDA 5.2, but it is still fast when using CUDA 5.1. Fast:
Slow:
Only changes:
I also checked that the system runtime and the artifact runtime produce the same results. @sriharshakandala, do you want to take this on and investigate further?
I'm going to rebase this PR, cc @Sbozzolo
ceabf07 to 341d4a5
341d4a5 to 4ba2013
No description provided.