Reduce CI Workload by Removing Some Spark Variants and Using Callable Workflows for GitHub Actions #5153
There are a few small changes I can make so that more Spark workflows terminate early if one fails, but that won't necessarily resolve the problem entirely.
+1 on the above. Along with this, we can also leverage the GitHub Actions resources of forked repositories instead of using the resources of the ASF organization on GitHub. This is what Apache Spark does presently:
Relevant PR in Spark: Would love to know your thoughts on the same :)
I happened to read the related links. Thanks @singhpk234 for elaborating on Spark's CI. To be clear, apache/spark#32092 implemented the logic you explained. After that, I also implemented logic to leverage the GitHub check status (apache/spark#32193).
In this way, we can remove all the overhead in the current repo and leverage the resources of the forked repositories. I am willing to help and review if someone tries to port these changes to Iceberg :-).
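As a rough illustration of the fork-driven approach (a sketch only; the workflow file name and job names are hypothetical, and Spark's actual setup in apache/spark#32092 is more involved): expensive test jobs can be gated so they only run in contributors' forks, consuming the fork's Actions quota, while the upstream repository relies on the reported check status.

```yaml
# .github/workflows/spark-ci.yml (hypothetical file name -- sketch only)
name: Spark CI
on: [push, pull_request]

jobs:
  spark-tests:
    # Run the expensive suite only outside the ASF repo, i.e. in forks,
    # so contributor pushes use the fork's Actions runners instead.
    if: github.repository != 'apache/iceberg'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      # Placeholder test command; the real Gradle targets would differ.
      - run: ./gradlew check
```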
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.
This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.
We now have 30 GitHub Actions workflows that run as part of the CI test suite, and they are starting to have a noticeable impact on CI runners.
We test Spark with a large number of combinations of Java versions and Scala versions.
We previously only tested the "latest" Spark version (i.e. Spark 3.2) with Scala 2.13.
We are now testing:
That brings a total of 13 Spark-specific CI variants that run on every PR that touches `core` or `spark`. We should consider reducing the large number of combinations of JRE versions and Scala versions that are run for the various Spark versions, as CI is starting to take noticeably longer.
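One common way to trim such a matrix (a sketch only; the exact versions and exclusions here are illustrative and would need to be agreed on) is to keep the full matrix declaration but `exclude` the combinations that add little coverage, e.g. testing Scala 2.13 only against the latest Spark on a single JVM:

```yaml
# Sketch of a trimmed test matrix; job name and version lists are illustrative.
jobs:
  spark-tests:
    strategy:
      matrix:
        jvm: [8, 11]
        scala: ['2.12', '2.13']
        spark: ['3.0', '3.1', '3.2']
        exclude:
          # Hypothetical trimming: Scala 2.13 only with the latest Spark...
          - scala: '2.13'
            spark: '3.0'
          - scala: '2.13'
            spark: '3.1'
          # ...and only on one JVM.
          - jvm: 8
            scala: '2.13'
    runs-on: ubuntu-latest
    steps:
      - run: echo "Spark ${{ matrix.spark }} / Scala ${{ matrix.scala }} / JVM ${{ matrix.jvm }}"
```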
We should also look into (again) refactoring our CI test suites to use callable workflows, so that all tests stem from one root test (much like an Airflow DAG) and, if any one test fails, they all stop. We currently get this for free for any set of CI suites generated from one `matrix` (such as Java 11 and Java 8 with Scala 2.12). This will reduce the number of CI slots spent running tests that will have to be re-run anyway (because something else failed).
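A minimal sketch of a callable workflow (GitHub calls these "reusable workflows"; the file names, inputs, and test commands below are illustrative): the shared workflow declares a `workflow_call` trigger and is then invoked from a root workflow.

```yaml
# .github/workflows/spark-tests.yml -- the reusable (callable) workflow, sketch only
name: Spark tests
on:
  workflow_call:
    inputs:
      spark-version:
        required: true
        type: string

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      # Placeholder step; real Spark test invocations would go here.
      - run: echo "Running tests against Spark ${{ inputs.spark-version }}"
```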
We can also run the faster tests first, to ensure they pass, before calling out to the more expensive tests (such as Spark, Flink, etc.).
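The fast-tests-first ordering can be sketched with `needs:` in a root workflow that calls the reusable workflows (all file and job names here are hypothetical): dependent jobs are skipped by default when an upstream job fails, which gives the DAG-like early termination described above.

```yaml
# .github/workflows/ci.yml -- hypothetical root workflow, sketch only
name: CI
on: [push, pull_request]

jobs:
  unit-tests:
    # Fast, cheap checks run first.
    uses: ./.github/workflows/unit-tests.yml
  spark-tests:
    # Expensive suites only start once the fast checks pass.
    needs: unit-tests
    uses: ./.github/workflows/spark-tests.yml
    with:
      spark-version: '3.2'
  flink-tests:
    needs: unit-tests
    uses: ./.github/workflows/flink-tests.yml
```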
I tried this before with callable workflows, but at the time it wasn't worth the effort. I think it probably is now.