Multi agent parallel testing in CI #18523


Merged
merged 24 commits into from
May 22, 2025

Conversation

majocha
Contributor

@majocha majocha commented Apr 29, 2025

Make #18517 work.

Contributor

✅ No release notes required

@majocha
Contributor Author

majocha commented Apr 29, 2025

This kind of worked, but now Linux is the bottleneck, oh well :)

@T-Gro
Member

T-Gro commented Apr 29, 2025

Linux and macOS will likely alternate in being the slowest; macOS also has the smallest pool of available machines.

I do like the numbers very much.
I will be happy to get this in piece by piece, always focusing on the currently slowest leg(s).

@majocha
Contributor Author

majocha commented Apr 29, 2025

Now it's important to verify that the number of tests adds up correctly 😅
image

@T-Gro
Member

T-Gro commented Apr 29, 2025

.. number of unique executed tests.

I would recommend downloading the testlog.xml files and having a script count the unique test entries and compare the two sets (prior execution vs. new execution from 4 jobs).
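A minimal sketch of such a comparison script, for illustration only. It assumes each executed test appears as a `<test name="...">` element, as in the xUnit v2 XML report format; the actual layout of testlog.xml may differ, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def unique_test_names(log_paths):
    """Collect the set of unique test names from a list of XML test logs.

    Assumption: every executed test is a <test name="..."> element
    (xUnit v2 style). Adjust the tag and attribute names if the real
    logs are shaped differently.
    """
    names = set()
    for path in log_paths:
        root = ET.parse(path).getroot()
        for test in root.iter("test"):
            name = test.get("name")
            if name is not None:
                names.add(name)
    return names


def compare_runs(baseline_logs, batched_logs):
    """Return (missing, extra).

    missing: tests present in the baseline run but absent from the batched run.
    extra:   tests present only in the batched run.
    """
    baseline = unique_test_names(baseline_logs)
    batched = unique_test_names(batched_logs)
    return baseline - batched, batched - baseline
```

Calling `compare_runs` with the log from the prior execution and the logs from the 4 batch jobs should return two empty sets if no test was dropped or added by the split. Because names are collected into a set, a test that ran in more than one batch is still counted once.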

@majocha
Contributor Author

majocha commented May 5, 2025

So I had some fun splitting the build and test phases into separate jobs.
Nominally the run times are nice, but it doesn't seem very beneficial for now, because there is a significant wait time when starting a Windows VM agent.

@majocha
Contributor Author

majocha commented May 5, 2025

I verified that in fact all tests do have a batch number trait, and none are left out when run in batches.

The failsafe, in case this ever becomes problematic, is to remove the -testBatch x from script invocations and it will run all the tests without filtering.
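To illustrate the property being verified, here is a rough sketch of batch filtering with a no-filter fallback. It is not the PR's implementation: assigning the batch via a stable hash modulo the batch count, the batch count of 4, and all names below are assumptions made for the example.

```python
import zlib

BATCH_COUNT = 4  # assumption: four batches, matching the 4 jobs mentioned above


def batch_of(test_id, batch_count=BATCH_COUNT):
    """Map a test identifier to a batch number in 1..batch_count.

    CRC32 is used instead of the built-in hash(), because hash() of a str
    is randomized per process and would differ between agents.
    """
    return zlib.crc32(test_id.encode("utf-8")) % batch_count + 1


def select(tests, batch=None):
    """Return the tests to run; batch=None means no filtering (the failsafe)."""
    if batch is None:
        return list(tests)
    return [t for t in tests if batch_of(t) == batch]


def check_partition(tests):
    """Verify that the batches together cover every test exactly once."""
    seen = []
    for batch in range(1, BATCH_COUNT + 1):
        seen.extend(select(tests, batch))
    return sorted(seen) == sorted(tests)
```

Since every identifier maps to exactly one batch number, the batches form a partition of the test list, and omitting the batch argument runs everything, which mirrors dropping `-testBatch x` from the script invocation.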

As for the build / test split into separate jobs, I'm not sold on this. It's a matter of reverting the single commit.

@majocha majocha marked this pull request as ready for review May 5, 2025 09:08
@majocha majocha requested a review from a team as a code owner May 5, 2025 09:08
@majocha
Contributor Author

majocha commented May 5, 2025

For future reference:
image
image

@T-Gro
Member

T-Gro commented May 12, 2025

> As for the build / test split into separate jobs, I'm not sold on this. It's a matter of reverting the single commit.

Indeed, the overall time of the CI (and not duration of individual jobs) is the same, if not slower.
If we ever feel the pressure to reduce the resource utilization here, it is a step we could undertake.

=> I think we can keep each leg doing its own build.

@majocha
Contributor Author

majocha commented May 12, 2025

> Indeed, the overall time of the CI (and not duration of individual jobs) is the same, if not slower.

Yes, much of this comes from waiting for the Windows VMs to get running. There is significant wait time between the build and test phases because of this.

> => I think we can keep each leg doing its own build.

I'm away but I'll restore this when I get back 🙂

@T-Gro T-Gro marked this pull request as draft May 14, 2025 07:38
@T-Gro
Member

T-Gro commented May 14, 2025

> > Indeed, the overall time of the CI (and not duration of individual jobs) is the same, if not slower.

> Yes, much of this comes from waiting for the Windows VMs to get running. There is significant wait time between the build and test phases because of this.

> > => I think we can keep each leg doing its own build.

> I'm away but I'll restore this when I get back 🙂

Thanks @majocha. Please mark it as ready for review once that is changed; I am eager to have this in 👍

@majocha majocha force-pushed the multi-agent-ci branch 2 times, most recently from ff7fa06 to 88be2a4 Compare May 15, 2025 22:06
@majocha majocha marked this pull request as ready for review May 16, 2025 09:18
@majocha
Contributor Author

majocha commented May 16, 2025

This is ready, overall CI run time is now around 48 minutes. It is determined by the slowest -testVs leg, which is unfortunately not parallelized at all.

@majocha
Contributor Author

majocha commented May 21, 2025

I had no luck trying to just split VS unit tests, which are the slowest now, into batches with this approach. Tests fail, hang and so on. Theoretically it should work with less granularity: by test class instead of test case. So, there is still room for improvement.
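A sketch of what the coarser granularity could look like, keeping all tests of one class in the same batch. This is only an illustration of the idea, not something tried in this PR; the `Namespace.Class.method` naming scheme and the function names are assumptions.

```python
import zlib
from collections import defaultdict


def class_of(test_name):
    """Assumption: names look like 'Namespace.Class.method'; drop the last segment."""
    return test_name.rsplit(".", 1)[0]


def batches_by_class(test_names, batch_count):
    """Group tests into batches so that all tests of one class share a batch."""
    batches = defaultdict(list)
    for name in test_names:
        # Hash the class name, not the full test name, so tests that share
        # class-level setup or state are never split across agents.
        key = class_of(name)
        batch = zlib.crc32(key.encode("utf-8")) % batch_count + 1
        batches[batch].append(name)
    return dict(batches)
```

The trade-off is that batches become less even in size, since a class with many slow tests lands entirely on one agent.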

@github-project-automation github-project-automation bot moved this from New to In Progress in F# Compiler and Tooling May 22, 2025
@T-Gro
Member

T-Gro commented May 22, 2025

> I had no luck trying to just split VS unit tests, which are the slowest now, into batches with this approach. Tests fail, hang and so on. Theoretically it should work with less granularity: by test class instead of test case. So, there is still room for improvement.

Thanks for trying, @majocha.
This might need a more manual (cherry picked) approach to figure out a working split. The tests are interdependent and work with a lot of dependencies we cannot control.

@T-Gro T-Gro enabled auto-merge (squash) May 22, 2025 07:26
@T-Gro T-Gro merged commit 7aa39f3 into dotnet:main May 22, 2025
39 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in F# Compiler and Tooling May 22, 2025
@T-Gro
Member

T-Gro commented May 22, 2025

This is a big improvement for all F# contributors; I am super happy to see this being merged!
Great job, @majocha!
