Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: Shard benchmarks across multiple CI jobs #92

Open
overlookmotel opened this issue Feb 8, 2024 · 8 comments
Open

Feature request: Shard benchmarks across multiple CI jobs #92

overlookmotel opened this issue Feb 8, 2024 · 8 comments

Comments

@overlookmotel
Copy link

OXC Javascript compiler is using CodSpeed for measuring performance. For a performance-focused project like OXC, CodSpeed is absolutely amazing.

However, running the benchmarks is currently by far the project's slowest CI job.

Would it be feasible to enable sharding the benchmarks across multiple CI jobs, to shorten both the build and execution time of each, and then a final job to pull the results together and upload to CodSpeed?

@adriencaccia
Copy link
Member

Hey @overlookmotel, thanks for the kind words!

To reduce the build time for everyone, we will definitely look at what @Boshen did in oxc-project/oxc#2343. This seems like something that could easily be implemented directly in this repo, so that everyone gets lower build time.

As far as sharding the benchmarks across multiple CI jobs, this is something that we will have to look into. At the moment CodSpeed only supports a single upload per run, so we would need to add some kind of sharded uploads per run or find a way to generate reports in multiple CI workers, then in a final job aggregate the reports and make a single upload.

The former solution would require a lot of rework on our app and take a lot of time. The later might be more hackable, since the changes would only need to be done on our open-source https://github.com/CodSpeedHQ/runner and https://github.com/CodSpeedHQ/action repositories

@overlookmotel
Copy link
Author

We've implemented sharding via a rather hacky method on oxc-project/oxc#2751.

Now that can see it works, I'd like to "do it properly" and submit PRs against https://github.com/CodSpeedHQ/runner and https://github.com/CodSpeedHQ/action so everyone can benefit.

A few questions:

.out files

The .out files are named <pid>.out. Do the file names matter at your end?

Because each shard runs on a different machine, it's possible for 2 jobs to have same PID, and therefore to end up with 2 x 1234.pid files. I had to add some logic for spotting duplicates and renaming them:

https://github.com/oxc-project/oxc/blob/95ac2655047ab0692337cc1059b01f56b4ba76d8/tasks/benchmark/codspeed/upload.mjs#L69-L84

But maybe the filenames have no semantic meaning to your back end, so could more easily get unique filenames by prepending a shard ID on the start of each e.g. shard1-1234.out, shard2-1234.out?

Log files

Are the .log files which are included in the profile archive required / useful to you? If they are, they'd need to be merged somehow, or included as runner_shard1.log, runner_shard2.log etc. Would your back end accept that?

Metadata

Presumably the metadata JSON which is sent to CodSpeed's API can be generated on any job. For Github Actions at least, it looks like it only contains info which relates to the PR/branch, not the GH Actions job. Is that correct?

API

Implementation I had in mind:

Runner:

  • Add profile-directory CLI switch to specify where the profile data gets written to. The action can read the .out files from this directory.
  • Add upload-only CLI switch to skip running anything and just upload profile data from profile-directory.

Action:

  • Add param shard which receives a unique ID for a shard. If provided, the action runs codspeed-runner --profile-directory <whatever> --no-upload and then saves content of the profile directory as an artefact.
  • Add param shard-finish which downloads artefacts, combines them, and runs codspeed-runner --profile-directory <whatever> --upload-only

Does this sound OK to you?

@adriencaccia
Copy link
Member

Hey @overlookmotel, sorry for the delay in my response.

We talked about it internally and we want to directly support multiple uploads in our backend.
This will mask a lot of the sharding logic in the action and the runner, and allow for easier integration with other providers.

Thanks a lot for the work you already did on this 🙏 and I am sorry that we do not continue down the single upload way.
We are using your work as inspiration for the new specs.

I will update you when we have them, and if you are still willing, it will be a pleasure to have you contribute to the runner/action!

@overlookmotel
Copy link
Author

overlookmotel commented Mar 29, 2024

OK cool. That does sound like an ideal solution.

If I can add a couple of suggestions:

1. Results on completion

Some benchmarks (shards) will may take much less time to run than others (that's certainly the case with Oxc). It'd be ideal if results from completed shards were visible on CodSpeed as soon as they're uploaded, and the rest show as "Running..." - as opposed to have to wait for all the shards to finish running before you can see the results for any of them.

2. Handle missing benchmarks

Oxc has multiple crates. Currently the benchmarks for crate X run even if the only changes in the PR are to crate Y (and X doesn't depend on Y). This is wasteful of CI resources, and I'd like to run only the benchmarks for the crates which could be affected by the code changes in the PR.

However, if you just prevent some benchmarks from running, then they show on Codspeed as "Missing" and it makes everything look red and bad!

To work around this, I have hacked together something to cache benchmarks results from a previous run and re-upload them (without actually running the benchmark again):

https://github.com/oxc-project/oxc/blob/586e823fd6c2c1bb2bdd1164447def0d7cfccfde/tasks/benchmark/codspeed/upload.mjs#L40-L55

This isn't the right way to do it of course, but it'd be nice if CodSpeed could somehow handle this scenario where not all benchmarks run, but that is intentional rather than a problem.

@adriencaccia
Copy link
Member

adriencaccia commented Mar 29, 2024

1. Results on completion

Some benchmarks (shards) will may take much less time to run than others (that's certainly the case with Oxc). It'd be ideal if results from completed shards were visible on CodSpeed as soon as they're uploaded, and the rest show as "Running..." - as opposed to have to wait for all the shards to finish running before you can see the results for any of them.

Definitely, we plan to do incremented report updates. This is one of the upsides of allowing multiple uploads for a single run.

2. Handle missing benchmarks

Oxc has multiple crates. Currently the benchmarks for crate X run even if the only changes in the PR are to crate Y (and X doesn't depend on Y). This is wasteful of CI resources, and I'd like to run only the benchmarks for the crates which could be affected by the code changes in the PR.

We have thought about it in the past, but we could not find a way to implement this at that moment.
How can we know if the non-uploaded benchmarks are dropped or deliberately not run, in a similar use case as you outlined?
Handling missing benchmarks will require us to find a way to differentiate between the two cases.

@overlookmotel
Copy link
Author

How can we know if the non-uploaded benchmarks are dropped or deliberately not run, in a similar use case as you outlined?

Yes, that's true - the ambiguity is a problem. Maybe if the action allowed you to input that information. e.g. skipped: ["bench1", "bench2"]?

I guess you will have this problem in some form regardless of my request. How can CodSpeed backend know when the run is done and it's time to generate the report comment? e.g.:

  • Is benchX still running, or was it dropped?
  • Have all shards finished running, or has a new benchmark been added but it's not completed yet?

I suspect one of these will be required:

  • An initial call to the backend saying "expect the following results"
  • Or a call at the end saying "all runs now completed"

Either way, that'd be a good place to add a message "benchX is skipped, not dropped".

@colin-ho
Copy link

Hey just wanted to check if there's any updates to supporting sharded benchmarks? We just integrated CodSpeed into Daft: Eventual-Inc/Daft#2696 and I think it'll be really useful!

@adriencaccia
Copy link
Member

Thanks for the interest @colin-ho, glad to have Daft using CodSpeed!

Sharding is still on our roadmap, we will update this issue when it is relevant.

@art049 art049 closed this as not planned Won't fix, can't repro, duplicate, stale Sep 5, 2024
@art049 art049 reopened this Sep 5, 2024
@art049 art049 added the Migrated label Nov 25, 2024 — with Linear
@art049 art049 removed the Migrated label Nov 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants