
Improve twister performance when parallel execution is available #52701

Open · yperess opened this issue Dec 1, 2022 · 7 comments
Labels: area: Continuous Integration · area: Twister · Enhancement

Comments

yperess (Collaborator) commented Dec 1, 2022

Is your enhancement proposal related to a problem? Please describe.
In our test writing, creating a new variant of a test (a new binary) carries a lot of boilerplate and build-time overhead, but the cost of piling yet another test into an existing binary is also getting too high. The test binaries end up executing hundreds of tests and taking a long time.

Describe the solution you'd like
I'd like twister to be able to take the final built .elf file and shard it: effectively, modifying the ztest suite and test iterable sections, running the different shards in parallel, and then combining the results. When running twister, the following steps should take place:

  1. Twister should build the binary as usual
  2. Twister should identify if parallelism is possible
  3. Twister should parse the .elf file and generate test metadata by iterating over the suite and test sections
  4. Twister should assign a weight to each suite based on the number of tests it has, then make some attempt at balancing the loads (I don't think we need to worry about splitting up suites yet); see the balancing sketch after this list.
  5. Twister should copy the .elf file and mutate the start/end pointers as well as shifting some suite data such that the binary is unaware of the other suites.
  6. Twister should run the various .elf files (locally, on QEMUs, or on hardware) in parallel then join the results
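
As a minimal sketch of the balancing in step 4 (assuming suite weight = number of tests; the function name and inputs here are hypothetical, not an existing twister API), a greedy longest-processing-time heuristic would suffice:

import heapq

def balance_suites(suite_weights, num_shards):
    """Greedily assign suites to shards, heaviest first (LPT heuristic)."""
    # Each heap entry is (total_weight, shard_index, suite_names); the
    # index breaks ties so the name lists themselves are never compared.
    shards = [(0, i, []) for i in range(num_shards)]
    heapq.heapify(shards)
    for name, weight in sorted(suite_weights.items(), key=lambda kv: -kv[1]):
        total, idx, names = heapq.heappop(shards)
        names.append(name)
        heapq.heappush(shards, (total + weight, idx, names))
    return [names for _, _, names in sorted(shards, key=lambda s: s[1])]

# balance_suites({'kernel': 12, 'drivers': 7, 'net': 5, 'fs': 3}, 2)
# -> [['kernel', 'fs'], ['drivers', 'net']]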

Describe alternatives you've considered
I've considered having an easier way of specifying a similar binary in the testcase.yaml file, but the only way I seem to be able to do that is by introducing a Kconfig to select which test suites to include in the binary. This ends up being a little confusing and forces test writers to manage the test binaries by hand.

yperess added the Enhancement, area: Twister, and area: Continuous Integration labels on Dec 1, 2022
yperess (Collaborator, Author) commented Dec 1, 2022

@tristan-google

gmarull (Member) commented Dec 1, 2022

@PerMac isn't this something twister V2 already supports using https://pypi.org/project/pytest-parallel/ ?

tristan-google (Collaborator) commented Dec 1, 2022

I think Yuval's idea here doesn't concern the handlers for actually running things in parallel, but rather the ability to take a giant testcase binary and shard it post-build but pre-run. The executable would be duplicated N times and each copy would be modified to run roughly 1/N of the tests, where N can be derived from the number of cores, the number of matching HW boards plugged in, etc. At that point those N copies could be run in parallel through whatever handler/mechanism is in place, just as if they were independent testcases.
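
A rough sketch of how N could be derived (the function and its inputs are hypothetical):

import os

def shard_count(num_suites, attached_duts=0):
    # Never split finer than one suite per copy, and never wider than the
    # available executors (attached DUTs for hardware runs, CPU cores for
    # native/QEMU runs).
    executors = attached_duts or os.cpu_count() or 1
    return max(1, min(num_suites, executors))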

PerMac (Member) commented Dec 2, 2022

I am not sure I get the full idea. Would your idea require:

  • being able to call any single ztest test case from an application loaded on a board, instead of running the full test suite from start to finish?

PerMac (Member) commented Dec 2, 2022

Another question: would there be a place in the ztest framework to allow communication and calling single tests? E.g. something like

dut.flash('zephyr.elf')
dut.write('ztest call testcase_A')
output = dut.read()
assert output == "testcase_A PASS"

where dut.write and dut.read would be handled in twister for serial communication with the DUT, and the flashed ztest application would handle what to call, and how, when a command is given on the serial input?
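
For illustration, the twister side of that serial half could be a thin pyserial wrapper; this is only a sketch of the hypothetical protocol above (no 'ztest call' command exists today):

import serial  # pyserial

class Dut:
    """Hypothetical wrapper implementing dut.write/dut.read over a serial port."""

    def __init__(self, port, baud=115200):
        self.uart = serial.Serial(port, baudrate=baud, timeout=5)

    def write(self, cmd):
        self.uart.write((cmd + '\n').encode())

    def read(self):
        return self.uart.readline().decode(errors='replace').strip()

dut = Dut('/dev/ttyACM0')
dut.write('ztest call testcase_A')
assert dut.read() == 'testcase_A PASS'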

yperess (Collaborator, Author) commented Dec 6, 2022

@PerMac not quite; Twister doesn't need to communicate with the DUT. Prior to flashing, twister will identify whether parallelism is possible, meaning one of the following holds:

  • If the application is on native_posix or unit_testing boards
  • If the application is going to run on a QEMU
  • If multiple of the same DUT are connected that can be flashed

If any of the above is a yes, then we can parallelize the execution of the binary. Let's assume the binary has 30 test suites and each suite has 10 tests, for a total of 300 tests (I believe we're just over that for our largest integration test binary). Twister would build the test binary as usual, then decide how parallel things can be. For example, in the case of a native_posix test we can parallelize by the number of cores (let's assume a high number running on a CI server). So Twister:

  1. Copies the .elf file 30 times (since min(30 suites, 96 cores) = 30).
  2. For each .elf copy, mutates ztest's _ztest_suite_node_list_start and _ztest_suite_node_list_end so that the binary is completely unaware of the other tests (see the ELF sketch after this list).
  3. It would then run each of the 30 binaries in parallel, each producing the results for its single suite.
  4. It would then combine the results.
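
As a sketch of how step 2 could locate the suite list in the built binary (using pyelftools; NODE_SIZE is an assumption that must match the target's struct ztest_suite_node layout, and the actual byte patching of the copies is omitted):

from elftools.elf.elffile import ELFFile

NODE_SIZE = 40  # assumed sizeof(struct ztest_suite_node); target-ABI dependent

def suite_list_bounds(path):
    """Return the (start, end) load addresses of the ztest suite node list."""
    with open(path, 'rb') as f:
        symtab = ELFFile(f).get_section_by_name('.symtab')
        start = symtab.get_symbol_by_name('_ztest_suite_node_list_start')[0]
        end = symtab.get_symbol_by_name('_ztest_suite_node_list_end')[0]
        return start['st_value'], end['st_value']

start, end = suite_list_bounds('build/zephyr/zephyr.elf')
print(f'{(end - start) // NODE_SIZE} suites between {start:#x} and {end:#x}')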

Some consideration would need to be given to QEMUs and DUTs, as the cost of flashing becomes greater (it might not make sense to split a binary with only 2 suites for a DUT test), but I believe we can tweak these heuristics as we get closer to feature complete.

NOTE

We have some very large integration tests, and currently developers have to choose between the convenience of adding their test to the same binary (which bloats it even more) or going through the boilerplate of creating another binary for their test (with no real configuration changes). This leads to a very large discrepancy in run times: locally I'm seeing some tests run in milliseconds while our two largest tests run in 65 seconds. The issue is even worse in our CI, where disk I/O is slower and writing the large handler.log is the bottleneck, pushing those larger binaries closer to 120 seconds.

zephyrbot (Collaborator) commented

Hi @tristan-google,

This issue, marked as an Enhancement, was opened a while ago and did not get any traction. Please confirm the issue is correctly assigned and re-assign it otherwise.

Please take a moment to review if the issue is still relevant to the project. If it is, please provide feedback and direction on how to move forward. If it is not, has already been addressed, is a duplicate, or is no longer relevant, please close it with a short comment explaining the reason.

@yperess you are also encouraged to help move this issue forward by providing additional information and confirming this request/issue is still relevant to you.

Thanks!
