Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests/thread_float: do not overload slow MCUs with IRQs #18632

Merged
merged 1 commit into from
Jan 4, 2023

Conversation

maribu
Copy link
Member

@maribu maribu commented Sep 23, 2022

Contribution description

If the regular context switches are triggered too fast, slow MCUs will be able to spent little time on actually progressing in the test. This will scale the IRQ rate with the CPU clock as a crude way too keep load within limits.

Testing procedure

The unit test should now pass on the Microduino CoreRF

$ make BOARD=microduino-corerf AVRDUDE_PROGRAMMER=dragon_jtag -C tests/thread_float flash test
make: Entering directory '/home/maribu/Repos/software/RIOT/tests/thread_float'
Building application "tests_thread_float" for "microduino-corerf" with MCU "atmega128rfa1".
[...]
   text	  data	   bss	   dec	   hex	filename
  12834	   520	  3003	 16357	  3fe5	/home/maribu/Repos/software/RIOT/tests/thread_float/bin/microduino-corerf/tests_thread_float.elf
avrdude -c dragon_jtag -p m128rfa1  -U flash:w:/home/maribu/Repos/software/RIOT/tests/thread_float/bin/microduino-corerf/tests_thread_float.hex
[...]
Welcome to pyterm!
Type '/exit' to exit.
READY
s
START
main(): This is RIOT! (Version: 2022.10-devel-858-g18566-tests/thread_float)
THREADS CREATED

Context switch every 3125 µs
{ "threads": [{ "name": "idle", "stack_size": 192, "stack_used": 88 }]}
{ "threads": [{ "name": "main", "stack_size": 640, "stack_used": 220 }]}
THREAD t1 start
THREAD t2 start
THREAD t3 start
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770

make: Leaving directory '/home/maribu/Repos/software/RIOT/tests/thread_float'

(Note: The idle thread exiting is something that should never occur. I guess the culprit may be cpu_switch_context_exit() messing things up when the main thread exits. But that is not directly related to what this PR aims to fix. Adding a thread_sleep() at the end of main() does indeed prevent the idle thread from exiting.
Update: That's expected. The idle thread stats are printed on exit of the main thread, the idle thread does not actually exit.)

Issues/PRs references

Fixes #16908 maybe?

@github-actions github-actions bot added the Area: tests Area: tests and testing framework label Sep 23, 2022
@benpicco benpicco added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Oct 5, 2022

/* Let's not overwhelm boards by firing IRQs faster than they can handle and
* give them 50 billion CPU cycles per timeout */
static uint32_t timeout = 50000000000U / CLOCK_CORECLOCK;
Copy link
Contributor

@benpicco benpicco Oct 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to chose this dynamically or could we just get away with a lower switching frequency for all CPUs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test script waits for n prints from the board. Super fast MCUs might be able to power through the n prints with little to no context switches till then, if we go for a context switch rate that is slow enough for the slowest MCU. We could adapt the test script to always wait e.g. 10 seconds instead for n prints. But I liked the fact that the test was only slow for slow boards (where the additional time really was needed to get some confidence).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well CI does not like the static const since CLOCK_CORECLOCK is sometimes a function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, somehow the rebase has gone wrong... This should be a stack allocated variable in main now...

@maribu maribu added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Process: needs backport Integration Process: The PR is required to be backported to a release or feature branch labels Dec 5, 2022
@riot-ci
Copy link

riot-ci commented Dec 6, 2022

Murdock results

✔️ PASSED

0835466 tests/thread_float: do not overload slow MCUs with IRQs

Success Failures Total Runtime
15 0 15 01m:07s

Artifacts

Copy link
Contributor

@benpicco benpicco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors merge

bors bot added a commit that referenced this pull request Jan 3, 2023
18632: tests/thread_float: do not overload slow MCUs with IRQs r=benpicco a=maribu

### Contribution description

If the regular context switches are triggered too fast, slow MCUs will be able to spent little time on actually progressing in the test. This will scale the IRQ rate with the CPU clock as a crude way too keep load within limits.

### Testing procedure

The unit test should now pass on the Microduino CoreRF

```
$ make BOARD=microduino-corerf AVRDUDE_PROGRAMMER=dragon_jtag -C tests/thread_float flash test
make: Entering directory '/home/maribu/Repos/software/RIOT/tests/thread_float'
Building application "tests_thread_float" for "microduino-corerf" with MCU "atmega128rfa1".
[...]
   text	  data	   bss	   dec	   hex	filename
  12834	   520	  3003	 16357	  3fe5	/home/maribu/Repos/software/RIOT/tests/thread_float/bin/microduino-corerf/tests_thread_float.elf
avrdude -c dragon_jtag -p m128rfa1  -U flash:w:/home/maribu/Repos/software/RIOT/tests/thread_float/bin/microduino-corerf/tests_thread_float.hex
[...]
Welcome to pyterm!
Type '/exit' to exit.
READY
s
START
main(): This is RIOT! (Version: 2022.10-devel-858-g18566-tests/thread_float)
THREADS CREATED

Context switch every 3125 µs
{ "threads": [{ "name": "idle", "stack_size": 192, "stack_used": 88 }]}
{ "threads": [{ "name": "main", "stack_size": 640, "stack_used": 220 }]}
THREAD t1 start
THREAD t2 start
THREAD t3 start
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770

make: Leaving directory '/home/maribu/Repos/software/RIOT/tests/thread_float'
```

(~~Note: The idle thread exiting is something that should never occur. I guess the culprit may be `cpu_switch_context_exit()` messing things up when the main thread exits. But that is not directly related to what this PR aims to fix. Adding a `thread_sleep()` at the end of `main()` does indeed prevent the idle thread from exiting.~~
Update: That's expected. The idle thread stats are printed on exit of the main thread, the idle thread does not actually exit.)

### Issues/PRs references

Fixes #16908 maybe?

18950: tests/unittests: add unit tests for core_mbox r=benpicco a=maribu

### Contribution description

As the title says

### Testing procedure

The test cases are run on `native` by Murdock anyway.

### Issues/PRs references

Split out of #18949

19030: tests/periph_timer_short_relative_set: improve test r=benpicco a=maribu

### Contribution description

Reduce the number lines to output by only testing for intervals 0..15 to speed up the test.

In addition, run each test case 128 repetitions (it is still faster than before) to give some confidence the short relative set actually succeeded.

### Testing procedure

The test application should consistently fail or succeed, rather than occasionally passing.

### Issues/PRs references

None

19085: makefiles/tests/tests.inc.mk: fix test/available target r=benpicco a=maribu

### Contribution description

`dist/tools/compile_and_test_for_board/compile_and_test_for_board.py` relies on `make test/available` to check if a test if available. However, this so far did not take `TEST_ON_CI_BLACKLIST` and `TEST_ON_CI_WHITELIST` into account, resulting in tests being executed for boards which they are not available. This should fix the issue.

### Testing procedure


#### Expected to fail

```
$ make BOARD=nrf52840dk -C tests/gcoap_fileserver test/available
$ make BOARD=microbit -C tests/log_color test/available
```

(On `master`, they succeed, but fail in this PR.)

#### Expected to succeed

```
$ make BOARD=native -C tests/gcoap_fileserver test/available
$ make BOARD=nrf52840dk -C tests/pkg_edhoc_c test/available
$ make BOARD=nrf52840dk -C tests/log_color test/available
```

(Succeed in both `master` and this PR.)

### Issues/PRs references

None

Co-authored-by: Marian Buschsieweke <marian.buschsieweke@ovgu.de>
@bors
Copy link
Contributor

bors bot commented Jan 3, 2023

Build failed (retrying...):

@bors
Copy link
Contributor

bors bot commented Jan 3, 2023

Canceled.

@maribu
Copy link
Member Author

maribu commented Jan 3, 2023

bors retry

@bors
Copy link
Contributor

bors bot commented Jan 3, 2023

🕐 Waiting for PR status (GitHub check) to be set, probably by CI. Bors will automatically try to run when all required PR statuses are set.

@benpicco
Copy link
Contributor

benpicco commented Jan 3, 2023

bors merge

@bors
Copy link
Contributor

bors bot commented Jan 3, 2023

🕐 Waiting for PR status (GitHub check) to be set, probably by CI. Bors will automatically try to run when all required PR statuses are set.

If the regular context switches are triggered too fast, slow MCUs
will be able to spent little time on actually progressing in the
test. This will scale the IRQ rate with the CPU clock as a crude way
too keep load within limits.
@kaspar030
Copy link
Contributor

bors merge

bors bot added a commit that referenced this pull request Jan 4, 2023
18632: tests/thread_float: do not overload slow MCUs with IRQs r=kaspar030 a=maribu

### Contribution description

If the regular context switches are triggered too fast, slow MCUs will be able to spent little time on actually progressing in the test. This will scale the IRQ rate with the CPU clock as a crude way too keep load within limits.

### Testing procedure

The unit test should now pass on the Microduino CoreRF

```
$ make BOARD=microduino-corerf AVRDUDE_PROGRAMMER=dragon_jtag -C tests/thread_float flash test
make: Entering directory '/home/maribu/Repos/software/RIOT/tests/thread_float'
Building application "tests_thread_float" for "microduino-corerf" with MCU "atmega128rfa1".
[...]
   text	  data	   bss	   dec	   hex	filename
  12834	   520	  3003	 16357	  3fe5	/home/maribu/Repos/software/RIOT/tests/thread_float/bin/microduino-corerf/tests_thread_float.elf
avrdude -c dragon_jtag -p m128rfa1  -U flash:w:/home/maribu/Repos/software/RIOT/tests/thread_float/bin/microduino-corerf/tests_thread_float.hex
[...]
Welcome to pyterm!
Type '/exit' to exit.
READY
s
START
main(): This is RIOT! (Version: 2022.10-devel-858-g18566-tests/thread_float)
THREADS CREATED

Context switch every 3125 µs
{ "threads": [{ "name": "idle", "stack_size": 192, "stack_used": 88 }]}
{ "threads": [{ "name": "main", "stack_size": 640, "stack_used": 220 }]}
THREAD t1 start
THREAD t2 start
THREAD t3 start
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770

make: Leaving directory '/home/maribu/Repos/software/RIOT/tests/thread_float'
```

(~~Note: The idle thread exiting is something that should never occur. I guess the culprit may be `cpu_switch_context_exit()` messing things up when the main thread exits. But that is not directly related to what this PR aims to fix. Adding a `thread_sleep()` at the end of `main()` does indeed prevent the idle thread from exiting.~~
Update: That's expected. The idle thread stats are printed on exit of the main thread, the idle thread does not actually exit.)

### Issues/PRs references

Fixes #16908 maybe?

19031: cpu/stm32/periph_timer: implement timer_set() r=benpicco a=maribu

### Contribution description

The fallback implementation of timer_set() in `drivers/periph_common` is known to fail on short relative sets. This adds a robust implementation.

### Testing procedure

Run `tests/periph_timer_short_relative_set` at least a few dozen times (or use #19030 to have a few dozen repetitions of the test case in a single run of the test application). It should now succeed.

### Issues/PRs references

None

Co-authored-by: Marian Buschsieweke <marian.buschsieweke@ovgu.de>
@bors
Copy link
Contributor

bors bot commented Jan 4, 2023

Build failed (retrying...):

@bors
Copy link
Contributor

bors bot commented Jan 4, 2023

Build succeeded:

@bors bors bot merged commit 9a45f4b into RIOT-OS:master Jan 4, 2023
@maribu maribu deleted the tests/thread_float branch January 4, 2023 19:29
@maribu
Copy link
Member Author

maribu commented Jan 4, 2023

Thx :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: tests Area: tests and testing framework CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Process: needs backport Integration Process: The PR is required to be backported to a release or feature branch Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

tests/thread_float: crashes on avr-rss2
4 participants