Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sys/wait: add module for backend agnostic waits #19719

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

maribu
Copy link
Member

@maribu maribu commented Jun 8, 2023

Contribution description

This adds to sets of delay functions. The first kind consists of

  • wait_at_least_us(): delay execution in units of microseconds
  • wait_at_least_ms(): delay execution in units of milliseconds
  • wait_at_least_s(): delay execution in units of seconds

These will use ztimer as backend, if used anyways. Otherwise a trivial CPU counting loop is used.

The second kind consists of

  • wait_us_spinning(): busy-wait for a given number of microseconds

The benefit of this compared to ztimer_spin() is that for AVR a platform specific implementation is used that is more accurate. It allows writing drivers like the DHT11/DHT2x driver without platform specific code. Other low end platforms can follow suit and also add more accurate waiting code.

Testing procedure

A test application has been added with a python script doing automatic testing. In addition, that test app performs short pulses on the LED that should be visible via a scope / logic analyzer. Finally, it will estimate the CPU cycles required per loop iteration that can be used to fine-tune the loop.

Issues/PRs references

Deprecates: #17330

Useful for: #19718

@maribu maribu added the State: WIP State: The PR is still work-in-progress and its code is not in its final presentable form yet label Jun 8, 2023
@github-actions github-actions bot added Area: cpu Area: CPU/MCU ports Area: sys Area: System Area: timers Area: timer subsystems Platform: ARM Platform: This PR/issue effects ARM-based platforms Platform: AVR Platform: This PR/issue effects AVR-based platforms Platform: ESP Platform: This PR/issue effects ESP-based platforms Platform: MSP Platform: This PR/issue effects MSP-based platforms Platform: RISC-V Platform: This PR/issue effects RISC-V-based platforms labels Jun 8, 2023
@kaspar030
Copy link
Contributor

Again, I don't think that delay_us() should fall back to msec.

@maribu
Copy link
Member Author

maribu commented Jun 8, 2023

I dropped delay_us()

@kaspar030
Copy link
Contributor

I dropped delay_us()

I think that would be useful. Can't it just use ZTIMER_USEC?

"Those are trivial wrappers over ztimer_sleep() that just use the most appropriate clock that is used anyway. It gives up control over the most suitable trade-off between resolution, accuracy (clock drift) and power consumption for some flexibility / ROM size."

Repeating some arguments from #17330 ...

Didn't If they are "trivial" wrappers, make them use the corresponding clock. delay_us(N) -> ztimer_sleep(ZTIMER_USEC, N) etc.

Making this depending on which module is already used is awful, as adding modules to an already tested part of an application might add more timer modules, making the behavior change. That is especially bad for delay_s(), which might more often be used in human scale small numbers.

Like, "delay_s(2)" is +0..2seconds when using ZTIMER_SEC, +0..2 milliseconds when using ZTIMER_MSEC.

So for the timer-based delay functions, why not just:

delay_us(N) -> ztimer_sleep(ZTIMER_USEC, N)
delay_ms(N) -> ztimer_sleep(ZTIMER_MSEC, N)
delay_s(N) -> _ztimer_sleep_scale_up(ZTIMER_MSEC, secs, MS_PER_SEC) (as ZTIMER_SEC is usually implemented using ZTIMER_MSEC anyways)

Alternatively, document clearly that the delays are +0..2, and where that matters, the higher resolution delay should be used.

But please, don't make potentially hand-tuned timings change just because adding USEMODULE += some_other_module pulls in ztimer_msec.

*/
static inline void delay_ms(uint32_t msecs)
{
if (IS_USED(MODULE_ZTIMER_MSEC)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (IS_USED(MODULE_ZTIMER_MSEC)) {
if (IS_USED(MODULE_ZTIMER_MSEC) && msec > 10) {

jitter < 10%

Copy link
Member Author

@maribu maribu Jun 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would trade ROM for increased accuracy. But the whole point of this API is to trade in accuracy to safe ROM / dependencies.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • ztimer_msec is not able to sleep for 1 ms ( a value of 1 is 0 usec to 2000 usec).
  • there will not be any ROM usage for compiletime constant delays i would such a functions be used for (static inline)

*/
static inline void delay_s(uint32_t secs)
{
if (IS_USED(MODULE_ZTIMER_SEC)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (IS_USED(MODULE_ZTIMER_SEC)) {
if (IS_USED(MODULE_ZTIMER_SEC) && secs > 10) {

sys/include/ztimer.h Outdated Show resolved Hide resolved
Comment on lines 12 to 20
*
* # Motivation
*
* @ref sys_ztimer already provides @ref ztimer_sleep to allow delaying the
* execution of the running thread. While @ref ztimer_sleep allows via the
* selection of the clock to select the best fitting trade-off between
* resolution, accuracy (clock drift), power consumption and so on, it is
* difficult to write code with flexible requirements that just picks whatever
* clock is available.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding a paragraph on top here about what this module is supposed to do instead of having to derive it of what ztimer is not able to do. Then maybe include also a bit on when to pick this over ztimer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@kaspar030
Copy link
Contributor

 * For use cases such as waiting 1 ms for a device to reset during start up it
 * matters little if the delay ends up being 1 ms, 2 ms or even 10 ms. 

What's wrong with just using ZTIMER_MSEC for that?

@maribu
Copy link
Member Author

maribu commented Jun 9, 2023

falling back to ZTIMER_USEC is not acceptable

Why?

To be honest, I would be in favor of falling back to ZTIMER_USEC. But I'd rather have this in without falling back to ZTIMER_USEC then having PR going stale. (You could do a follow up PR, though 😉)

Comment on lines 154 to 183
if (IS_USED(MODULE_ZTIMER_MSEC)) {
ztimer_sleep(ZTIMER_MSEC, msecs);
}
else {
uint32_t iterations = msecs * (uint64_t)coreclk() / MS_PER_SEC + 1;
wait_busy_loop(iterations);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure that's better than a fallback to ZTIMER_USEC?
And why no wait_at_least_us()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure that's better than a fallback to ZTIMER_USEC?

No, quite the opposite: I think falling back to ZTIMER_USEC is indeed better before falling back to the busy loop.

But I'd rather not have this PR being deadlocked. Especially since the DHT driver could profit from this getting upstream soonish.

Let's discuss this in the next sprint meeting. Maybe some agreement can be reached.

And why no wait_at_least_us()?

That would not be allowed to fall back to ZTIMER_MSEC. But now that there is a CPU loop to fall back to, it would be indeed more than an alias for ztimer_sleep(ZTIMER_USEC, ...). I'll add this again.

@github-actions github-actions bot added the Area: examples Area: Example Applications label Jun 9, 2023
@maribu maribu added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed State: WIP State: The PR is still work-in-progress and its code is not in its final presentable form yet labels Jun 9, 2023
@maribu maribu changed the title sys/delay: Add new convenience module sys/wait: add module for backend agnostic waits Jun 9, 2023
@riot-ci
Copy link

riot-ci commented Jun 9, 2023

Murdock results

FAILED

225c5dd examples/blinky: make use of sys/wait

Success Failures Total Runtime
5551 0 6947 08m:18s

Artifacts

@github-actions github-actions bot added the Platform: native Platform: This PR/issue effects the native platform label Jun 9, 2023
@benpicco
Copy link
Contributor

benpicco commented Jun 12, 2023

I still don't see any argument against using ZTIMER_USEC, where was that brought up?
Maybe we should just add the spinning fallback to xtimer_{u,m}sleep() and use those through the compat API wrapper where we retcon xtimer to mean any timer.

* @param[in] scale Each tick in @p time corresponds to @p scale
* ticks of @p clock
*/
void ztimer_sleep_scale_up(ztimer_clock_t *clock, uint32_t time, uint32_t scale);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is duplicated

@kaspar030
Copy link
Contributor

@maribu Maybe splilt out the more accurate arch specific spinning?

The function is useful for other modules as well. It is however marked
as internal, so that application developers do not expect it to be
a stable, public API.
@maribu
Copy link
Member Author

maribu commented Jun 12, 2023

@maribu Maybe splilt out the more accurate arch specific spinning?

No need anymore to put that on a fast track. The DHT driver rewrite uses a different approach now that no longer requires ztimer_spin() anyway and also works on AVR as well.

And since there is no need to rush anyway, I'll re-add falling back to a different ztimer backend.

This adds to sets of delay functions. The first kind consists of

- `wait_us()`: delay execution in units of microseconds
- `wait_ms()`: delay execution in units of milliseconds
- `wait_s()`: delay execution in units of seconds

These will use `ztimer` as backend, if used anyways. Otherwise a
trivial CPU counting loop is used.

The second kind consists of

- `wait_us_spinning()`: busy-wait for a given number of microseconds

The benefit of this compared to `ztimer_spin()` is that platform
specific implementation can be used to increase accuracy. This can
allow timing critical drivers (e.g. bit-banging with accuracy
requirements with in the order of a few microseconds) to scale down
to slow MCUs.
Comment on lines +28 to +39
#ifdef __cplusplus
extern "C" {
#endif
#ifndef DOXYGEN

/* ARM7 uses ztimer_spin for busy wait */

#endif

#ifdef __cplusplus
}
#endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't have to provide this dummy file if you use

#if __has_include("wait_arch.h")
#include "wait_arch.h"
#endif 

* a two instruction delay loop indeed takes only one cycle per
* iteration once the dynamic branch prediction converges.
*/
#define CPU_CYCLES_PER_WAIT_LOOP_ITERATION 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't used anywhere

Comment on lines +152 to +154
if (IS_USED(MODULE_ZTIMER_USEC)) {
ztimer_sleep(ZTIMER_USEC, usecs);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

Suggested change
if (IS_USED(MODULE_ZTIMER_USEC)) {
ztimer_sleep(ZTIMER_USEC, usecs);
}
if (IS_USED(MODULE_ZTIMER_USEC)) {
ztimer_sleep(ZTIMER_USEC, usecs);
} else if (IS_USED(MODULE_ZTIMER_MSEC)) {
ztimer_sleep(ZTIMER_MSEC, DIV_ROUND_UP(usecs, US_PER_MS);
}

Comment on lines +152 to +154
if (IS_USED(MODULE_ZTIMER_USEC)) {
ztimer_sleep(ZTIMER_USEC, usecs);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

Suggested change
if (IS_USED(MODULE_ZTIMER_USEC)) {
ztimer_sleep(ZTIMER_USEC, usecs);
}
if (IS_USED(MODULE_ZTIMER_USEC)) {
ztimer_sleep(ZTIMER_USEC, usecs);
} else if (IS_USED(MODULE_ZTIMER_MSEC)) {
ztimer_sleep(ZTIMER_MSEC, DIV_ROUND_UP(usecs, US_PER_MS);
}

@kaspar030
Copy link
Contributor

I still don't see any argument against using ZTIMER_USEC, where was that brought up?
Maybe we should just add the spinning fallback to xtimer_{u,m}sleep() and use those through the compat API wrapper where we retcon _x_timer to mean any timer.

So my main gripe with this PR is the vagueness of the API.

Our "delay the current process for a while" API is ztimer. It allows arbitrary backends to be chosen. There are a couple of backends that the build system tries hard to find an implementation for (usec, msec, sec). There are multiple options for each backend (msec -> rtt, msec->usec, ...). ztimer is quantized on the clock ticks, and provides at least semantics with +0-2 guarantees. It's unsuitable for very short sleeps (<5us - <100us, depending on hardware) due to being interrupt based (meaning a context switch is always involved).

So ztimer provides the functionality that this API wraps.
For "wait at least N ms (N >2)", just use ztimer_msec() and be done. If the application is interested in low power, the user will have configured ztimer_msec to use a low-power clock.
For "wait at least N us (with N > 5), just use ztimer_usec and be done with it. Won't be low power.

"but now there's a ztimer dependency". Well. Let's make sure to get one ztimer clock that suits most needs (like, ztimer_msec), and use that as much as possible. Chances are, any sufficiently large application will use ztimer_msec somewhere. Any sufficiently minimal application won't feel the ram/rom cost.

So maybe typing ztimer_sleep(ZTIMER_MSEC, N) gets old, so having delay/wait/sleep_ms(N) *mapping to ztimer_sleep(ZTIMER_MSEC, N) does make some sense.

Why not chose any available backend? Because the call site suddenly doesn't clearly state what the function call is doing. We're basically introducing ztimer_sleep_or_maybe_spin_maybe_lowpower_with_maybe_+0..2_us_or_maybe_+0..2_ms_accuracy().
The API has uint32_t value range, so a call to wait_ms() might spin for 49 days. Or just set the thread to sleep and by extension the whole MCU. Depending on the dependency tree.

What way of "waiting" is actually used is already quite muddy when using ZTIMER_M/USEC, as there's already complex heuristics and configuration at play. This layer of abstraction adds even more muddyness. The tree of used waiting mechanisms (and watering of guarantees) grows to a forest.

For the use case most stated ("driver needs to wait a bit for the device to be ready, a bit longer doesn't matter"), if spinning is acceptable, that must be assumed to actually be used in any case, because if spinning would not be acceptable, use of wait_ms() in its current form is a bug. Because depending on the dependencies, spinning might happen. So we must be able to replace all ẁait_ms() with spin_ms() without breakage (or unacceptable behavior), because that happens implicitly if spin_ms() is one of the fallback methods. But if spin_ms() is "acceptable behavior", why not just use that? -> The doxygen states "if the application developer enabled a low power clock anyway, this can be taken as a hint that low power sleeps are preferable". That is already being dealt with at ztimer_msec level.
The same argumentation goes with low power or accuracy.
the worst incarnation of all possible outcomes (of what wait_ms...() actually calls) must be acceptable everywhere that function is used, and must be tested. (This is a high level version of "possible UB makes a case impossible so let's drop this code").

Is ztimer too fat still (implementation wise)? Is that a sentiment, or an actual issue? Because RAM/ROM use would be the only reason for not just using ztimer sleeps.

My suggestion here is:

  • provide delay_ms/us(). Map directly to the corresponding ztimer clock. Document that it quantizes (delay_ms/us(N) starts in the middle of a clock tick) and is generally +0..2 ticks.
  • provide "spin_us(uint8_t us)" for bit-banging needs (using this PR's per-CPU implementations)
  • replace the ztimer sleeps system wide with the corresponding delay calls
  • create a nice overview page that answers the question on which sleep function to use with delay_ms(N).
  • overall, simplify the amount of user-facing APIs in place, and the amount of dependency tree dependent behaviors
  • accept doing USEMODULE += delay_ms() even for simple drivers that need it
  • accept doing spin_us() in simple driver's boot up one time initialization

Otherwise, prove that using ztimer direct wrapping (with no fallbacks) is causing too many RAM/ROM issues in places where the delay_ms() couldn't be replaced with spin_us() without causing otherwise unacceptable behavior.

@kaspar030
Copy link
Contributor

Because RAM/ROM use would be the only reason for not just using ztimer sleeps.

We had some productive discussion at today's Friday meeting. Karl brought up two more aspects:

  • while ztimer has "at least" semantics, it actually also advertizes "+-1 tick systemic inaccuracy". Meaning, to get actual "at least", the user needs to add one to the desired period.

  • also, the "low resource AVR" use case was discussed, where the MCU might not have the means to pull in any ztimer
    This could be mitigated by a simplified alternative spinning ztimer implementation:

enum {
  ZTIMER_USEC,
  ZTIMER_MSEC,
  ZTIMER_SEC
};

ztimer_sleep(clock, ticks) {
  switch clock {
     ZTIMER_USEC: {
       spin_us(ticks)}
       break;
       ZTIMER_MSEC: {
         for (unsigned i = 0; i < 1000, i++) {
           ztimer_sleep(ZTIMER_USEC, ticks);
           }
           break;
    ...
    }
}

(or sth smarter)

@kfessel
Copy link
Contributor

kfessel commented Jun 16, 2023

A spinning implementation in ztimer_sleep isn't helpfull for the read ablity of ztimer_sleep, which should just wrap a ztimer_set and required data (a ztimer_t and a mutex) to make it work (we should a void having any alternative there)

that is the reason to put it somewhere else, keep ztimer_sleep simple

not having an implicit spin is one of the big advancements of ztimer over xtimer (the other one is not having a absolute time interface both these features keep ztimer simple and able to perform ist power saving)

static inline void wait_at_least_ms(uint16_t msecs)
{
if (IS_USED(MODULE_ZTIMER_MSEC)) {
ztimer_sleep(ZTIMER_MSEC, msecs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to add 1 for at least semantic

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? There is no division here.

static inline void wait_at_least_us(uint16_t usecs)
{
if (IS_USED(MODULE_ZTIMER_USEC)) {
ztimer_sleep(ZTIMER_USEC, usecs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add 1 for at least semantic

}
else {
uint32_t iterations = msecs * (uint64_t)coreclk() / MS_PER_SEC;
wait_busy_loop(iterations);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
wait_busy_loop(iterations);
busy_wait_us((uint32_t)msecs * US_PER_MS);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: build system Area: Build system Area: cpu Area: CPU/MCU ports Area: examples Area: Example Applications Area: sys Area: System Area: tests Area: tests and testing framework Area: timers Area: timer subsystems CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Platform: ARM Platform: This PR/issue effects ARM-based platforms Platform: AVR Platform: This PR/issue effects AVR-based platforms Platform: ESP Platform: This PR/issue effects ESP-based platforms Platform: MSP Platform: This PR/issue effects MSP-based platforms Platform: native Platform: This PR/issue effects the native platform Platform: RISC-V Platform: This PR/issue effects RISC-V-based platforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants