Skip to content

Conversation

@wangchdo
Copy link
Contributor

@wangchdo wangchdo commented Oct 21, 2025

This PR includes three commits:

commit 1:
In the current scheduler implementation, nxsched_ticksleep()
sets the current task state to TSTATE_WAIT_SIG, which is not
semantically appropriate. A dedicated TSTATE_SLEEPING state
is more suitable for this purpose.

Additionally, list_sleepingtasks() is introduced to replace
list_waitingforsignal() in the nxsched_ticksleep() implementation,
to clearly distinguish tasks sleeping for timeout from those
waiting for signals.

commit 2:
Add a new function nxsched_wakeup() to the scheduler,
which allows waking up a sleeping task before its timeout.

commit 3:
Refactor nxsched_timeout() to use nxsched_wakeup() in order to
eliminate code duplication and improve maintainability

Summary

Improved implementation of nxsched_ticksleep() and introduced new function nxsched_wakeup()

Impact

Improvement of newly added function nxsched_ticksleep() and introduced a new function nxsched_wakeup()

No impact to other nuttx functions

Testing

PR 17204 has already replace all Signal-based sleep implement to Scheduled sleep, So CI will verify this PR's improvement of nxsched_ticksleep() is OK

also,

on the one hand:
sleep related api test were added on sim/nsh, sabre-6quad/smp: nxsched_sleep() / nxsched_usleep() / nxsched_msleep() / nxsched_ticksleep()

on the other hand:
sched/event implementation has been modified to use sleep/wakeup pair, and the ostest including event test cases passed on board a2g-tc397-5v-tft:

image

@github-actions github-actions bot added Area: OS Components OS Components issues Size: M The size of the change in this PR is medium labels Oct 21, 2025
@linguini1
Copy link
Contributor

I think this needs to be tested with more than just CI. It is a code change that needs runtime testing, not just building.

Also, why would we want to wake up a sleeping task before its timeout? Maybe I'm unfamiliar with the semantics of sleeping, but isn't it a requirement that the caller is never woken up prior to the timeout? Only on or after?

@linguini1 linguini1 requested a review from acassis October 21, 2025 13:18
@wangchdo
Copy link
Contributor Author

wangchdo commented Oct 21, 2025

I think this needs to be tested with more than just CI. It is a code change that needs runtime testing, not just building.

Also, why would we want to wake up a sleeping task before its timeout? Maybe I'm unfamiliar with the semantics of sleeping, but isn't it a requirement that the caller is never woken up prior to the timeout? Only on or after?

Sleep and wakeup sometimes should be in pair in terms of syncronization

For example task1 can not move on because some condition is not met and choose to sleep for a while in the simplest case, but before timeout task2 which has lower priority than task1 met the condition, then it can wakeup task1 to let task1 preempt it

sleep/wakeup is simpler and lighter than mutex, semaphore or event in this case

@wangchdo
Copy link
Contributor Author

I think this needs to be tested with more than just CI. It is a code change that needs runtime testing, not just building.

Also, why would we want to wake up a sleeping task before its timeout? Maybe I'm unfamiliar with the semantics of sleeping, but isn't it a requirement that the caller is never woken up prior to the timeout? Only on or after?

I will add test it and put the results in this PR later

@linguini1
Copy link
Contributor

linguini1 commented Oct 21, 2025

Sleep and wakeup sometimes should be in pair in terms of syncronization

For example task1 can not move on because some condition is not met and choose to sleep for a while in the simplest case, but before timeout task2 which has lower priority than task1 met the condition, then it can wakeup task1 to let task1 preempt it

sleep/wakeup is simpler and lighter than mutex, semaphore or event in this case

But this isn't what sleeping is for? If you want to wait for a condition, you block on a semaphore/mutex, not sleep (unless it's some hardware polling condition that requires sleeping).

I personally don't support the addition of the wakeup function since now all the up_delay have been replaced with this new implementation, and up_delay would never return before the timeout. We're changing the behavior all over the code. This is a sleep function, not for waiting on a condition, just sleeping. I don't see a use case this is needed either, just use a semaphore.

@linguini1
Copy link
Contributor

I personally don't support the addition of the wakeup function since now all the up_delay have been replaced with this new implementation, and up_delay would never return before the timeout. We're changing the behavior all over the code. This is a sleep function, not for waiting on a condition, just sleeping. I don't see a use case this is needed either, just use a semaphore.

I see now that even the previous implementation of nxsched_sleep could wake up the task early since it was implemented as waiting for a signal. So I do agree in that sense that this PR wouldn't necessarily change that behaviour. Although, I still think it shouldn't be possible to wake up a sleeping task before its timeout since that behaviour is not intuitive and doesn't seem correct for a delay/sleep function.

@wangchdo
Copy link
Contributor Author

Sleep and wakeup sometimes should be in pair in terms of syncronization
For example task1 can not move on because some condition is not met and choose to sleep for a while in the simplest case, but before timeout task2 which has lower priority than task1 met the condition, then it can wakeup task1 to let task1 preempt it
sleep/wakeup is simpler and lighter than mutex, semaphore or event in this case

But this isn't what sleeping is for? If you want to wait for a condition, you block on a semaphore/mutex, not sleep (unless it's some hardware pulling condition that requires sleeping).

I personally don't support the addition of the wakeup function since now all the up_delay have been replaced with this new implementation, and up_delay would never return before the timeout. We're changing the behavior all over the code. This is a sleep function, not for waiting on a condition, just sleeping. I don't see a use case this is needed either, just use a semaphore.

HI @linguini1

I used this sleep/wakeup pair to improve event implementation in #17223
could you please help to review?

@wangchdo wangchdo force-pushed the sched_sleeping_improve branch from c24f03d to aae81a3 Compare October 22, 2025 01:19
TSTATE_TASK_INACTIVE, /* BLOCKED - Initialized but not yet activated */
TSTATE_SLEEPING, /* BLOCKED - Waiting for wakeup or timeout */
TSTATE_WAIT_SEM, /* BLOCKED - Waiting for a semaphore */
TSTATE_WAIT_SIG, /* BLOCKED - Waiting for a signal */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reusing the signal list is actually a tricky implementation.

  1. It doesn't add any additional code size or new system state.
  2. Threads already sleeping via a signal are not affected, as kernel api won't enter cancellation points.
  3. Timeouts will call the corresponding timeout function, without any side effects.
    But I don't object to adding new status.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but do you think a separate list can save the api time for signal or sleep operations?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Sleep/signal is a niche use case. Application design should prioritize event triggering over timeouts.
  2. This change will maintain the same performance as before, as both use signal lists.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

restored the implementation using signal list

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may have missed my comment. I'm not against adding a new queue, but you restored, just stick with the current base.

Copy link
Contributor Author

@wangchdo wangchdo Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think call stack is enough in this case: the running taskB want to know the reason why taskA is blocked, even though this case is hardly needed in real world.

Anyway I agree that users may just care whether the task is running or waiting, but I think kernel need to separate which object the task is waiting on so the kernel can do things differently according to this, please check the code below, I think this is the main reason nuttx now have different wait state : TSTATE_WAIT_SEM/TSTATE_WAIT_SIG/TSTATE_WAIT_MQNOTEMPTY/TSTATE_WAIT_MQNOTFULL:

BTW: I have planed to update the code below for the new task state: TSTATE_SLEEPING

image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think call stack is enough in this case: the running taskB want to know the reason why taskA is blocked, even though this case is hardly needed in real world.

The backtrace will show you whether the thread is blocking in xxx_sleep or sig_xxx.

Anyway I agree that users may just care whether the task is running or waiting, but I think kernel need to separate which object the task is waiting on so the kernel can do things differently according to this,

Another problem is that each wait state has an entry in g_tasklisttable, but it's strange that the scheduler never modify TSTATE_SLEEPING list.

please check the code below, I think this is the main reason nuttx now have different wait state (I have planed to update the code below for the new task state: TSTATE_SLEEPING ): TSTATE_WAIT_SEM/TSTATE_WAIT_SIG/TSTATE_WAIT_MQNOTEMPTY/TSTATE_WAIT_MQNOTFULL:

The origin code distinguish the wait state between semaphore, queue, etc..., because the scheduler need handle the wait list in the different way. But, we can reuse the signal wait list because sleep is a special case of wait signal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @xiaoxiang781216

I updated this patch and and chose to #define TSTATE_SLEEPING TSTATE_WAIT_SIG

Do you think this is OK?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks a good idea

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CI check passed, Do you think this PR can be merged? @xiaoxiang781216 @anchao

    In the current scheduler implementation, nxsched_ticksleep()
    sets the current task state to TSTATE_WAIT_SIG, which is not
    semantically appropriate. A dedicated TSTATE_SLEEPING state
    is more suitable for this purpose.

Signed-off-by: Chengdong Wang <wangchengdong@lixiang.com>
  Add a new function nxsched_wakeup() to the scheduler,
  which allows waking up a sleeping task before its timeout.

Signed-off-by: Chengdong Wang wangchengdong@lixiang.com
@wangchdo wangchdo force-pushed the sched_sleeping_improve branch from aae81a3 to 8b77e97 Compare October 22, 2025 14:21
@wangchdo wangchdo requested a review from anchao October 22, 2025 15:23
anchao
anchao previously approved these changes Oct 22, 2025
    Refactor nxsched_timeout() to use nxsched_wakeup() in order to
    eliminate code duplication and improve maintainability.

Signed-off-by: Chengdong Wang wangchengdong@lixiang.com
@wangchdo wangchdo force-pushed the sched_sleeping_improve branch from 4d85777 to 17edd89 Compare October 23, 2025 10:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: OS Components OS Components issues Size: M The size of the change in this PR is medium

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants