@opsiff opsiff commented Dec 25, 2025

Summary by Sourcery

Bug Fixes:

  • Ensure yielding tasks forfeit remaining vruntime and advance their deadline only when eligible to avoid yield loops and subsequent long scheduling delays.

stable inclusion
from stable-v6.12.63
category: bugfix

[ Upstream commit 79104becf42baeeb4a3f2b106f954b9fc7c10a3c ]

If a task yields, the scheduler may decide to pick it again. The task in
turn may decide to yield immediately or shortly after, leading to a tight
loop of yields.

If there's another runnable task at this point, the deadline will be
increased by the slice at each loop. This can cause the deadline to run
away pretty quickly, and lead to elevated run delays later on as the task
doesn't get picked again. The reason the scheduler can pick the same task
again and again despite its deadline increasing is that it may be the
only eligible task at that point.

Fix this by making the task forfeit its remaining vruntime and pushing
the deadline one slice ahead. This implements yield behavior more
authentically.

We limit the forfeiting to eligible tasks. This is because core scheduling
prefers running ineligible tasks rather than force idling. As such, without
the condition, we can end up in a yield loop which makes the vruntime
increase rapidly, leading to anomalous run delays later down the line.

Fixes: 147f3ef ("sched/fair: Implement an EEVDF-like scheduling policy")
Signed-off-by: Fernand Sieber <sieberf@amazon.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250401123622.584018-1-sieberf@amazon.com
Link: https://lore.kernel.org/r/20250911095113.203439-1-sieberf@amazon.com
Link: https://lore.kernel.org/r/20250916140228.452231-1-sieberf@amazon.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit d5843e1530d8d4ebf2b34f0185d45828848f107a)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
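
The yield loop described in the commit message can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions, not the kernel code: `struct toy_se`, `toy_eligible`, `toy_pick`, `toy_yield_old`, `toy_yield_new` and `toy_yield_loop` are hypothetical names, all entities are assumed to have equal weight (so the queue average is a plain mean), the yielding entity is assumed to consume no runtime between yields, and times are arbitrary microsecond values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a scheduling entity; all times in microseconds. */
struct toy_se {
    int64_t vruntime;
    int64_t deadline;
    int64_t slice;
};

/* Assumption: equal weights, so "eligible" means vruntime <= mean vruntime. */
static bool toy_eligible(const struct toy_se *q, size_t n, size_t idx)
{
    int64_t sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += q[i].vruntime;
    /* vruntime <= sum / n, written without the division */
    return q[idx].vruntime * (int64_t)n <= sum;
}

/* Pick the eligible entity with the earliest deadline. */
static size_t toy_pick(const struct toy_se *q, size_t n)
{
    size_t best = n;

    for (size_t i = 0; i < n; i++) {
        if (!toy_eligible(q, n, i))
            continue;
        if (best == n || q[i].deadline < q[best].deadline)
            best = i;
    }
    return best;
}

/* Pre-fix yield as the commit message describes it: only the deadline moves. */
static void toy_yield_old(struct toy_se *se)
{
    se->deadline += se->slice;
}

/* Post-fix yield: forfeit up to the deadline, but only when eligible. */
static void toy_yield_new(struct toy_se *q, size_t n, size_t idx)
{
    if (!toy_eligible(q, n, idx))
        return;
    q[idx].vruntime = q[idx].deadline;
    q[idx].deadline += q[idx].slice;
}

/*
 * Entity 0 yields every time it is picked and consumes no runtime.
 * Returns how many times in a row it gets picked, capped at max_rounds.
 */
static int toy_yield_loop(struct toy_se *q, size_t n, bool fixed, int max_rounds)
{
    int picks = 0;

    while (picks < max_rounds && toy_pick(q, n) == 0) {
        if (fixed)
            toy_yield_new(q, n, 0);
        else
            toy_yield_old(&q[0]);
        picks++;
    }
    return picks;
}
```

With a yielder at vruntime 0 and a second entity at vruntime 5000 (both with a 3000 slice), the pre-fix variant keeps picking the yielder until the round cap is hit while its deadline grows by one slice per round. In the post-fix variant the yielder becomes ineligible after two yields, so the other entity is picked.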
sourcery-ai bot commented Dec 25, 2025

Reviewer's Guide

Ports the upstream fix for CFS yield behavior so that yielding tasks forfeit remaining vruntime and advance deadline only when eligible, preventing runaway deadlines and anomalous scheduling delays under EEVDF-like scheduling.

Sequence diagram for CFS yield loop prevention

sequenceDiagram
  actor Task
  participant Scheduler
  participant CFSRunQueue as CFS_run_queue
  participant SchedEntity as Sched_entity

  Task->>Scheduler: sys_sched_yield
  Scheduler->>CFSRunQueue: lock_rq_and_skip_clock_update
  Scheduler->>SchedEntity: yield_task_fair
  SchedEntity->>CFSRunQueue: entity_eligible cfs_rq se ?
  alt entity is not eligible
    SchedEntity-->>Scheduler: no_vruntime_change
  else entity is eligible
    SchedEntity->>SchedEntity: se_vruntime = se_deadline
    SchedEntity->>SchedEntity: se_deadline += calc_delta_fair se_slice se
    SchedEntity->>CFSRunQueue: update_min_vruntime cfs_rq
    SchedEntity-->>Scheduler: vruntime_forfeited_and_deadline_advanced
  end
  Scheduler->>Scheduler: select_next_entity_to_run
  Scheduler-->>Task: may_or_may_not_pick_same_task_based_on_new_deadline

Flow diagram for updated yield_task_fair behavior

flowchart TD
  A[rq_lock_acquired_and_rq_clock_skip_update] --> B[call_yield_task_fair]
  B --> C[entity_eligible cfs_rq se ?]
  C -->|no| D[skip_vruntime_forfeit_and_deadline_update]
  D --> E[return_from_yield_task_fair]
  C -->|yes| F[set se_vruntime_to_se_deadline]
  F --> G[advance_se_deadline_by_calc_delta_fair se_slice se]
  G --> H[update_min_vruntime cfs_rq]
  H --> E[return_from_yield_task_fair]

File-Level Changes

Change: Adjust CFS yield behavior so yielding entities forfeit remaining vruntime and advance deadline only when eligible, updating min_vruntime accordingly.
Details:
  • Wrap deadline and vruntime adjustment in an entity_eligible check to avoid penalizing ineligible entities in core scheduling scenarios.
  • On eligible yield, set vruntime to the current deadline and then advance the deadline by one slice via calc_delta_fair.
  • Call update_min_vruntime after modifying vruntime/deadline to keep the runqueue’s minimum vruntime consistent.
Files: kernel/sched/fair.c
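
The "one slice via calc_delta_fair" step means the deadline advances by the slice converted into virtual time, which depends on the entity's weight. The following is a rough user-space sketch of that arithmetic, not the kernel implementation: `struct weighted_se`, `weighted_calc_delta_fair`, `weighted_yield` and `TOY_NICE_0_WEIGHT` are hypothetical names, the scaling is assumed to be a plain `delta * 1024 / weight`, eligibility is passed in by the caller rather than computed, and the `update_min_vruntime` step is left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NICE_0_WEIGHT 1024 /* assumed reference weight for this sketch */

/* Toy entity with a weight; times are in arbitrary units. */
struct weighted_se {
    uint64_t vruntime;
    uint64_t deadline;
    uint64_t slice;  /* wall-clock slice */
    uint64_t weight;
};

/* Simplified stand-in for calc_delta_fair: wall-clock delta to virtual time. */
static uint64_t weighted_calc_delta_fair(uint64_t delta, const struct weighted_se *se)
{
    if (se->weight == TOY_NICE_0_WEIGHT)
        return delta;
    return delta * TOY_NICE_0_WEIGHT / se->weight;
}

/*
 * Eligibility-gated yield. Returns true if the entity was changed.
 * An ineligible entity keeps both vruntime and deadline untouched.
 */
static bool weighted_yield(struct weighted_se *se, bool eligible)
{
    if (!eligible)
        return false;
    se->vruntime = se->deadline;
    se->deadline += weighted_calc_delta_fair(se->slice, se);
    return true;
}
```

In this sketch an entity with twice the reference weight has its deadline pushed out by half of its wall-clock slice, since its virtual time advances more slowly.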

deepin pr auto review

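I will analyze this code change in detail:

  1. Syntax and logic:
  • The new code is syntactically correct and follows the Linux kernel coding style.
  • Comments were added where needed to explain the logic.
  • The conditional is clearly structured and uses entity_eligible() for the check.
  2. Code quality:
  • Strengths:
    • Detailed comments explain the reason for and purpose of the change.
    • The added condition check makes the code more robust.
    • update_min_vruntime() is called to maintain the run queue's minimum virtual runtime.
  • Suggestion:
    • Consider logging the result of the entity_eligible() check to aid debugging.
  3. Performance:
  • The new condition check adds a slight overhead, but that overhead is necessary.
  • The update_min_vruntime() call updates the run queue's minimum virtual runtime; this operation is O(1), so the performance impact is small.
  • Overall, the performance impact is acceptable.
  4. Safety:
  • The change improves system stability in core scheduling scenarios.
  • It prevents a possible endless loop (a yielding task being picked repeatedly and then yielding again immediately).
  • The condition check ensures that only eligible tasks have their vruntime updated, avoiding unfair scheduling.
  5. Specific suggestions:
  • Consider adding more explanation of the core scheduling scenario to the comment.
  • Consider adding appropriate error handling in case update_min_vruntime() fails.
  • Consider adding performance monitoring points to track the impact of this change on system performance.

This change mainly addresses a problem that can arise in core scheduling scenarios. By adding a condition check and an appropriate vruntime update, it improves system stability and fairness. The change is reasonable and necessary.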

@sourcery-ai sourcery-ai bot left a comment

Hey - I've left some high level feedback:

  • Consider expanding the commit message or in-code comment to clarify why skipping the deadline update entirely for ineligible entities is correct (i.e., why we don't need to advance their deadline at all on yield), since that subtlety is easy to misinterpret when reasoning about fairness.
  • It may be worth mentioning in the comment what happens to se->vruntime when entity_eligible() is false (i.e., that we intentionally keep both vruntime and deadline unchanged) to make the behavior in core-sched scenarios more explicit for future readers.
@Avenger-285714

/approve

Copilot AI left a comment

Pull request overview

This PR backports an upstream scheduler fix that prevents virtual runtime (vruntime) runaway when tasks repeatedly yield. The fix makes yielding tasks forfeit their remaining vruntime and advance their deadline, but only when the task is eligible, preventing a yield loop scenario in core scheduling configurations.

  • Adds eligibility check before vruntime forfeiture to prevent yield loops
  • Sets vruntime to the current deadline and advances the deadline by one slice for eligible tasks
  • Updates min_vruntime after modifying the current task's vruntime

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Avenger-285714
