Skip to content

Commit

Permalink
timers/migration: Fix endless timer requeue after idle interrupts
Browse files Browse the repository at this point in the history
When a CPU is an idle migrator, but another CPU wakes up before it,
becomes an active migrator and handles the queue, the initial idle
migrator may end up endlessly reprogramming its clockevent, chasing ghost
timers forever such as in the following scenario:

               [GRP0:0]
             migrator = 0
             active   = 0
             nextevt  = T1
              /         \
             0           1
          active        idle (T1)

0) CPU 1 is idle and has a timer queued (T1), CPU 0 is active and is
the active migrator.

               [GRP0:0]
             migrator = NONE
             active   = NONE
             nextevt  = T1
              /         \
             0           1
          idle        idle (T1)
          wakeup = T1

1) CPU 0 is now idle and is therefore the idle migrator. It has
programmed its next timer interrupt to handle T1.

                [GRP0:0]
             migrator = 1
             active   = 1
             nextevt  = KTIME_MAX
              /         \
             0           1
          idle        active
          wakeup = T1

2) CPU 1 has woken up, it is now active and it has just handled its own
timer T1.

3) CPU 0 gets a timer interrupt to handle T1 but tmigr_handle_remote()
realize it is not the migrator anymore. So it early returns without
observing that T1 has been expired already and therefore without
updating its ->wakeup value.

4) CPU 0 goes into tmigr_cpu_new_timer() which also early returns
because it doesn't queue a timer of its own. So ->wakeup is left
unchanged and the next timer is programmed to fire now.

5) goto 3) forever

This results in timer interrupt storms in idle and also in nohz_full (as
observed in rcutorture's TREE07 scenario).

Fix this with forcing a re-evaluation of tmc->wakeup while trying
remote timer handling when the CPU isn't the migrator anymmore. The
check is inherently racy but in the worst case the CPU just races setting
the KTIME_MAX value that a remote expiry also tries to set.

Fixes: 7ee9887 ("timers: Implement the hierarchical pull model")
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240318230729.15497-2-frederic@kernel.org
  • Loading branch information
Frederic Weisbecker authored and KAGA-KOKO committed Mar 19, 2024
1 parent 4b6f4c5 commit f55acb1
Showing 1 changed file with 9 additions and 2 deletions.
11 changes: 9 additions & 2 deletions kernel/time/timer_migration.c
Original file line number Diff line number Diff line change
Expand Up @@ -1038,8 +1038,15 @@ void tmigr_handle_remote(void)
* in tmigr_handle_remote_up() anyway. Keep this check to speed up the
* return when nothing has to be done.
*/
if (!tmigr_check_migrator(tmc->tmgroup, tmc->childmask))
return;
if (!tmigr_check_migrator(tmc->tmgroup, tmc->childmask)) {
/*
* If this CPU was an idle migrator, make sure to clear its wakeup
* value so it won't chase timers that have already expired elsewhere.
* This avoids endless requeue from tmigr_new_timer().
*/
if (READ_ONCE(tmc->wakeup) == KTIME_MAX)
return;
}

data.now = get_jiffies_update(&data.basej);

Expand Down

0 comments on commit f55acb1

Please sign in to comment.