Description
In the Linux kernel, the following vulnerability has been resolved:

sched/rt: Skip currently executing CPU in rto_next_cpu()

CPU0 becomes overloaded when hosting a CPU-bound RT task, a non-CPU-bound RT task, and a CFS task stuck in kernel space. When other CPUs switch from RT to non-RT tasks, RT load balancing (LB) is triggered; with HAVE_RT_PUSH_IPI enabled, they send IPIs to CPU0 to drive the execution of rto_push_irq_work_func. During push_rt_task on CPU0, if next_task->prio < rq->donor->prio, resched_curr() sets NEED_RESCHED, and after the push operation completes, CPU0 calls rto_next_cpu().

Since only CPU0 is overloaded in this scenario, rto_next_cpu() should ideally return -1 (no further IPI needed). However, multiple CPUs invoking tell_cpu_to_push() during LB increment rd->rto_loop_next. Even when rd->rto_cpu is set to -1, the mismatch between rd->rto_loop and rd->rto_loop_next forces rto_next_cpu() to restart its search from -1. With CPU0 remaining overloaded (satisfying rt_nr_migratory && rt_nr_total > 1), it gets reselected, causing CPU0 to queue irq_work to itself and send self-IPIs repeatedly. As long as CPU0 stays overloaded and other CPUs run pull_rt_task(), it falls into an infinite self-IPI loop, which triggers a CPU hardlockup due to continuous self-interrupts.

The triggering scenario is as follows:

    cpu0                        cpu1                    cpu2
                                pull_rt_task
                                tell_cpu_to_push
         <------------irq_work_queue_on
    rto_push_irq_work_func
    push_rt_task
        resched_curr(rq)                                pull_rt_task
    rto_next_cpu                                        tell_cpu_to_push
         <-------------------------- atomic_inc(rto_loop_next)
    rd->rto_loop != next
        rto_next_cpu
        irq_work_queue_on
    rto_push_irq_work_func

Fix redundant self-IPI by filtering the initiating CPU in rto_next_cpu(). This solution has been verified to effectively eliminate spurious self-IPIs and prevent CPU hardlockup scenarios.
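The reselection of CPU0 described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel patch: struct root_domain_model, mask_next(), rto_next_cpu_model() and the skip_self flag are hypothetical names introduced here. The overload mask is a plain bitmask and the loop counters are ordinary integers, so the kernel's atomics, memory ordering and locking are not modeled.

```c
/* Hypothetical user-space model of the RT push-IPI iterator; not kernel code. */
#define NR_CPUS_MODEL 4

struct root_domain_model {
    unsigned int rto_mask; /* bit i set: CPU i is RT-overloaded */
    int rto_cpu;           /* last CPU visited by the iterator, -1 = restart */
    int rto_loop;          /* loop generation the iterator has consumed */
    int rto_loop_next;     /* bumped by every modeled tell_cpu_to_push() */
};

/* Return the first overloaded CPU strictly after prev, or NR_CPUS_MODEL. */
static int mask_next(int prev, unsigned int mask)
{
    for (int cpu = prev + 1; cpu < NR_CPUS_MODEL; cpu++)
        if (mask & (1u << cpu))
            return cpu;
    return NR_CPUS_MODEL;
}

/*
 * Pick the next CPU that should receive a push IPI, or -1 if none.
 * With skip_self == 0 the CPU running this function can select itself
 * (the self-IPI case); with skip_self != 0 it is filtered out, which is
 * the idea behind the fix.
 */
static int rto_next_cpu_model(struct root_domain_model *rd,
                              int this_cpu, int skip_self)
{
    for (;;) {
        int cpu = mask_next(rd->rto_cpu, rd->rto_mask);

        rd->rto_cpu = cpu;

        /* Assumed placement of the filter: never target the caller. */
        if (skip_self && cpu == this_cpu)
            continue;

        if (cpu < NR_CPUS_MODEL)
            return cpu;

        /* End of mask reached: reset and check for a new loop request. */
        rd->rto_cpu = -1;
        if (rd->rto_loop == rd->rto_loop_next)
            break;

        /* Another CPU asked for a push meanwhile: scan again from -1. */
        rd->rto_loop = rd->rto_loop_next;
    }
    return -1;
}
```

With only CPU0 overloaded (rto_mask = 0x1), the iterator parked on CPU0 (rto_cpu = 0), and one pending request (rto_loop = 0, rto_loop_next = 1), calling rto_next_cpu_model(&rd, 0, 0) restarts the scan and returns 0, so CPU0 would IPI itself. The same state with skip_self = 1 returns -1. If CPU2 is also overloaded (rto_mask = 0x5), the filtered variant still returns 2, so legitimate targets are not lost.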
Product status
4bdced5c9a2922521e325896a7bbbf0132c94e56 (git) before d57d0746276a88ea43a2cc62b849fd8a95e32e41
4bdced5c9a2922521e325896a7bbbf0132c94e56 (git) before 3b3c672a66db3de3b40f8a7057864bc1f874ede3
4bdced5c9a2922521e325896a7bbbf0132c94e56 (git) before 16ca9f3117e9a294646c897daf08a5ab546c711b
4bdced5c9a2922521e325896a7bbbf0132c94e56 (git) before 8ad5577b2d4acfd83f03d97a0aece2d18aac5f07
4bdced5c9a2922521e325896a7bbbf0132c94e56 (git) before a6a73403733e86748421f2eeaf028c85683ef896
4bdced5c9a2922521e325896a7bbbf0132c94e56 (git) before 52aeb1e07ec223caf212f036817976c98d2aa250
4bdced5c9a2922521e325896a7bbbf0132c94e56 (git) before 9f25edc5a20cb52a5abbf25f0724bb4732b81801
4bdced5c9a2922521e325896a7bbbf0132c94e56 (git) before 94894c9c477e53bcea052e075c53f89df3d2a33e
cb1831a83e54cd3269a2420fce81c4fd8ae6f667 (git)
1c37ff78298a6b6063649123356a312e1cce12ca (git)
f17c786b28a3060a566a170c2cf3bd7441fc30a3 (git)
4.4.103 (semver) before 4.5
4.9.66 (semver) before 4.10
4.14.3 (semver) before 4.15
4.15
Any version before 4.15
5.10.252 (semver)
5.15.202 (semver)
6.1.165 (semver)
6.6.128 (semver)
6.12.75 (semver)
6.18.14 (semver)
6.19.4 (semver)
7.0 (original_commit_for_fix)
References
git.kernel.org/...c/d57d0746276a88ea43a2cc62b849fd8a95e32e41
git.kernel.org/...c/3b3c672a66db3de3b40f8a7057864bc1f874ede3
git.kernel.org/...c/16ca9f3117e9a294646c897daf08a5ab546c711b
git.kernel.org/...c/8ad5577b2d4acfd83f03d97a0aece2d18aac5f07
git.kernel.org/...c/a6a73403733e86748421f2eeaf028c85683ef896
git.kernel.org/...c/52aeb1e07ec223caf212f036817976c98d2aa250
git.kernel.org/...c/9f25edc5a20cb52a5abbf25f0724bb4732b81801
git.kernel.org/...c/94894c9c477e53bcea052e075c53f89df3d2a33e