Description
In the Linux kernel, the following vulnerability has been resolved:

ring-buffer: Fix deadloop issue on reading trace_pipe

Soft lockup occurs when reading file 'trace_pipe':

  watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [cat:4488]
  [...]
  RIP: 0010:ring_buffer_empty_cpu+0xed/0x170
  RSP: 0018:ffff88810dd6fc48 EFLAGS: 00000246
  RAX: 0000000000000000 RBX: 0000000000000246 RCX: ffffffff93d1aaeb
  RDX: ffff88810a280040 RSI: 0000000000000008 RDI: ffff88811164b218
  RBP: ffff88811164b218 R08: 0000000000000000 R09: ffff88815156600f
  R10: ffffed102a2acc01 R11: 0000000000000001 R12: 0000000051651901
  R13: 0000000000000000 R14: ffff888115e49500 R15: 0000000000000000
  [...]
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f8d853c2000 CR3: 000000010dcd8000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   __find_next_entry+0x1a8/0x4b0
   ? peek_next_entry+0x250/0x250
   ? down_write+0xa5/0x120
   ? down_write_killable+0x130/0x130
   trace_find_next_entry_inc+0x3b/0x1d0
   tracing_read_pipe+0x423/0xae0
   ? tracing_splice_read_pipe+0xcb0/0xcb0
   vfs_read+0x16b/0x490
   ksys_read+0x105/0x210
   ? __ia32_sys_pwrite64+0x200/0x200
   ? switch_fpu_return+0x108/0x220
   do_syscall_64+0x33/0x40
   entry_SYSCALL_64_after_hwframe+0x61/0xc6

Through the vmcore, I found that in tracing_read_pipe(), ring_buffer_empty_cpu() reports that some buffer is not empty, but nothing can be read from it because "rb_num_of_entries() == 0" is always true. The user buffer is never filled, so the procedure loops forever. See the following code path:

  tracing_read_pipe() {
    ... ...
    waitagain:
      tracing_wait_pipe()           // 1. finds a non-empty buffer here
      trace_find_next_entry_inc()   // 2. loops here trying to find an entry
        __find_next_entry()
          ring_buffer_empty_cpu();  // 3. finds a non-empty buffer
          peek_next_entry()         // 4. but peek always returns NULL
            ring_buffer_peek()
              rb_buffer_peek()
                rb_get_reader_page()
                  // 5. because rb_num_of_entries() == 0 is always true here,
                  //    it returns NULL
      // 6. the user buffer has not been filled, so goto 'waitagain',
      //    which eventually leads to a deadloop in the kernel!!!
  }

After some analysis, I found that when the ring buffer is reset, the 'entries' counts of its pages are not all cleared (see rb_reset_cpu()). When the ring buffer is later reduced in size, any removed pages that still hold stale 'entries' data have those counts added to 'cpu_buffer->overrun' (see rb_remove_pages()). This produces a wrong 'overrun' count and eventually causes the deadloop. To fix it, every page must be cleared in rb_reset_cpu().
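
The accounting problem described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: every toy_* name is hypothetical, the ring is reduced to an array of per-page counters, the unfixed reset is modelled as clearing only the first page, and the event counts are chosen so that the stale count exactly cancels the newly written events. The readable count follows the entries - (overrun + read) form that rb_num_of_entries() is based on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PAGES 4

/* Hypothetical, heavily simplified stand-ins; not the kernel's structures. */
struct toy_page {
    unsigned long entries;      /* events recorded on this page */
};

struct toy_cpu_buffer {
    struct toy_page pages[TOY_NR_PAGES];
    size_t nr_pages;            /* pages currently in the ring */
    unsigned long entries;      /* events written since the last reset */
    unsigned long overrun;      /* events counted as lost */
    unsigned long read;         /* events already consumed */
};

struct toy_result {
    bool empty;                 /* true if no page in the ring holds events */
    unsigned long readable;     /* entries - (overrun + read) */
};

/* Record one event on the given page. */
void toy_write(struct toy_cpu_buffer *b, size_t page)
{
    assert(page < b->nr_pages);
    b->pages[page].entries++;
    b->entries++;
}

/*
 * Reset the buffer counters. Without clear_all_pages only page 0 is
 * cleared (an assumption that models the stale per-page counts).
 */
void toy_reset(struct toy_cpu_buffer *b, bool clear_all_pages)
{
    size_t last = clear_all_pages ? b->nr_pages : 1;

    for (size_t i = 0; i < last; i++)
        b->pages[i].entries = 0;
    b->entries = b->overrun = b->read = 0;
}

/* Shrink the ring by one page; its per-page count is added to overrun. */
void toy_remove_last_page(struct toy_cpu_buffer *b)
{
    assert(b->nr_pages > 1);
    b->nr_pages--;
    b->overrun += b->pages[b->nr_pages].entries;
    b->pages[b->nr_pages].entries = 0;
}

/* Write, reset, shrink, write again; then report what a reader would see. */
struct toy_result toy_scenario(bool clear_all_pages)
{
    struct toy_cpu_buffer b = { .nr_pages = TOY_NR_PAGES };
    struct toy_result r = { .empty = true, .readable = 0 };

    toy_write(&b, TOY_NR_PAGES - 1);    /* two events on the last page */
    toy_write(&b, TOY_NR_PAGES - 1);
    toy_reset(&b, clear_all_pages);     /* stale count survives if false */
    toy_remove_last_page(&b);           /* stale count leaks into overrun */
    toy_write(&b, 0);                   /* two fresh events */
    toy_write(&b, 0);

    for (size_t i = 0; i < b.nr_pages; i++)
        if (b.pages[i].entries > 0)
            r.empty = false;
    r.readable = b.entries - (b.overrun + b.read);
    return r;
}
```

In this model, toy_scenario(false) reports a non-empty buffer with a readable count of 0, which is the combination that keeps a reader retrying forever, while toy_scenario(true) reports a non-empty buffer with a readable count of 2.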
Product status
a5fb833172eca69136e9ee1ada778e404086ab8a before 0a29dae5786d263016a9aceb1e56bf3fd4cc6fa0
a5fb833172eca69136e9ee1ada778e404086ab8a before a55e8a3596048c2f7b574049aeb1885b5abba1cc
a5fb833172eca69136e9ee1ada778e404086ab8a before e84829522fc72bb43556b31575731de0440ac0dd
a5fb833172eca69136e9ee1ada778e404086ab8a before 5e68f1f3a20fe9b6bde018e353269fbfa289609c
a5fb833172eca69136e9ee1ada778e404086ab8a before bb14a93bccc92766b1d9302c6bcbea17d4bce306
a5fb833172eca69136e9ee1ada778e404086ab8a before 8b0b63fdac6b70a45614e7d4b30e5bbb93deb007
a5fb833172eca69136e9ee1ada778e404086ab8a before 27bdd93e44cc28dd9b94893fae146b83d4f5b31e
a5fb833172eca69136e9ee1ada778e404086ab8a before 7e42907f3a7b4ce3a2d1757f6d78336984daf8f5
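Affected: 3.6
Unaffected: Any version before 3.6
Unaffected from: 4.14.322
Unaffected from: 4.19.291
Unaffected from: 5.4.251
Unaffected from: 5.10.188
Unaffected from: 5.15.121
Unaffected from: 6.1.40
Unaffected from: 6.4.5
Unaffected from: 6.5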
References
git.kernel.org/...c/0a29dae5786d263016a9aceb1e56bf3fd4cc6fa0
git.kernel.org/...c/a55e8a3596048c2f7b574049aeb1885b5abba1cc
git.kernel.org/...c/e84829522fc72bb43556b31575731de0440ac0dd
git.kernel.org/...c/5e68f1f3a20fe9b6bde018e353269fbfa289609c
git.kernel.org/...c/bb14a93bccc92766b1d9302c6bcbea17d4bce306
git.kernel.org/...c/8b0b63fdac6b70a45614e7d4b30e5bbb93deb007
git.kernel.org/...c/27bdd93e44cc28dd9b94893fae146b83d4f5b31e
git.kernel.org/...c/7e42907f3a7b4ce3a2d1757f6d78336984daf8f5