Description
In the Linux kernel, the following vulnerability has been resolved:

fuse: fix livelock in synchronous file put from fuseblk workers

I observed a hang when running generic/323 against a fuseblk server. This test opens a file, initiates a lot of AIO writes to that file descriptor, and closes the file descriptor before the writes complete. Unsurprisingly, the AIO exerciser threads are mostly stuck waiting for responses from the fuseblk server:

    # cat /proc/372265/task/372313/stack
    [<0>] request_wait_answer+0x1fe/0x2a0 [fuse]
    [<0>] __fuse_simple_request+0xd3/0x2b0 [fuse]
    [<0>] fuse_do_getattr+0xfc/0x1f0 [fuse]
    [<0>] fuse_file_read_iter+0xbe/0x1c0 [fuse]
    [<0>] aio_read+0x130/0x1e0
    [<0>] io_submit_one+0x542/0x860
    [<0>] __x64_sys_io_submit+0x98/0x1a0
    [<0>] do_syscall_64+0x37/0xf0
    [<0>] entry_SYSCALL_64_after_hwframe+0x4b/0x53

But the /weird/ part is that the fuseblk server threads are waiting for responses from the server itself:

    # cat /proc/372210/task/372232/stack
    [<0>] request_wait_answer+0x1fe/0x2a0 [fuse]
    [<0>] __fuse_simple_request+0xd3/0x2b0 [fuse]
    [<0>] fuse_file_put+0x9a/0xd0 [fuse]
    [<0>] fuse_release+0x36/0x50 [fuse]
    [<0>] __fput+0xec/0x2b0
    [<0>] task_work_run+0x55/0x90
    [<0>] syscall_exit_to_user_mode+0xe9/0x100
    [<0>] do_syscall_64+0x43/0xf0
    [<0>] entry_SYSCALL_64_after_hwframe+0x4b/0x53

The fuseblk server is fuse2fs, so there's nothing all that exciting in the server itself. So why is the fuse server calling fuse_file_put? The commit message for the fstest sheds some light on that:

"By closing the file descriptor before calling io_destroy, you pretty much guarantee that the last put on the ioctx will be done in interrupt context (during I/O completion)."

Aha. AIO fgets a new struct file from the fd when it queues the ioctx. The completion of the FUSE_WRITE command from userspace causes the fuse server to call the AIO completion function. The completion puts the struct file, queuing a delayed fput to the fuse server task. When the fuse server task returns to userspace, it has to run the delayed fput, which in the case of a fuseblk server, it does synchronously.

Sending the FUSE_RELEASE command synchronously from fuse server threads is a bad idea because a client program can initiate enough simultaneous AIOs such that all the fuse server threads end up in delayed_fput, and now there aren't any threads left to handle the queued fuse commands.

Fix this by only using asynchronous fputs when closing files, and leave a comment explaining why.
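The starvation described above can be illustrated with a small deterministic model. This is only a sketch under stated assumptions, not kernel code and not the actual fix: the function name server_drains, its parameters, and the round-based worst-case scheduling are all hypothetical simplifications. It assumes each closed file has one in-flight write whose completion drops the last reference, so the worker that answered the write must issue the release.

```python
from collections import deque


def server_drains(workers: int, closed_files: int, sync_release: bool) -> bool:
    """Toy model (hypothetical): True if the server answers every request,
    False if all workers end up waiting on RELEASE requests nobody can serve.
    """
    # One pending WRITE per file that was closed before its AIO completed.
    queue = deque(["WRITE"] * closed_files)
    idle = workers   # workers free to pick up a queued request
    blocked = 0      # workers waiting for the answer to their own RELEASE

    while queue:
        if idle == 0:
            # Requests are queued, but every worker is stuck in the
            # synchronous file put: the livelock from the description.
            return False

        # Worst-case scheduling: all idle workers grab a request at once.
        batch = [queue.popleft() for _ in range(min(idle, len(queue)))]
        for request in batch:
            if request == "WRITE":
                # Answering the WRITE completes the AIO, which drops the
                # last file reference and generates a RELEASE request.
                queue.append("RELEASE")
                if sync_release:
                    # The same worker now waits for that RELEASE answer.
                    idle -= 1
                    blocked += 1
            elif blocked:
                # Answering a RELEASE wakes one waiting worker.
                blocked -= 1
                idle += 1
    return True
```

In this model, server_drains(4, 4, True) stalls because all four workers block at once, while server_drains(4, 4, False) completes, which mirrors the reasoning for making the release asynchronous. Real scheduling is not round-based, so the model shows only that the stall is reachable, not how often it occurs.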
Product status
5a18ec176c934ca1bc9dc61580a5e0e90a9b5733 (git) before 548e1f2bac1d4df91a6138f26bb4ab00323fd948
5a18ec176c934ca1bc9dc61580a5e0e90a9b5733 (git) before cfd1aa3e2b71f3327cb373c45a897c9028c62b35
5a18ec176c934ca1bc9dc61580a5e0e90a9b5733 (git) before 83b375c6efef69b1066ad2d79601221e7892745a
5a18ec176c934ca1bc9dc61580a5e0e90a9b5733 (git) before bfd17b6138df0122a95989457d8e18ce0b86165e
5a18ec176c934ca1bc9dc61580a5e0e90a9b5733 (git) before b26923512dbe57ae4917bafd31396d22a9d1691a
5a18ec176c934ca1bc9dc61580a5e0e90a9b5733 (git) before f19a1390af448d9e193c08e28ea5f727bf3c3049
5a18ec176c934ca1bc9dc61580a5e0e90a9b5733 (git) before 26e5c67deb2e1f42a951f022fdf5b9f7eb747b01
9efe56738fecd591b5bf366a325440f9b457ebd6 (git)
5c46eb076e0a1b2c1769287cd6942e4594ade1b1 (git)
83e6726210d6c815ce044437106c738eda5ff6f6 (git)
23d154c71721fd0fa6199851078f32e6bd765664 (git)
ca3edc920f5fd7d8ac040caaf109f925c24620a0 (git)
2.6.38
Any version before 2.6.38
5.10.246 (semver)
5.15.196 (semver)
6.1.158 (semver)
6.6.115 (semver)
6.12.54 (semver)
6.17.4 (semver)
6.18 (original_commit_for_fix)
References
git.kernel.org/...c/548e1f2bac1d4df91a6138f26bb4ab00323fd948
git.kernel.org/...c/cfd1aa3e2b71f3327cb373c45a897c9028c62b35
git.kernel.org/...c/83b375c6efef69b1066ad2d79601221e7892745a
git.kernel.org/...c/bfd17b6138df0122a95989457d8e18ce0b86165e
git.kernel.org/...c/b26923512dbe57ae4917bafd31396d22a9d1691a
git.kernel.org/...c/f19a1390af448d9e193c08e28ea5f727bf3c3049
git.kernel.org/...c/26e5c67deb2e1f42a951f022fdf5b9f7eb747b01