CVE-2023-54180 | THREATINT

Description

In the Linux kernel, the following vulnerability has been resolved: btrfs: handle case when repair happens with dev-replace [BUG] There is a bug report that a BUG_ON() in btrfs_repair_io_failure() (originally repair_io_failure() in v6.0 kernel) got triggered when replacing a unreliable disk: BTRFS warning (device sda1): csum failed root 257 ino 2397453 off 39624704 csum 0xb0d18c75 expected csum 0x4dae9c5e mirror 3 kernel BUG at fs/btrfs/extent_io.c:2380! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 9 PID: 3614331 Comm: kworker/u257:2 Tainted: G OE 6.0.0-5-amd64 #1 Debian 6.0.10-2 Hardware name: Micro-Star International Co., Ltd. MS-7C60/TRX40 PRO WIFI (MS-7C60), BIOS 2.70 07/01/2021 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs] RIP: 0010:repair_io_failure+0x24a/0x260 [btrfs] Call Trace: <TASK> clean_io_failure+0x14d/0x180 [btrfs] end_bio_extent_readpage+0x412/0x6e0 [btrfs] ? __switch_to+0x106/0x420 process_one_work+0x1c7/0x380 worker_thread+0x4d/0x380 ? rescuer_thread+0x3a0/0x3a0 kthread+0xe9/0x110 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x22/0x30 [CAUSE] Before the BUG_ON(), we got some read errors from the replace target first, note the mirror number (3, which is beyond RAID1 duplication, thus it's read from the replace target device). Then at the BUG_ON() location, we are trying to writeback the repaired sectors back the failed device. The check looks like this: ret = btrfs_map_block(fs_info, BTRFS_MAP_WRITE, logical, &map_length, &bioc, mirror_num); if (ret) goto out_counter_dec; BUG_ON(mirror_num != bioc->mirror_num); But inside btrfs_map_block(), we can modify bioc->mirror_num especially for dev-replace: if (dev_replace_is_ongoing && mirror_num == map->num_stripes + 1 && !need_full_stripe(op) && dev_replace->tgtdev != NULL) { ret = get_extra_mirror_from_replace(fs_info, logical, *length, dev_replace->srcdev->devid, &mirror_num, &physical_to_patch_in_first_stripe); patch_the_first_stripe_for_dev_replace = 1; } Thus if we're repairing the replace target device, we're going to trigger that BUG_ON(). But in reality, the read failure from the replace target device may be that, our replace hasn't reached the range we're reading, thus we're reading garbage, but with replace running, the range would be properly filled later. Thus in that case, we don't need to do anything but let the replace routine to handle it. [FIX] Instead of a BUG_ON(), just skip the repair if we're repairing the device replace target device.

PUBLISHED Reserved 2025-12-30 | Published 2025-12-30 | Updated 2025-12-30 | Assigner Linux

Product status

Default status

unaffected

1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 (git) before a7018b40b49c37fb55736499f790ec0d2b381ae4

affected

1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 (git) before 53e9d6851b56626885476a2966194ba994f8bb4b

affected

1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 (git) before d73a27b86fc722c28a26ec64002e3a7dc86d1c07

affected

Default status

affected

6.0.19 (semver)

unaffected

6.1.5 (semver)

unaffected

6.2 (original_commit_for_fix)

unaffected

References

git.kernel.org/...c/a7018b40b49c37fb55736499f790ec0d2b381ae4

git.kernel.org/...c/53e9d6851b56626885476a2966194ba994f8bb4b

git.kernel.org/...c/d73a27b86fc722c28a26ec64002e3a7dc86d1c07

cve.org (CVE-2023-54180)

nvd.nist.gov (CVE-2023-54180)

Download JSON