[SRU][N][O][PATCH 0/1] btrfs will WARN_ON() in btrfs_remove_qgroup() unnecessarily
Matthew Ruffell
matthew.ruffell at canonical.com
Fri Jan 24 01:39:36 UTC 2025
BugLink: https://bugs.launchpad.net/bugs/2091719
[Impact]
The following commit for noble and oracular introduced two new WARN_ON() calls
in btrfs qgroup removal. Although the author believed at the time that they
would not be reachable, it turns out they can be hit quite frequently under the
right conditions.
ubuntu-noble b2ad25ba539452f492805e5f7d94e80894aa860f
commit a776bf5f3c2300cfdf8a195663460b1793ac9847
Author: Qu Wenruo <wqu at suse.com>
Date: Fri Apr 19 14:29:32 2024 +0930
Subject: btrfs: slightly loosen the requirement for qgroup removal
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a776bf5f3c2300cfdf8a195663460b1793ac9847
$ git describe --contains b2ad25ba539452f492805e5f7d94e80894aa860f
Ubuntu-6.8.0-50.51~143
This primarily affects the systemd CI that runs integration tests on merge,
where panic_on_warn is set and the warning escalates to a kernel panic:
https://github.com/systemd/systemd/actions/runs/12297539029/job/34318915884?pr=35589
Kernel panic - not syncing: kernel: panic_on_warn set ...
CPU: 0 PID: 1316 Comm: (sd-clean) Not tainted 6.8.0-50-generic #51-Ubuntu
Call Trace:
<TASK>
dump_stack_lvl+0x27/0xa0
dump_stack+0x10/0x20
panic+0x366/0x3c0
? btrfs_remove_qgroup+0x271/0x490 [btrfs]
check_panic_on_warn+0x4f/0x60
__warn+0x95/0x160
? btrfs_remove_qgroup+0x271/0x490 [btrfs]
report_bug+0x17e/0x1b0
handle_bug+0x51/0xa0
exc_invalid_op+0x18/0x80
asm_exc_invalid_op+0x1b/0x20
RIP: 0010:btrfs_remove_qgroup+0x271/0x490 [btrfs]
Code: c0 0f 85 27 fe ff ff 48 8b 43 b0 4c 39 f0 75 d5 4d 8d b5 e0 08 00 00 4c 89 f7 e8 8a 45 19 e2 48 83 7b 98 00 0f 84 52 01 00 00 <0f> 0b 49 8b 45 10 a8 10 74 42 41 f6 85 d0 08 00 00 0c 75 38 48 83
? btrfs_remove_qgroup+0x266/0x490 [btrfs]
btrfs_ioctl+0x12b9/0x13a0 [btrfs]
? srso_alias_return_thunk+0x5/0xfbef5
? __seccomp_filter+0x368/0x570
? __fput+0x15e/0x2e0
__x64_sys_ioctl+0xa3/0xf0
x64_sys_call+0x12a3/0x25a0
do_syscall_64+0x7f/0x180
entry_SYSCALL_64_after_hwframe+0x78/0x80
[Fix]
The fix just landed in mainline as:
commit c0def46dec9c547679a25fe7552c4bcbec0b0dd2
Author: Qu Wenruo <wqu at suse.com>
Date: Mon Nov 11 07:29:07 2024 +1030
Subject: btrfs: improve the warning and error message for btrfs_remove_qgroup()
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c0def46dec9c547679a25fe7552c4bcbec0b0dd2
The commit places the WARN_ON() behind CONFIG_BTRFS_DEBUG, which silences the
warning for most users. As noted by the author, it is safe to do so, because
the user space tool managing the qgroups would rescan them to fix the
inconsistent view.
This is needed for both noble and oracular.
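As a rough illustration of the gating described above, here is a minimal
userspace sketch. It is not the upstream patch: DEMO_DEBUG is a hypothetical
stand-in for CONFIG_BTRFS_DEBUG, and struct demo_qgroup,
demo_check_qgroup_on_remove() and the two counters are invented for this
example. It only shows the idea that the loud warning (the part that trips
panic_on_warn) fires in debug builds, while an ordinary log message is still
emitted in all builds.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for CONFIG_BTRFS_DEBUG; build with -DDEMO_DEBUG=1. */
#ifndef DEMO_DEBUG
#define DEMO_DEBUG 0
#endif

/* Counters so a caller can observe which paths were taken. */
static int demo_warn_count; /* loud, WARN_ON()-style warnings */
static int demo_log_count;  /* ordinary log messages */

/* Hypothetical, heavily reduced view of a qgroup's accounting numbers. */
struct demo_qgroup {
	unsigned long long rfer; /* referenced bytes */
	unsigned long long excl; /* exclusive bytes */
};

/*
 * Called when a qgroup is about to be removed. Returns true if the qgroup
 * still carries non-zero numbers, i.e. the accounting looks inconsistent.
 */
static bool demo_check_qgroup_on_remove(const struct demo_qgroup *qg)
{
	if (qg->rfer == 0 && qg->excl == 0)
		return false;

	/* Only debug builds get the loud warning. */
	if (DEMO_DEBUG) {
		demo_warn_count++;
		fprintf(stderr, "WARNING: qgroup removal with non-zero numbers\n");
	}

	/* All builds still log the condition for later diagnosis. */
	demo_log_count++;
	fprintf(stderr, "qgroup to be deleted has rfer %llu excl %llu\n",
		qg->rfer, qg->excl);
	return true;
}
```

In a default build of this sketch (DEMO_DEBUG left at 0), a qgroup with stale
numbers produces only the log line; the warning path is compiled out of the
hot path in effect, which mirrors why the change is expected to keep
panic_on_warn systems running.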
[Testcase]
The upstream systemd CI tests can consistently reproduce the issue, so the test
and proposed kernels will be run against the systemd CI for verification.
There is a test kernel available in the following ppa:
https://launchpad.net/~mruffell/+archive/ubuntu/lp2091719-test
With the test kernel installed, the systemd CI runs to completion.
[Where problems could occur]
We are changing the WARN_ON() to occur only when CONFIG_BTRFS_DEBUG is enabled.
There is no other change in logic, so functionality should be the same as what
we have now.
If a regression were to occur, it would affect systems with btrfs filesystems
that are utilising subvolumes. It is unlikely to cause any data loss or disk
corruption, as userspace tools should be able to automatically fix up any
inconsistent views without user interaction.
[Other info]
Systemd upstream bisected the issue here:
https://github.com/systemd/systemd/pull/35567#issuecomment-2538160543
Qu Wenruo (1):
btrfs: improve the warning and error message for btrfs_remove_qgroup()
fs/btrfs/qgroup.c | 21 ++++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)
--
2.45.2