[Bug 2025563] Re: System can not shutdown if system has multiple VROC RAID arrays
Cyrus Lien
2025563 at bugs.launchpad.net
Fri Sep 15 02:57:44 UTC 2023
** Changed in: oem-priority
Status: Confirmed => Fix Released
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/2025563
Title:
System can not shutdown if system has multiple VROC RAID arrays
Status in OEM Priority Project:
Fix Released
Status in systemd package in Ubuntu:
Fix Released
Status in systemd source package in Jammy:
Fix Released
Status in systemd source package in Kinetic:
Fix Released
Bug description:
[ Impact ]
The system cannot shut down if it has multiple VROC RAID arrays.
Intel fixed this in systemd v251 [1].
The commit needs to be cherry-picked to ubuntu-jammy systemd 249.11-0ubuntu3.9.
[1] The commit that fixes the issue:
commit 3a3b022d2cc112803ea7b9beea98bbcad110368a
Author: Mariusz Tkaczyk <mariusz.tkaczyk at linux.intel.com>
Date: Tue Mar 29 12:49:54 2022 +0200
shutdown: get only active md arrays.
Current md_list_get() implementation filters all block devices, started from
"md*". This is ambiguous because list could contain:
- partitions created upon md device (mdXpY)
- external metadata container- specific type of md array.
For partitions there is no issue, because they aren't handle STOP_ARRAY
ioctl sent later. It generates misleading errors only.
Second case is more problematic because containers are not locked in kernel.
They are stopped even if container member array is active. For that reason
reboot or shutdown flow could be blocked because metadata manager cannot be
restarted after switch root on shutdown.
Add filters to remove partitions and containers from md_list. Partitions
can be excluded by DEVTYPE. Containers are determined by MD_LEVEL
property, we are excluding all with "container" value.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk at linux.intel.com>
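For illustration only, the filtering rule described in the commit message can be sketched in Python. This is not the systemd code (which is written in C); the BlockDevice class, the is_stoppable_md() helper and the sample device list, including the md127 container, are hypothetical names and values chosen for the example. Only the two rules come from the commit message: drop partitions by DEVTYPE, and drop devices whose MD_LEVEL is "container".

```python
from dataclasses import dataclass, field


@dataclass
class BlockDevice:
    """Hypothetical stand-in for a udev block device record."""
    sysname: str                      # kernel name, e.g. "md126"
    devtype: str                      # udev DEVTYPE: "disk" or "partition"
    properties: dict = field(default_factory=dict)


def is_stoppable_md(dev: BlockDevice) -> bool:
    """Return True only for md arrays that shutdown should try to stop."""
    if not dev.sysname.startswith("md"):
        return False                  # not an md device at all
    if dev.devtype == "partition":
        return False                  # mdXpY: partitions are not arrays
    if dev.properties.get("MD_LEVEL") == "container":
        return False                  # external metadata container
    return True


# Illustrative device list; names and levels are assumptions.
devices = [
    BlockDevice("md127", "disk", {"MD_LEVEL": "container"}),
    BlockDevice("md126", "disk", {"MD_LEVEL": "raid10"}),
    BlockDevice("md124", "disk", {"MD_LEVEL": "raid0"}),
    BlockDevice("md124p1", "partition"),
    BlockDevice("md124p2", "partition"),
    BlockDevice("sda", "disk"),
]

stoppable = [d.sysname for d in devices if is_stoppable_md(d)]
print(stoppable)
```

With this rule, only the member arrays remain in the list, so the container is never stopped while one of its member arrays is still active.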
In the journal, we can see systemd-shutdown looping repeatedly as it
tries and fails to detach all md devices:
...
[ 513.416293] systemd-shutdown[1]: Stopping MD /dev/md124p2 (259:5).
[ 513.422953] systemd-shutdown[1]: Could not stop MD /dev/md124p2: Device or resource busy
[ 513.431227] systemd-shutdown[1]: Stopping MD /dev/md124p1 (259:4).
[ 513.437952] systemd-shutdown[1]: Could not stop MD /dev/md124p1: Device or resource busy
[ 513.449298] systemd-shutdown[1]: Stopping MD /dev/md124 (9:124).
[ 513.456278] systemd-shutdown[1]: Could not stop MD /dev/md124: Device or resource busy
[ 513.465323] systemd-shutdown[1]: Not all MD devices stopped, 4 left.
[ 513.472564] systemd-shutdown[1]: Couldn't finalize remaining MD devices, trying again.
[ 513.485302] systemd-shutdown[1]: Failed to open watchdog device /dev/watchdog: No such file or directory
[ 513.496195] systemd-shutdown[1]: Stopping MD devices.
[ 513.502176] systemd-shutdown[1]: sd-device-enumerator: Scan all dirs
[ 513.513382] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/bus
[ 513.521436] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/class
[ 513.534810] systemd-shutdown[1]: Stopping MD /dev/md126 (9:126).
[ 513.545384] systemd-shutdown[1]: Failed to sync MD block device /dev/md126, ignoring: Input/output error
[ 513.557265] md: md126 stopped.
[ 513.561451] systemd-shutdown[1]: Stopping MD /dev/md124p2 (259:5).
[ 513.576673] systemd-shutdown[1]: Could not stop MD /dev/md124p2: Device or resource busy
[ 513.589274] systemd-shutdown[1]: Stopping MD /dev/md124p1 (259:4).
[ 513.597976] systemd-shutdown[1]: Could not stop MD /dev/md124p1: Device or resource busy
[ 513.607263] systemd-shutdown[1]: Stopping MD /dev/md124 (9:124).
[ 513.615067] systemd-shutdown[1]: Could not stop MD /dev/md124: Device or resource busy
[ 513.625157] systemd-shutdown[1]: Not all MD devices stopped, 4 left.
[ 513.632209] systemd-shutdown[1]: Couldn't finalize remaining MD devices, trying again.
[ 513.641474] systemd-shutdown[1]: Failed to open watchdog device /dev/watchdog: No such file or directory
[ 513.653660] systemd-shutdown[1]: Stopping MD devices.
[ 513.661257] systemd-shutdown[1]: sd-device-enumerator: Scan all dirs
[ 513.668833] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/bus
[ 513.677347] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/class
[ 513.687047] systemd-shutdown[1]: Stopping MD /dev/md126 (9:126).
[ 513.697206] systemd-shutdown[1]: Failed to sync MD block device /dev/md126, ignoring: Input/output error
[ 513.707193] md: md126 stopped.
...
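When triaging a similar report, it can help to summarize which devices systemd-shutdown keeps failing to stop. The following is a minimal sketch, assuming the log has been saved as text in the same format as the excerpt above; the FAILED_STOP pattern and the count_failed_stops() helper are hypothetical names, and the sample lines are copied from the excerpt.

```python
import re
from collections import Counter

# Matches the "Could not stop MD <device>: <reason>" lines shown above.
FAILED_STOP = re.compile(
    r"systemd-shutdown\[1\]: Could not stop MD (/dev/\w+): (.*)$"
)


def count_failed_stops(lines):
    """Count how often each md device failed to stop."""
    failures = Counter()
    for line in lines:
        match = FAILED_STOP.search(line)
        if match:
            failures[match.group(1)] += 1
    return failures


sample = [
    "[ 513.416293] systemd-shutdown[1]: Stopping MD /dev/md124p2 (259:5).",
    "[ 513.422953] systemd-shutdown[1]: Could not stop MD /dev/md124p2: Device or resource busy",
    "[ 513.431227] systemd-shutdown[1]: Stopping MD /dev/md124p1 (259:4).",
    "[ 513.437952] systemd-shutdown[1]: Could not stop MD /dev/md124p1: Device or resource busy",
    "[ 513.449298] systemd-shutdown[1]: Stopping MD /dev/md124 (9:124).",
    "[ 513.456278] systemd-shutdown[1]: Could not stop MD /dev/md124: Device or resource busy",
    "[ 513.534810] systemd-shutdown[1]: Stopping MD /dev/md126 (9:126).",
    "[ 513.576673] systemd-shutdown[1]: Could not stop MD /dev/md124p2: Device or resource busy",
]

failures = count_failed_stops(sample)
for device, count in sorted(failures.items()):
    print(device, count)
```

Partitions such as /dev/md124p1 appearing in this summary indicate that the unpatched enumeration is in use, since the fix removes them from the list before any stop attempt.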
[ Test Plan ]
1. Build two VROC RAID arrays: one RAID 0 for the System volume and one RAID 10 for the Data volume.
2. Install the system on the System volume.
3. Update systemd.
4. Reboot the system.
5. Verify that the system reboots successfully.
[ Where problems could occur ]
The patch has been confirmed to fix the reboot issue on a system with
two VROC RAID arrays, but systems with more than two arrays and other
combinations of RAID levels have not all been tested. The patch itself
adds logic to skip partitions and containers when building the list of
md devices to try to stop. Any regressions would therefore also be
related to stopping md devices in systemd-shutdown.
[ Scope ]
Jammy
To manage notifications about this bug go to:
https://bugs.launchpad.net/oem-priority/+bug/2025563/+subscriptions