[Bug 1888812] Re: mdmonitor doesn't start recovery immediately

Mariusz Tkaczyk 1888812 at bugs.launchpad.net
Tue Sep 14 12:58:59 UTC 2021


Yes, it is resolved.

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to mdadm in Ubuntu.
https://bugs.launchpad.net/bugs/1888812

Title:
  mdmonitor doesn't start recovery immediately

Status in mdadm package in Ubuntu:
  Incomplete
Status in mdadm source package in Impish:
  Incomplete

Bug description:
  mdmonitor reacts to md events; it polls the /proc/mdstat file. These
  events are generated whenever the kernel observes a change on any md
  device. This happens asynchronously and can be caused by a user space
  process (mdadm called by udev or by the user), or by the kernel itself
  (for example, a drive is removed because it has too many errors).

  The problem here is that mdmonitor does not coordinate with user space
  or udev. When a drive with metadata is inserted, mdadm adds it to an md
  device (this is done by udev). An md event is then generated, and
  mdmonitor may try to move the drive to another md device if needed. It
  relies on by-path links, but the link to the newly appeared device has
  not been created yet because udev is still processing the device. As a
  result, recovery doesn't start immediately.
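
  The race described above can be illustrated with a small sketch. This
  is not mdmonitor's code and not the fix applied for this bug; it only
  shows the idea of waiting until a by-path link that resolves to the new
  device exists before acting on it. The function name wait_for_by_path
  and its arguments are hypothetical, and the default directory
  /dev/disk/by-path is assumed to be where udev places these links.

```shell
#!/bin/sh
# Hypothetical helper (not part of mdadm): wait until some symlink in a
# by-path directory resolves to the given device node, or give up after
# a timeout.
# Usage: wait_for_by_path DEVICE [TIMEOUT_SECONDS] [LINK_DIR]
wait_for_by_path() {
    dev=$(readlink -f "$1") || return 2   # canonical path of the device
    timeout=${2:-10}                      # assumed default: 10 seconds
    dir=${3:-/dev/disk/by-path}           # assumed udev link directory
    elapsed=0
    while :; do
        for link in "$dir"/*; do
            [ -L "$link" ] || continue    # skip anything that is not a symlink
            if [ "$(readlink -f "$link")" = "$dev" ]; then
                printf '%s\n' "$link"     # report the matching link
                return 0
            fi
        done
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
}
```

  For example, "wait_for_by_path /dev/nvme0n1 30" would print the
  matching link once udev has created it, or return 1 after about 30
  seconds. Running "udevadm settle" before looking up the link is
  another way to wait for udev to finish processing pending events.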

  Observed on Ubuntu 20.04.

  Steps to reproduce:

  1. Create RAID volume:
  # mdadm --create /dev/md/imsm0 --metadata=imsm --raid-devices=4 /dev/nvme6n1 /dev/nvme1n1 /dev/nvme7n1 /dev/nvme3n1 --run
  # mdadm --create /dev/md/r10d4s64-20_A --level=10 --chunk 64 --raid-devices=4 /dev/nvme6n1 /dev/nvme1n1 /dev/nvme7n1 /dev/nvme3n1 --run

  2. Add spare to container:
  # mdadm --add /dev/md/imsm0 /dev/nvme0n1

  3. Create appropriate policy line in /etc/mdadm/mdadm.conf.

  POLICY domain=RAID_DOMAIN_1 path=* action=spare-same-slot

  4. Disconnect spare from container.

  5. Start mdadm monitor with a long delay (here 6000 seconds, i.e. 100 minutes):
  # mdadm --monitor --delay 6000 --scan --mail=root@localhost --daemonize --syslog

  6. Hot remove disk from array (physical disconnect).

  7. Connect previously prepared spare.

  Expected results:
  Rebuild should start.

  Actual results:
  Rebuild does not start; the added spare ends up in a separate container.
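
  One way to check whether the rebuild has started is to look for a
  recovery progress line under the array's entry in /proc/mdstat. The
  sketch below assumes the usual /proc/mdstat layout (an "mdX : ..."
  header line followed by indented status lines); the function name
  md_recovery_running is hypothetical, and the optional second argument
  exists only so the check can be run against a saved copy of the file.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the named md device shows a
# "recovery = ..." progress line in /proc/mdstat, fail (exit 1) otherwise.
# Usage: md_recovery_running MD_NAME [MDSTAT_FILE]
md_recovery_running() {
    awk -v md="$1" '
        # A "name : ..." line starts a new section; track only our device.
        $2 == ":"               { in_dev = ($1 == md); next }
        # A blank line ends the current section.
        NF == 0                 { in_dev = 0 }
        in_dev && /recovery *=/ { found = 1 }
        END                     { exit (found ? 0 : 1) }
    ' "${2:-/proc/mdstat}"
}
```

  For example, "md_recovery_running md126 && echo rebuilding" would
  print "rebuilding" only while md126 is recovering; the name md126 is
  an assumption, so use the device name shown in /proc/mdstat on the
  test system.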

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/1888812/+subscriptions



