[Bug 690387] Re: udev block naming breaks failover and sd kref release cycle
Serge Hallyn
690387 at bugs.launchpad.net
Thu Jul 21 16:29:15 UTC 2011
** Description changed:
Binary package hint: multipath-tools
+
+ ================================================================
+ SRU Justification:
+ 1. Impact: multipath failover events are not handled by multipathd. The system therefore does not cleanly survive failover/failback (FOFB).
+ 2. How addressed: a fix from upstream is cherry-picked which makes multipathd look for the correct device pathnames in the uevents it sees.
+ 3. Patch: see debdiff attached
+ 4. To reproduce: Start multipathd with the -v4 argument. Reset an SCM to initiate FOFB. Watch /var/log/daemon.log for uevents relating to drives. Without the patch they don't show up. Additionally, upon failback, new block device names will be used (e.g. /dev/sdo instead of /dev/sda).
+ 5. Regression potential: the patch is to multipathd, so non-multipath users will not be affected. multipathd users running an older kernel that still uses the old pathnames could see the same behavior that up-to-date users see today without this patch.
+
+ ================================================================
This was exposed on the Intel IMS SAN which is an ODM'd Promise Vtrak
variant on 10.04 server. The SAN has Active/Standby capabilities and
is configured for failover. It probably affects other SANs too.
Setup:
multipath'd SAN consisting of SD block devices.
Symptoms:
On failover, multipath isn't getting the right signals to tear down
the defunct path. This was traced to the fact that the path udev
presented to multipath was different from the one it was expecting.
multipath simply dropped the request to gracefully remove the device,
and instead responded to the SCSI mid-layer SD IO state change,
SDEV_CANCEL/DEL, which puts the device offline.
The problem is that device mapper still holds a handle on the SD
device, as can be seen from /sys/block/dm-x/slaves, and as a result
scsi_target_destroy is never called. The outward symptom of this
is that the SD suffix is not recycled, because the previous
reference was never dropped.
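
One way to check which SD devices device mapper still holds is to list
the slaves directory of each dm device. The following is a minimal
sketch only: the function name dm_slaves and its sys_block parameter
are hypothetical, and it assumes the usual sysfs layout of
/sys/block/dm-x/slaves mentioned above.

```python
import os


def dm_slaves(sys_block="/sys/block"):
    """Map each dm-* device to the sorted entries of its slaves/ directory.

    Hypothetical helper for illustration; sys_block is a parameter so the
    function can be pointed at a different sysfs root.
    """
    result = {}
    if not os.path.isdir(sys_block):
        # No sysfs block directory available (e.g. not on Linux).
        return result
    for name in sorted(os.listdir(sys_block)):
        if not name.startswith("dm-"):
            continue
        slaves_dir = os.path.join(sys_block, name, "slaves")
        if os.path.isdir(slaves_dir):
            result[name] = sorted(os.listdir(slaves_dir))
    return result
```

Calling dm_slaves() after a failover and finding the defunct sd device
still listed under a dm device would be consistent with the reference
that was never dropped.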
Solution:
A fix was developed independently of upstream by Serge Hallyn.
It was later found that this had already been fixed upstream, in 2008.
The patch is:
commit 7fa7affc3d23dd9dc906804d22a61144bca9f9b9
Author: Benjamin Marzinski <bmarzins at redhat.com>
Date: Thu Dec 11 16:03:28 2008 -0600
Fix for uevent devpath handling
This is necessary to make uevents work on fedora, since devpath appears as
something like:
'/devices/pci0000:00/0000:00:0a.0/0000:06:00.0/host11/rport-11:0-1/target11:0:1/11:0:1:0/block/s
It simply strips off everything up to the /block.
Signed-off-by: Benjamin Marzinski <bmarzins at redhat.com>
It integrates simply and can be found in PPAs here:
ppa:peter-petrakis/storage
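
The devpath handling described in the commit message above can be
sketched roughly as follows. This is an illustration in Python, not the
upstream C change: the function name strip_to_block is hypothetical,
and the devpath in the usage note is a made-up example rather than one
taken from this bug.

```python
def strip_to_block(devpath: str) -> str:
    """Drop everything before the first '/block/' component of a devpath.

    Hypothetical helper: returns the devpath unchanged when it contains
    no '/block/' component.
    """
    marker = "/block/"
    idx = devpath.find(marker)
    if idx == -1:
        return devpath
    return devpath[idx:]
```

For a made-up devpath such as
'/devices/pci0000:00/0000:00:1f.2/host3/target3:0:0/3:0:0:0/block/sdb'
this returns '/block/sdb', while a devpath that already starts with
/block is returned unchanged.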
--
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/690387
Title:
udev block naming breaks failover and sd kref release cycle
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/690387/+subscriptions