[Bug 644489] Re: constantly changes /dev/disk/by-id/{scsi, wwn}-* LUN symlinks with multipathing
Ubuntu QA's Bug Bot
bug-stats at murraytwins.com
Mon Sep 19 21:30:13 UTC 2011
** Tags added: testcase
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/644489
Title:
constantly changes /dev/disk/by-id/{scsi,wwn}-* LUN symlinks with
multipathing
Status in Ubuntu:
Invalid
Status in “multipath-tools” package in Ubuntu:
Fix Released
Status in The Lucid Lynx:
Invalid
Status in “multipath-tools” source package in Lucid:
Fix Released
Status in The Maverick Meerkat:
Invalid
Status in “multipath-tools” source package in Maverick:
Fix Released
Status in The Natty Narwhal:
Invalid
Status in “multipath-tools” source package in Natty:
Fix Released
Bug description:
= SRU Justification =
== Impact ==
Multipath-tools is inadvertently generating udev change events for the sd
block devices under its control. These change events feed back into the udev
rules, increasing CPU utilization and ruining the multipath aliasing feature,
which allows one to rename a multipath path from a series of letters and
numbers to a human-readable label. It gives users the impression that
the SAN is unstable.
== Solution ==
Change the open() flags in the priority checkers from read-write to read-only.
This stops sd from generating a change event after the file descriptor has been
closed.
Patch: https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/644489/+attachment/2177364/+files/multipath-tools-eliminate-udev-change-events-lp644489.debdiff
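As a rough illustration of the flag change described in the Solution, the
sketch below opens the same file once read-write and once read-only and
reports the access mode of each descriptor. It is not the patch itself (the
priority checkers are C programs); the helper names open_for_inquiry() and
access_mode() are made up for this example, and a temporary file stands in
for a block device such as /dev/sda.

```python
import fcntl
import os
import tempfile


def access_mode(fd):
    """Return the access-mode bits (O_RDONLY, O_WRONLY or O_RDWR) of an open fd."""
    return fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_ACCMODE


def open_for_inquiry(path, read_only=True):
    """Hypothetical stand-in for the open() call in a priority checker.

    read_only=False mimics the old behaviour (O_RDWR);
    read_only=True mimics the behaviour after the fix (O_RDONLY).
    """
    flags = os.O_RDONLY if read_only else os.O_RDWR
    return os.open(path, flags)


def demo():
    """Open a scratch file both ways and return (old_mode, new_mode)."""
    with tempfile.NamedTemporaryFile() as tmp:
        fd_old = open_for_inquiry(tmp.name, read_only=False)
        fd_new = open_for_inquiry(tmp.name, read_only=True)
        try:
            return access_mode(fd_old), access_mode(fd_new)
        finally:
            os.close(fd_old)
            os.close(fd_new)
```

Per the report, only closing a descriptor of the first kind on an sd device
results in a change event; a read-only descriptor is enough for the checkers
tested here.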
== Reproduction ==
Reproduction is easy and doesn't even require a SAN. Since we're dealing with
simple SCSI inquiry commands, any block device will do. Simply install
multipath-tools and execute one of the priority checkers like so:
/sbin/mpath_prio_emc /dev/sda
Also, keep a window open monitoring udev with udevadm monitor, and ensure
that no change events for that block device are occurring beforehand.
TEST CASE:
root@kickseed:~# udevadm monitor &
[1] 16950
root@kickseed:~# monitor will print the received events for:
UDEV - the event which udev sends out after rule processing
KERNEL - the kernel uevent
root@kickseed:~#
root@kickseed:~# /sbin/mpath_prio_emc /dev/sda
query command indicates error0
root@kickseed:~# KERNEL[1308688009.806317] change /devices/pci0000:00/0000:00:07.0/0000:04:00.0/host0/port-0:0/expander-0:0/port-0:0:1/end_device-0:0:1/target0:0:0/0:0:0:0/block/sda (block)
UDEV [1308688009.823569] change /devices/pci0000:00/0000:00:07.0/0000:04:00.0/host0/port-0:0/expander-0:0/port-0:0:1/end_device-0:0:1/target0:0:0/0:0:0:0/block/sda (block)
root@kickseed:~#
root@kickseed:~#
root@kickseed:~# /sbin/mpath_prio_alua /dev/sda
130
mpath_prio_alua doesn't generate any change events since its open
flags do not include O_RDWR to begin with.
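To check the test case without watching the terminal, the udevadm monitor
output can be captured and scanned for change events. The following is a
minimal sketch, assuming the line format shown in the transcript above;
count_change_events() and EVENT_RE are names invented for this example and
are not part of udev or multipath-tools.

```python
import re
from collections import Counter

# Matches monitor lines such as:
#   KERNEL[1308688009.806317] change /devices/.../block/sda (block)
#   UDEV [1308688009.823569] change /devices/.../block/sda (block)
# search() is used below so that a shell prompt in front of the line is ignored.
EVENT_RE = re.compile(
    r"(?P<source>KERNEL|UDEV)\s*\[(?P<stamp>\d+\.\d+)\]\s+"
    r"(?P<action>\w+)\s+(?P<devpath>\S+)\s+\((?P<subsystem>[^)]+)\)"
)


def count_change_events(lines, source="UDEV"):
    """Count block-device change events per device name (e.g. 'sda').

    `source` selects which half of the monitor output to count:
    "KERNEL" for kernel uevents, "UDEV" for events after rule processing.
    """
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue  # banner text, prompts, checker output, etc.
        if (
            match.group("source") == source
            and match.group("action") == "change"
            and match.group("subsystem") == "block"
        ):
            # The last component of the devpath is the kernel device name.
            counts[match.group("devpath").rsplit("/", 1)[-1]] += 1
    return counts
```

With the fix applied, running a priority checker against /dev/sda should
leave the count for sda unchanged.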
== Regression potential ==
None, it's broken to begin with.
--------------------------
Binary package hint: udev
udevd constantly changes LUN device node symlinks (devices/LUNs, not
the partition nodes) in /dev/disk/by-id. udevd uses ~15% of CPU, and
system time accounts for ~50-60%.
For example:
[jwm@syslog01.roch.ny:pts/0 /dev/disk/by-id> ls -l wwn-0x60a98000486e5339576f596675735354 wwn-0x60a98000486e5339576f596675744c36 scsi-360a98000486e5339576f596675735354 scsi-360a98000486e5339576f596675744c36; sleep 1; echo '======'; ls -l wwn-0x60a98000486e5339576f596675735354 wwn-0x60a98000486e5339576f596675744c36 scsi-360a98000486e5339576f596675735354 scsi-360a98000486e5339576f596675744c36
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675735354 -> ../../sde
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675744c36 -> ../../sdf
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675735354 -> ../../sde
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675744c36 -> ../../sdf
======
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675735354 -> ../../sdg
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675744c36 -> ../../sdh
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675735354 -> ../../sdg
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675744c36 -> ../../sdh
All other device nodes stay the same, such as the device nodes for the
partitions:
[jwm@syslog01.roch.ny:pts/0 /dev/disk/by-id> ls -l scsi-360a98000486e5339576f596675735354-part1; sleep 1; echo '======'; ls -l scsi-360a98000486e5339576f596675735354-part1
lrwxrwxrwx 1 root root 10 2010-09-21 15:47 scsi-360a98000486e5339576f596675735354-part1 -> ../../sdg1
======
lrwxrwxrwx 1 root root 10 2010-09-21 15:47 scsi-360a98000486e5339576f596675735354-part1 -> ../../sdg1
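The ls/sleep/ls comparison above can also be done programmatically. This is
only a sketch of that comparison; snapshot() and changed_links() are
hypothetical helper names, and the default prefixes are taken from the
symlink names shown in this report.

```python
import os


def snapshot(link_dir, prefixes=("scsi-", "wwn-")):
    """Map each matching symlink name in `link_dir` to its current target."""
    targets = {}
    for name in os.listdir(link_dir):
        if not name.startswith(prefixes):
            continue
        path = os.path.join(link_dir, name)
        if not os.path.islink(path):
            continue
        try:
            targets[name] = os.readlink(path)
        except OSError:
            # The link was removed or replaced while we were looking at it.
            continue
    return targets


def changed_links(before, after):
    """Return {name: (old_target, new_target)} for links whose target moved."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }
```

Taking snapshot("/dev/disk/by-id") twice, one second apart, and passing both
results to changed_links() should list the whole-LUN links on an affected
system while leaving out the -part1 links, which keep pointing at the same
node.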
I'm not entirely sure whether this is udev's problem or something
related to multipathing. Our most recent experience with multipathing
is the last LTS release (hardy), which doesn't exhibit this behavior
given similar configurations.
[jwm@syslog01.roch.ny:pts/0 ~> sudo multipath -ll
rootvol (360a98000486e5339576f596675735354) dm-1 NETAPP ,LUN
[size=36G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=8][active]
 \_ 2:0:2:0 sda 8:0 [active][ready]
 \_ 3:0:2:0 sde 8:64 [active][ready]
\_ round-robin 0 [prio=2][enabled]
 \_ 3:0:3:0 sdg 8:96 [active][ready]
 \_ 2:0:3:0 sdc 8:32 [active][ready]
syslog-data (360a98000486e5339576f596675744c36) dm-0 NETAPP ,LUN
[size=1.0T][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=8][active]
 \_ 2:0:2:1 sdb 8:16 [active][ready]
 \_ 3:0:2:1 sdf 8:80 [active][ready]
\_ round-robin 0 [prio=2][enabled]
 \_ 3:0:3:1 sdh 8:112 [active][ready]
 \_ 2:0:3:1 sdd 8:48 [active][ready]
[jwm@syslog01.roch.ny:pts/0 ~> cat /etc/multipath.conf
multipaths {
    multipath {
        wwid 360a98000486e5339576f596675735354
        alias rootvol
    }
    multipath {
        wwid 360a98000486e5339576f596675744c36
        alias syslog-data
    }
}
devices {
    device {
        vendor "NETAPP "
        product "LUN "
        path_checker tur
        path_grouping_policy group_by_prio
        prio_callout "/sbin/mpath_prio_netapp /dev/%n"
        failback immediate
        rr_min_io 128
        no_path_retry queue
    }
}
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+bug/644489/+subscriptions