[Bug 644489] Re: constantly changes /dev/disk/by-id/{scsi,wwn}-* LUN symlinks with multipathing
Peter Petrakis
peter.petrakis at canonical.com
Tue Jun 21 20:34:32 UTC 2011
The "also affects udev" task should be removed.
** Changed in: udev (Ubuntu Lucid)
Status: New => Invalid
** Changed in: udev (Ubuntu Maverick)
Status: New => Invalid
** Changed in: udev (Ubuntu Natty)
Status: New => Invalid
** Package changed: udev (Ubuntu) => ubuntu
** Description changed:
+ = SRU Justification =
+
+ == Impact ==
+ Multipath-tools is inadvertently generating UDEV CHANGE events for the SD
+ block devices under its control. These change events feed back into the udev
+ rules, increasing CPU utilization and breaking the multipath aliasing feature,
+ which allows one to rename a multipath path from a series of letters and
+ numbers to a human-readable label. It gives users the impression that
+ the SAN is unstable.
+
+ == Solution ==
+ Change the open() flags in the priority checkers from read-write to
+ read-only. This stops sd from generating a change event after the file
+ descriptor has been closed.
+
+ Patch: https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/644489/+attachment/2177364/+files/multipath-tools-eliminate-udev-change-events-lp644489.debdiff
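
The flag change described in the Solution can be illustrated with a minimal sketch. This is not the patch itself: the real fix lives in the C sources of the multipath-tools priority checkers. The function names open_for_query and access_mode are hypothetical, and /dev/null stands in for a real block device so the example is harmless to run.

```python
import fcntl
import os


def open_for_query(device_path: str) -> int:
    """Open a device node read-only (hypothetical helper, not from the patch)."""
    # O_RDONLY instead of O_RDWR is the whole point of the change: per the
    # report, closing a descriptor opened read-write is what leads to the
    # change event on the sd device.
    return os.open(device_path, os.O_RDONLY)


def access_mode(fd: int) -> int:
    """Return the access-mode bits (O_RDONLY, O_WRONLY or O_RDWR) of an open fd."""
    return fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_ACCMODE


if __name__ == "__main__":
    # /dev/null is an assumption used only to keep the demo safe.
    fd = open_for_query("/dev/null")
    try:
        print("read-only:", access_mode(fd) == os.O_RDONLY)
    finally:
        os.close(fd)
```

Checking the access mode with F_GETFL is one way to confirm which flags a descriptor actually carries, without relying on side effects such as udev events.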
+
+ == Reproduction ==
+
+ Reproduction is easy and doesn't even require a SAN. Since we're dealing
+ with simple SCSI inquiry commands, any block device will do. Simply install
+ multipath-tools and execute one of the priority checkers like so:
+
+ /sbin/mpath_prio_emc /dev/sda
+
+ Also, keep a window open monitoring udev with "udevadm monitor", and ensure
+ beforehand that no change events are occurring for that block device.
+
+ TEST CASE:
+ root@kickseed:~# udevadm monitor &
+ [1] 16950
+ root@kickseed:~# monitor will print the received events for:
+ UDEV - the event which udev sends out after rule processing
+ KERNEL - the kernel uevent
+ root@kickseed:~#
+ root@kickseed:~# /sbin/mpath_prio_emc /dev/sda
+ query command indicates error0
+ root@kickseed:~# KERNEL[1308688009.806317] change /devices/pci0000:00/0000:00:07.0/0000:04:00.0/host0/port-0:0/expander-0:0/port-0:0:1/end_device-0:0:1/target0:0:0/0:0:0:0/block/sda (block)
+ UDEV [1308688009.823569] change /devices/pci0000:00/0000:00:07.0/0000:04:00.0/host0/port-0:0/expander-0:0/port-0:0:1/end_device-0:0:1/target0:0:0/0:0:0:0/block/sda (block)
+
+ root@kickseed:~#
+ root@kickseed:~#
+ root@kickseed:~# /sbin/mpath_prio_alua /dev/sda
+ 130
+
+ mpath_prio_alua doesn't generate any change events since its open
+ flags do not include O_RDWR to begin with.
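
The TEST CASE above is checked by eye, but the same check can be sketched in code. The following is a rough illustration only, assuming the line layout seen in the transcript (source tag, bracketed timestamp, action, devpath, subsystem in parentheses). The names MonitorEvent, parse_monitor_line and is_change_for are hypothetical and not part of udev or multipath-tools.

```python
import re
from typing import NamedTuple, Optional


class MonitorEvent(NamedTuple):
    source: str       # "KERNEL" or "UDEV"
    timestamp: float  # seconds, as printed inside the brackets
    action: str       # e.g. "change"
    devpath: str      # e.g. "/devices/.../block/sda"
    subsystem: str    # e.g. "block"


# Matches the event lines from the transcript; header lines such as
# "UDEV - the event which udev sends out ..." have no bracketed timestamp
# and therefore do not match.
_EVENT_RE = re.compile(
    r"(KERNEL|UDEV)\s*\[(\d+\.\d+)\]\s+(\S+)\s+(\S+)\s+\(([^)]+)\)"
)


def parse_monitor_line(line: str) -> Optional[MonitorEvent]:
    """Parse one line of monitor output, or return None if it is not an event."""
    # search() rather than match(), because a shell prompt may precede the
    # event on the same line when the monitor runs in the background.
    match = _EVENT_RE.search(line)
    if match is None:
        return None
    source, stamp, action, devpath, subsystem = match.groups()
    return MonitorEvent(source, float(stamp), action, devpath, subsystem)


def is_change_for(event: MonitorEvent, device_name: str) -> bool:
    """True if the event is a change event for the named block device."""
    return event.action == "change" and event.devpath.endswith(
        "/block/" + device_name
    )
```

Feeding each captured line through parse_monitor_line and counting the events for which is_change_for(event, "sda") holds gives a pass/fail signal: the count should stay at zero while a fixed priority checker runs.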
+
+ == Regression Potential ==
+ None, it's broken to begin with.
+ --------------------------
+
Binary package hint: udev
udevd constantly changes LUN device node symlinks (devices/LUNs, not the
partition nodes) in /dev/disk/by-id. udevd uses ~15% of CPU, and system
time is ~50-60%.
For example:
[jwm@syslog01.roch.ny:pts/0 /dev/disk/by-id> ls -l wwn-0x60a98000486e5339576f596675735354 wwn-0x60a98000486e5339576f596675744c36 scsi-360a98000486e5339576f596675735354 scsi-360a98000486e5339576f596675744c36; sleep 1; echo '======'; ls -l wwn-0x60a98000486e5339576f596675735354 wwn-0x60a98000486e5339576f596675744c36 scsi-360a98000486e5339576f596675735354 scsi-360a98000486e5339576f596675744c36
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675735354 -> ../../sde
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675744c36 -> ../../sdf
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675735354 -> ../../sde
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675744c36 -> ../../sdf
======
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675735354 -> ../../sdg
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675744c36 -> ../../sdh
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675735354 -> ../../sdg
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675744c36 -> ../../sdh
All other device nodes stay the same, such as the device nodes for the
partitions:
[jwm@syslog01.roch.ny:pts/0 /dev/disk/by-id> ls -l scsi-360a98000486e5339576f596675735354-part1; sleep 1; echo '======'; ls -l scsi-360a98000486e5339576f596675735354-part1
lrwxrwxrwx 1 root root 10 2010-09-21 15:47 scsi-360a98000486e5339576f596675735354-part1 -> ../../sdg1
======
lrwxrwxrwx 1 root root 10 2010-09-21 15:47 scsi-360a98000486e5339576f596675735354-part1 -> ../../sdg1
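
The "ls -l; sleep 1; ls -l" comparison above can also be done programmatically. This is a minimal sketch under stated assumptions: snapshot_links and changed_links are hypothetical helper names, and the directory and link names are supplied by the caller.

```python
import os
from typing import Dict, Iterable, Tuple


def snapshot_links(directory: str, names: Iterable[str]) -> Dict[str, str]:
    """Record the current target of each named symlink in a directory."""
    # os.readlink returns the raw link text (e.g. "../../sde") and does not
    # require the target to exist.
    return {name: os.readlink(os.path.join(directory, name)) for name in names}


def changed_links(
    before: Dict[str, str], after: Dict[str, str]
) -> Dict[str, Tuple[str, str]]:
    """Return {name: (old_target, new_target)} for links whose target moved."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }
```

Taking one snapshot of /dev/disk/by-id, sleeping for a second, and taking another would, on an affected system, be expected to report the scsi-* and wwn-* LUN links as changed while the -part1 links stay put, matching the listings above.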
-
- I'm not entirely sure whether this is udev's problem or something related to multipathing. Our most recent experience with multipathing is the last LTS release (hardy), which doesn't exhibit this behavior given similar configurations.
-
+ I'm not entirely sure whether this is udev's problem or something
+ related to multipathing. Our most recent experience with multipathing is
+ the last LTS release (hardy), which doesn't exhibit this behavior given
+ similar configurations.
[jwm@syslog01.roch.ny:pts/0 ~> sudo multipath -ll
- rootvol (360a98000486e5339576f596675735354) dm-1 NETAPP ,LUN
+ rootvol (360a98000486e5339576f596675735354) dm-1 NETAPP ,LUN
[size=36G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=8][active]
- \_ 2:0:2:0 sda 8:0 [active][ready]
- \_ 3:0:2:0 sde 8:64 [active][ready]
+ \_ 2:0:2:0 sda 8:0 [active][ready]
+ \_ 3:0:2:0 sde 8:64 [active][ready]
\_ round-robin 0 [prio=2][enabled]
- \_ 3:0:3:0 sdg 8:96 [active][ready]
- \_ 2:0:3:0 sdc 8:32 [active][ready]
- syslog-data (360a98000486e5339576f596675744c36) dm-0 NETAPP ,LUN
+ \_ 3:0:3:0 sdg 8:96 [active][ready]
+ \_ 2:0:3:0 sdc 8:32 [active][ready]
+ syslog-data (360a98000486e5339576f596675744c36) dm-0 NETAPP ,LUN
[size=1.0T][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=8][active]
- \_ 2:0:2:1 sdb 8:16 [active][ready]
- \_ 3:0:2:1 sdf 8:80 [active][ready]
+ \_ 2:0:2:1 sdb 8:16 [active][ready]
+ \_ 3:0:2:1 sdf 8:80 [active][ready]
\_ round-robin 0 [prio=2][enabled]
- \_ 3:0:3:1 sdh 8:112 [active][ready]
- \_ 2:0:3:1 sdd 8:48 [active][ready]
- [jwm@syslog01.roch.ny:pts/0 ~> cat /etc/multipath.conf
+ \_ 3:0:3:1 sdh 8:112 [active][ready]
+ \_ 2:0:3:1 sdd 8:48 [active][ready]
+ [jwm@syslog01.roch.ny:pts/0 ~> cat /etc/multipath.conf
multipaths {
- multipath {
- wwid 360a98000486e5339576f596675735354
- alias rootvol
- }
- multipath {
- wwid 360a98000486e5339576f596675744c36
- alias syslog-data
- }
+ multipath {
+ wwid 360a98000486e5339576f596675735354
+ alias rootvol
+ }
+ multipath {
+ wwid 360a98000486e5339576f596675744c36
+ alias syslog-data
+ }
}
devices {
- device {
- vendor "NETAPP "
- product "LUN "
- path_checker tur
- path_grouping_policy group_by_prio
- prio_callout "/sbin/mpath_prio_netapp /dev/%n"
- failback immediate
- rr_min_io 128
- no_path_retry queue
- }
+ device {
+ vendor "NETAPP "
+ product "LUN "
+ path_checker tur
+ path_grouping_policy group_by_prio
+ prio_callout "/sbin/mpath_prio_netapp /dev/%n"
+ failback immediate
+ rr_min_io 128
+ no_path_retry queue
+ }
}
--
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/644489
Title:
constantly changes /dev/disk/by-id/{scsi,wwn}-* LUN symlinks with
multipathing