[Bug 1623700] Re: [SRU] multipath iscsi does not logout of sessions on xenial

Gustavo Randich gustavo.randich at gmail.com
Wed Mar 22 13:11:15 UTC 2017


Hi Joshua, I'll check this and give you feedback. Thanks!


On Wed, 22 Mar 2017 at 09:31, Hua Zhang <joshua.zhang at canonical.com>
wrote:

> @Gustavo,
>
> Liang's comment in patch set 3 [1] can explain your problem, he said:
>
> The dev can disappear momentarily right after 'multipath -r "dev"'. So
> it doesn't happen for every single path. If it did, it would cause a lot
> more issues. The multipath dev removal path reloads the dev near the
> beginning of the operation (_rescan_multipath). Thus the "stat" here can
> fail if it is executed before the dev node being re-created.
>
> 1, 'multipath -r' in _rescan_multipath() [10] can make the multipath dev
> disappear momentarily due to the bug [9].
>
> 2, _get_multipath_device_name() [2] uses 'multipath -ll' command to find
> multipath device name [3], so we saw:
>
> Jan 25 09:24:40 Lock "connect_volume" acquired by
> "os_brick.initiator.connector.disconnect_volume" :: waited 0.000s
> ...
> Jan 25 09:24:40 multipath ['-ll', u'/dev/sdr']:
> stdout=360080e5000297ea40000050658885f45 dm-6 NETAPP,INF-01-00#012
>
> 3, the connector then invokes _linuxscsi.remove_multipath_device() [4],
> so we saw:
>
> Jan 25 09:24:40 remove multipath device /dev/sdr'
>
> 4, find_multipath_device() is then invoked [5], followed by
> 'multipath -l' [6]
>
> 5, the "stat" right after 'multipath -r' here [7] can fail if it is
> executed before the dev node is re-created, so we saw:
>
> Jan 25 09:24:40 Couldn't find multipath device
> /dev/mapper/360080e5000297ea40000050658885f45
>
> The patch [1] tried to fix this problem, but it was later abandoned
> because the fix [8] already exists, which is also why I am trying to
> backport it.
> FYI, the root cause of your problem is a bug in multipath-tools [9], so
> you can also fix the problem by upgrading multipath-tools.
>
> [1] https://review.openstack.org/#/c/366065
> [2]
> https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/connector.py#L925
> [3]
> https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/connector.py#L1200
> [4]
> https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/connector.py#L935
> [5]
> https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/linuxscsi.py#L124
> [6]
> https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/linuxscsi.py#L263
> [7]
> https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/linuxscsi.py#L288
> [8] https://review.openstack.org/#/c/374421/
> [9] https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1621340
> [10]
> https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/connector.py#L918
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1623700
>
> Title:
>   [SRU] multipath iscsi does not logout of sessions on xenial
>
> Status in Ubuntu Cloud Archive:
>   Fix Released
> Status in Ubuntu Cloud Archive mitaka series:
>   Triaged
> Status in Ubuntu Cloud Archive newton series:
>   Triaged
> Status in os-brick:
>   Fix Released
> Status in python-os-brick package in Ubuntu:
>   Fix Released
> Status in python-os-brick source package in Xenial:
>   In Progress
> Status in python-os-brick source package in Yakkety:
>   Triaged
>
> Bug description:
>   [Impact]
>
>    * The reload (multipath -r) in _rescan_multipath can cause
>   /dev/mapper/<wwid> to be deleted and re-created (bug #1621340 tracks
>   this problem), which causes a number of downstream OpenStack issues.
>   For example, if os.stat(mdev) in _discover_mpath_device() runs right
>   in between the deletion and the re-creation, it fails to find the
>   file. As another example, when detaching a volume the iSCSI sessions
>   are not logged out, which leaves behind an mpath device and the iSCSI
>   /dev/disk/by-path devices as broken LUNs. So we should stop calling
>   multipath -r when attaching/detaching iSCSI volumes; multipath will
>   load devices on its own.
>
>   [Test Case]
>
>    * Enable iSCSI driver and cinder/nova multipath
>    * Detach an iSCSI volume
>    * Check that the devices/symlinks do not get messed up as shown below
>
>   [Regression Potential]
>
>    * None
>
>
>   stack at xenial-devstack-master-master-20160914-092014:~$ nova volume-attach 6e1017a7-6dea-418f-ad9b-879da085bd13 d1d68e04-a217-44ea-bb74-65e0de73e5f8
>   +----------+--------------------------------------+
>   | Property | Value                                |
>   +----------+--------------------------------------+
>   | device   | /dev/vdb                             |
>   | id       | d1d68e04-a217-44ea-bb74-65e0de73e5f8 |
>   | serverId | 6e1017a7-6dea-418f-ad9b-879da085bd13 |
>   | volumeId | d1d68e04-a217-44ea-bb74-65e0de73e5f8 |
>   +----------+--------------------------------------+
>
>   stack at xenial-devstack-master-master-20160914-092014:~$ cinder list
>   +--------------------------------------+--------+------+------+-------------+----------+--------------------------------------+
>   | ID                                   | Status | Name | Size | Volume Type | Bootable | Attached to                          |
>   +--------------------------------------+--------+------+------+-------------+----------+--------------------------------------+
>   | d1d68e04-a217-44ea-bb74-65e0de73e5f8 | in-use | -    | 1    | pure-iscsi  | false    | 6e1017a7-6dea-418f-ad9b-879da085bd13 |
>   +--------------------------------------+--------+------+------+-------------+----------+--------------------------------------+
>
>   stack at xenial-devstack-master-master-20160914-092014:~$ nova list
>   +--------------------------------------+------+--------+------------+-------------+---------------------------------+
>   | ID                                   | Name | Status | Task State | Power State | Networks                        |
>   +--------------------------------------+------+--------+------------+-------------+---------------------------------+
>   | 6e1017a7-6dea-418f-ad9b-879da085bd13 | test | ACTIVE | -          | Running     | public=172.24.4.12, 2001:db8::b |
>   +--------------------------------------+------+--------+------------+-------------+---------------------------------+
>
>   stack at xenial-devstack-master-master-20160914-092014:~$ sudo iscsiadm -m
> session
>   tcp: [5] 10.0.1.10:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [6] 10.0.5.10:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [7] 10.0.1.11:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [8] 10.0.5.11:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   stack at xenial-devstack-master-master-20160914-092014:~$ sudo iscsiadm -m
> node
>   10.0.1.11:3260,-1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
>   10.0.5.11:3260,-1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
>   10.0.5.10:3260,-1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
>   10.0.1.10:3260,-1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
>
>   stack at xenial-devstack-master-master-20160914-092014:~$ sudo tail -f
> /var/log/syslog
>   Sep 14 22:33:14 xenial-qemu-tester multipath: dm-0: failed to get udev
> uid: Invalid argument
>   Sep 14 22:33:14 xenial-qemu-tester multipath: dm-0: failed to get sysfs
> uid: Invalid argument
>   Sep 14 22:33:14 xenial-qemu-tester multipath: dm-0: failed to get sgio
> uid: No such file or directory
>   Sep 14 22:33:14 xenial-qemu-tester systemd[1347]:
> dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device: Dev
> dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device appeared
> twice with different sysfs paths
> /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and
> /sys/devices/virtual/block/dm-0
>   Sep 14 22:33:14 xenial-qemu-tester systemd[1347]:
> dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device: Dev
> dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device appeared
> twice with different sysfs paths
> /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and
> /sys/devices/virtual/block/dm-0
>   Sep 14 22:33:14 xenial-qemu-tester systemd[1]:
> dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device: Dev
> dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device appeared
> twice with different sysfs paths
> /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and
> /sys/devices/virtual/block/dm-0
>   Sep 14 22:33:14 xenial-qemu-tester systemd[1]:
> dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device: Dev
> dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device appeared
> twice with different sysfs paths
> /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and
> /sys/devices/virtual/block/dm-0
>   Sep 14 22:33:14 xenial-qemu-tester kernel: [22362.163521] audit:
> type=1400 audit(1473892394.556:21): apparmor="STATUS"
> operation="profile_replace" profile="unconfined"
> name="libvirt-6e1017a7-6dea-418f-ad9b-879da085bd13" pid=32665
> comm="apparmor_parser"
>   Sep 14 22:33:14 xenial-qemu-tester kernel: [22362.173614] audit:
> type=1400 audit(1473892394.568:22): apparmor="STATUS"
> operation="profile_replace" profile="unconfined"
> name="libvirt-6e1017a7-6dea-418f-ad9b-879da085bd13//qemu_bridge_helper"
> pid=32665 comm="apparmor_parser"
>   Sep 14 22:33:14 xenial-qemu-tester iscsid: Connection8:0 to [target:
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873, portal:
> 10.0.5.11,3260] through [iface: default] is operational now
>
>   stack at xenial-devstack-master-master-20160914-092014:~$ nova volume-detach 6e1017a7-6dea-418f-ad9b-879da085bd13 d1d68e04-a217-44ea-bb74-65e0de73e5f8
>   stack at xenial-devstack-master-master-20160914-092014:~$ sudo iscsiadm -m
> session
>   tcp: [5] 10.0.1.10:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [6] 10.0.5.10:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [7] 10.0.1.11:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [8] 10.0.5.11:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>
>   stack at xenial-devstack-master-master-20160914-092014:~$ cinder list
>   +--------------------------------------+-----------+------+------+-------------+----------+-------------+
>   | ID                                   | Status    | Name | Size | Volume Type | Bootable | Attached to |
>   +--------------------------------------+-----------+------+------+-------------+----------+-------------+
>   | d1d68e04-a217-44ea-bb74-65e0de73e5f8 | available | -    | 1    | pure-iscsi  | false    |             |
>   +--------------------------------------+-----------+------+------+-------------+----------+-------------+
>
>   stack at xenial-devstack-master-master-20160914-092014:~$ iscsiadm -m
> session
>   tcp: [5] 10.0.1.10:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [6] 10.0.5.10:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [7] 10.0.1.11:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [8] 10.0.5.11:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>
>   stack at xenial-devstack-master-master-20160914-092014:~$ sudo tail -f
> /var/log/syslog
>   Sep 14 22:48:10 xenial-qemu-tester kernel: [23257.736455]
> connection6:0: detected conn error (1020)
>   Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742036]
> connection5:0: detected conn error (1020)
>   Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742066]
> connection7:0: detected conn error (1020)
>   Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742139]
> connection8:0: detected conn error (1020)
>   Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742156]
> connection6:0: detected conn error (1020)
>   Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747638]
> connection5:0: detected conn error (1020)
>   Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747666]
> connection7:0: detected conn error (1020)
>   Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747710]
> connection8:0: detected conn error (1020)
>   Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747737]
> connection6:0: detected conn error (1020)
>   Sep 14 22:48:16 xenial-qemu-tester iscsid: message repeated 67 times: [
> conn 0 login rejected: initiator failed authorization with target]
>   Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.753999]
> connection6:0: detected conn error (1020)
>   Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.754019]
> connection8:0: detected conn error (1020)
>   Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.754105]
> connection5:0: detected conn error (1020)
>   Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.754146]
> connection7:0: detected conn error (1020)
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/cloud-archive/+bug/1623700/+subscriptions
>

-- 
You received this bug notification because you are a member of Ubuntu
Sponsors Team, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1623700

Title:
  [SRU] multipath iscsi does not logout of sessions on xenial

Status in Ubuntu Cloud Archive:
  Fix Released
Status in Ubuntu Cloud Archive mitaka series:
  Triaged
Status in Ubuntu Cloud Archive newton series:
  Triaged
Status in os-brick:
  Fix Released
Status in python-os-brick package in Ubuntu:
  Fix Released
Status in python-os-brick source package in Xenial:
  In Progress
Status in python-os-brick source package in Yakkety:
  Triaged

Bug description:
  [Impact]

   * multipath-tools has a bug where 'multipath -r' can cause
  /dev/mapper/<wwid> to be deleted and re-created momentarily (bug
  #1621340 tracks this problem), so os.stat(mdev) right after
  _rescan_multipath ('multipath -r') in os-brick can fail if it is
  executed before the multipath dev is re-created. As a result,
  multipath iSCSI does not log out of sessions on xenial.
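
The race described above can be illustrated with a short Python sketch. It is not the fix tracked in this bug (which, per the text here, is to stop forcing the reload); it only shows how a caller could tolerate the node being briefly absent. The helper name wait_for_device and its attempts/interval parameters are hypothetical and do not come from os-brick.

```python
import errno
import os
import time


def wait_for_device(path, attempts=5, interval=0.2):
    """Return os.stat(path), retrying while the node is briefly missing.

    Hypothetical helper for illustration only; not part of os-brick.
    """
    for attempt in range(attempts):
        try:
            return os.stat(path)
        except OSError as exc:
            # Only "no such file" is treated as the transient case; any
            # other error, or the last failed attempt, is re-raised.
            if exc.errno != errno.ENOENT or attempt == attempts - 1:
                raise
            time.sleep(interval)
```

A retry like this only narrows the window. Avoiding the reload, or upgrading multipath-tools, removes the cause rather than papering over it.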

  [Test Case]

   * Enable cinder multipath by adding iscsi_ip_address and iscsi_secondary_ip_addresses in cinder.conf
   * Enable nova multipath by adding iscsi_use_multipath=True in the [libvirt] section of nova.conf
   * Detach an iSCSI volume
   * Check that the devices/symlinks do not get messed up as shown below, or check that the multipath device /dev/mapper/<wwid> is not deleted and re-created momentarily
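
For the last check, one rough way to notice the device node flapping is to poll it while the detach runs. The sketch below assumes Python 3; watch_device and its parameters are made-up names, and a polling loop can miss a gap shorter than the interval, so a zero count is only weak evidence.

```python
import os
import time


def watch_device(path, duration=5.0, interval=0.01):
    """Poll `path` for `duration` seconds and count samples where it was missing.

    Hypothetical helper for illustration only.
    """
    missing = 0
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        if not os.path.exists(path):
            missing += 1
        time.sleep(interval)
    return missing
```

Start it on the /dev/mapper/<wwid> path of the attached volume just before issuing the detach. A non-zero result means the node was absent for at least one sample.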

  [Regression Potential]

   * multipath-tools loads devices on its own, so we shouldn't need to
  force multipathd to reload; there is no regression potential.

  stack at xenial-devstack-master-master-20160914-092014:~$ sudo iscsiadm -m session
  tcp: [5] 10.0.1.10:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
  tcp: [6] 10.0.5.10:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
  tcp: [7] 10.0.1.11:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
  tcp: [8] 10.0.5.11:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)

  stack at xenial-devstack-master-master-20160914-092014:~$ sudo iscsiadm -m node
  10.0.1.11:3260,-1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
  10.0.5.11:3260,-1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
  10.0.5.10:3260,-1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
  10.0.1.10:3260,-1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873

  stack at xenial-devstack-master-master-20160914-092014:~$ sudo tail -f /var/log/syslog
  Sep 14 22:33:14 xenial-qemu-tester multipath: dm-0: failed to get udev uid: Invalid argument
  Sep 14 22:33:14 xenial-qemu-tester multipath: dm-0: failed to get sysfs uid: Invalid argument
  Sep 14 22:33:14 xenial-qemu-tester multipath: dm-0: failed to get sgio uid: No such file or directory
  Sep 14 22:33:14 xenial-qemu-tester systemd[1347]: dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device: Dev dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device appeared twice with different sysfs paths /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and /sys/devices/virtual/block/dm-0
  Sep 14 22:33:14 xenial-qemu-tester systemd[1347]: dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device: Dev dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device appeared twice with different sysfs paths /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and /sys/devices/virtual/block/dm-0
  Sep 14 22:33:14 xenial-qemu-tester systemd[1]: dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device: Dev dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device appeared twice with different sysfs paths /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and /sys/devices/virtual/block/dm-0
  Sep 14 22:33:14 xenial-qemu-tester systemd[1]: dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device: Dev dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device appeared twice with different sysfs paths /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and /sys/devices/virtual/block/dm-0
  Sep 14 22:33:14 xenial-qemu-tester kernel: [22362.163521] audit: type=1400 audit(1473892394.556:21): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-6e1017a7-6dea-418f-ad9b-879da085bd13" pid=32665 comm="apparmor_parser"
  Sep 14 22:33:14 xenial-qemu-tester kernel: [22362.173614] audit: type=1400 audit(1473892394.568:22): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-6e1017a7-6dea-418f-ad9b-879da085bd13//qemu_bridge_helper" pid=32665 comm="apparmor_parser"
  Sep 14 22:33:14 xenial-qemu-tester iscsid: Connection8:0 to [target: iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873, portal: 10.0.5.11,3260] through [iface: default] is operational now

  stack at xenial-devstack-master-master-20160914-092014:~$ nova volume-detach 6e1017a7-6dea-418f-ad9b-879da085bd13 d1d68e04-a217-44ea-bb74-65e0de73e5f8

  stack at xenial-devstack-master-master-20160914-092014:~$ sudo iscsiadm -m session
  tcp: [5] 10.0.1.10:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
  tcp: [6] 10.0.5.10:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
  tcp: [7] 10.0.1.11:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
  tcp: [8] 10.0.5.11:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
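
The listing above still shows four sessions after the detach, which is the symptom this bug is about. A small Python sketch for checking this from captured output follows. It parses text in the format shown here and does not run iscsiadm itself; parse_sessions and sessions_for_target are hypothetical names, and the regular expression is only assumed to cover lines of this shape.

```python
import re

# Matches lines such as:
#   tcp: [5] 10.0.1.10:3260,1 iqn.2010-06.com.example:target (non-flash)
SESSION_RE = re.compile(
    r"^\s*(?P<transport>\w+): \[(?P<sid>\d+)\] "
    r"(?P<portal>\S+?),(?P<tpgt>-?\d+) (?P<iqn>\S+)"
)


def parse_sessions(output):
    """Return a list of (session id, portal, target IQN) tuples."""
    sessions = []
    for line in output.splitlines():
        match = SESSION_RE.match(line)
        if match:
            sessions.append(
                (int(match.group("sid")), match.group("portal"), match.group("iqn"))
            )
    return sessions


def sessions_for_target(output, iqn):
    """Return only the sessions that belong to the given target IQN."""
    return [s for s in parse_sessions(output) if s[2] == iqn]
```

If sessions_for_target() still returns entries for the volume's target once the detach has finished, the logout did not happen.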

  stack at xenial-devstack-master-master-20160914-092014:~$ sudo tail -f /var/log/syslog
  Sep 14 22:48:10 xenial-qemu-tester kernel: [23257.736455] connection6:0: detected conn error (1020)
  Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742036] connection5:0: detected conn error (1020)
  Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742066] connection7:0: detected conn error (1020)
  Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742139] connection8:0: detected conn error (1020)
  Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742156] connection6:0: detected conn error (1020)
  Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747638] connection5:0: detected conn error (1020)
  Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747666] connection7:0: detected conn error (1020)
  Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747710] connection8:0: detected conn error (1020)
  Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747737] connection6:0: detected conn error (1020)
  Sep 14 22:48:16 xenial-qemu-tester iscsid: message repeated 67 times: [ conn 0 login rejected: initiator failed authorization with target]
  Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.753999] connection6:0: detected conn error (1020)
  Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.754019] connection8:0: detected conn error (1020)
  Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.754105] connection5:0: detected conn error (1020)
  Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.754146] connection7:0: detected conn error (1020)
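
To summarise a log excerpt like the one above, the repeated kernel messages can be tallied per connection. This is a minimal Python sketch that assumes the message format shown here; count_conn_errors is a hypothetical name.

```python
import re
from collections import Counter

# Matches the kernel message "connectionN:M: detected conn error (CODE)".
CONN_ERROR_RE = re.compile(r"connection(\d+):\d+: detected conn error \(\d+\)")


def count_conn_errors(log_text):
    """Count 'detected conn error' kernel messages per connection number."""
    counts = Counter()
    for line in log_text.splitlines():
        match = CONN_ERROR_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

Errors that keep accumulating on every connection after a detach suggest the initiator is still retrying sessions that should have been logged out.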

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1623700/+subscriptions


