[Bug 1623700] Re: [SRU] multipath iscsi does not logout of sessions on xenial
Gustavo Randich
gustavo.randich at gmail.com
Wed Mar 22 13:11:15 UTC 2017
Hi Joshua, I'll check this and give you feedback. Thanks!
On Wed, 22 Mar 2017 at 09:31, Hua Zhang <joshua.zhang at canonical.com>
wrote:
> @Gustavo,
>
> Liang's comment in patch set 3 [1] can explain your problem, he said:
>
> The dev can disappear momentarily right after 'multipath -r "dev"'. So
> it doesn't happen for every single path. If it did, it would cause a lot
> more issues. The multipath dev removal path reloads the dev near the
> beginning of the operation (_rescan_multipath). Thus the "stat" here can
> fail if it is executed before the dev node being re-created.
>
> 1, 'multipath -r' in _rescan_multipath() [10] can make the multipath dev
> disappear momentarily due to the bug [9].
>
> 2, _get_multipath_device_name() [2] uses 'multipath -ll' command to find
> multipath device name [3], so we saw:
>
> Jan 25 09:24:40 Lock "connect_volume" acquired by
> "os_brick.initiator.connector.disconnect_volume" :: waited 0.000s
> ...
> Jan 25 09:24:40 multipath ['-ll', u'/dev/sdr']:
> stdout=360080e5000297ea40000050658885f45 dm-6 NETAPP,INF-01-00#012
>
> 3, _linuxscsi.remove_multipath_device() is then invoked [4], so we saw:
>
> Jan 25 09:24:40 remove multipath device /dev/sdr'
>
> 4, then find_multipath_device() is invoked [5], which in turn invokes
> 'multipath -l' [6].
>
> 5, the "stat" right after 'multipath -r' here [7] can fail if it is
> executed before the dev node is re-created, so we saw:
>
> Jan 25 09:24:40 Couldn't find multipath device
> /dev/mapper/360080e5000297ea40000050658885f45
>
> The patch [1] tried to fix this problem, but it was later abandoned
> because the fix [8] already addresses it. That is also why I am trying to
> backport [8].
> FYI, the root cause of your problem is a bug in multipath-tools [9], so you
> can also fix the problem by upgrading multipath-tools.
>
> [1] https://review.openstack.org/#/c/366065
> [2]
> https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/connector.py#L925
> [3]
> https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/connector.py#L1200
> [4]
> https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/connector.py#L935
> [5]
> https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/linuxscsi.py#L124
> [6]
> https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/linuxscsi.py#L263
> [7]
> https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/linuxscsi.py#L288
> [8] https://review.openstack.org/#/c/374421/
> [9] https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1621340
> [10]
> https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/connector.py#L918
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1623700
>
> Title:
> [SRU] multipath iscsi does not logout of sessions on xenial
>
> Status in Ubuntu Cloud Archive:
> Fix Released
> Status in Ubuntu Cloud Archive mitaka series:
> Triaged
> Status in Ubuntu Cloud Archive newton series:
> Triaged
> Status in os-brick:
> Fix Released
> Status in python-os-brick package in Ubuntu:
> Fix Released
> Status in python-os-brick source package in Xenial:
> In Progress
> Status in python-os-brick source package in Yakkety:
> Triaged
>
> Bug description:
> [Impact]
>
> * The reload (multipath -r) in _rescan_multipath can cause
> /dev/mapper/<wwid> to be deleted and re-created (bug #1621340 is used
> to track this problem), which causes further downstream OpenStack
> issues. For example, if os.stat(mdev) is called by
> _discover_mpath_device() right in between the deletion and the
> re-creation, it will fail to find the file. As a result, when
> detaching a volume the iSCSI sessions are not logged out. This
> leaves behind an mpath device and the iSCSI /dev/disk/by-path devices
> as broken LUNs. So we should stop calling multipath -r when
> attaching/detaching iSCSI volumes; multipath will load devices on its
> own.
>
> [Test Case]
>
> * Enable iSCSI driver and cinder/nova multipath
> * Detach an iSCSI volume
> * Check that the devices/symlinks do not get messed up as shown in the logs below
>
> [Regression Potential]
>
> * None
>
>
> stack at xenial-devstack-master-master-20160914-092014:~$ nova volume-attach 6e1017a7-6dea-418f-ad9b-879da085bd13 d1d68e04-a217-44ea-bb74-65e0de73e5f8
> +----------+--------------------------------------+
> | Property | Value                                |
> +----------+--------------------------------------+
> | device   | /dev/vdb                             |
> | id       | d1d68e04-a217-44ea-bb74-65e0de73e5f8 |
> | serverId | 6e1017a7-6dea-418f-ad9b-879da085bd13 |
> | volumeId | d1d68e04-a217-44ea-bb74-65e0de73e5f8 |
> +----------+--------------------------------------+
>
> stack at xenial-devstack-master-master-20160914-092014:~$ cinder list
> +--------------------------------------+--------+------+------+-------------+----------+--------------------------------------+
> | ID                                   | Status | Name | Size | Volume Type | Bootable | Attached to                          |
> +--------------------------------------+--------+------+------+-------------+----------+--------------------------------------+
> | d1d68e04-a217-44ea-bb74-65e0de73e5f8 | in-use | -    | 1    | pure-iscsi  | false    | 6e1017a7-6dea-418f-ad9b-879da085bd13 |
> +--------------------------------------+--------+------+------+-------------+----------+--------------------------------------+
>
> stack at xenial-devstack-master-master-20160914-092014:~$ nova list
> +--------------------------------------+------+--------+------------+-------------+---------------------------------+
> | ID                                   | Name | Status | Task State | Power State | Networks                        |
> +--------------------------------------+------+--------+------------+-------------+---------------------------------+
> | 6e1017a7-6dea-418f-ad9b-879da085bd13 | test | ACTIVE | -          | Running     | public=172.24.4.12, 2001:db8::b |
> +--------------------------------------+------+--------+------------+-------------+---------------------------------+
>
> stack at xenial-devstack-master-master-20160914-092014:~$ sudo iscsiadm -m
> session
> tcp: [5] 10.0.1.10:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
> tcp: [6] 10.0.5.10:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
> tcp: [7] 10.0.1.11:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
> tcp: [8] 10.0.5.11:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
> stack at xenial-devstack-master-master-20160914-092014:~$ sudo iscsiadm -m
> node
> 10.0.1.11:3260,-1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
> 10.0.5.11:3260,-1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
> 10.0.5.10:3260,-1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
> 10.0.1.10:3260,-1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
>
> stack at xenial-devstack-master-master-20160914-092014:~$ sudo tail -f
> /var/log/syslog
> Sep 14 22:33:14 xenial-qemu-tester multipath: dm-0: failed to get udev
> uid: Invalid argument
> Sep 14 22:33:14 xenial-qemu-tester multipath: dm-0: failed to get sysfs
> uid: Invalid argument
> Sep 14 22:33:14 xenial-qemu-tester multipath: dm-0: failed to get sgio
> uid: No such file or directory
> Sep 14 22:33:14 xenial-qemu-tester systemd[1347]:
> dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device: Dev
> dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device appeared
> twice with different sysfs paths
> /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and
> /sys/devices/virtual/block/dm-0
> Sep 14 22:33:14 xenial-qemu-tester systemd[1347]:
> dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device: Dev
> dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device appeared
> twice with different sysfs paths
> /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and
> /sys/devices/virtual/block/dm-0
> Sep 14 22:33:14 xenial-qemu-tester systemd[1]:
> dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device: Dev
> dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device appeared
> twice with different sysfs paths
> /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and
> /sys/devices/virtual/block/dm-0
> Sep 14 22:33:14 xenial-qemu-tester systemd[1]:
> dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device: Dev
> dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device appeared
> twice with different sysfs paths
> /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and
> /sys/devices/virtual/block/dm-0
> Sep 14 22:33:14 xenial-qemu-tester kernel: [22362.163521] audit:
> type=1400 audit(1473892394.556:21): apparmor="STATUS"
> operation="profile_replace" profile="unconfined"
> name="libvirt-6e1017a7-6dea-418f-ad9b-879da085bd13" pid=32665
> comm="apparmor_parser"
> Sep 14 22:33:14 xenial-qemu-tester kernel: [22362.173614] audit:
> type=1400 audit(1473892394.568:22): apparmor="STATUS"
> operation="profile_replace" profile="unconfined"
> name="libvirt-6e1017a7-6dea-418f-ad9b-879da085bd13//qemu_bridge_helper"
> pid=32665 comm="apparmor_parser"
> Sep 14 22:33:14 xenial-qemu-tester iscsid: Connection8:0 to [target:
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873, portal:
> 10.0.5.11,3260] through [iface: default] is operational now
>
> stack at xenial-devstack-master-master-20160914-092014:~$ nova
> volume-detach 6e1017a7-6dea-418f-ad9b-879da085bd13
> d1d68e04-a217-44ea-bb74-65e0de73e5f8
> stack at xenial-devstack-master-master-20160914-092014:~$ sudo iscsiadm -m
> session
> tcp: [5] 10.0.1.10:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
> tcp: [6] 10.0.5.10:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
> tcp: [7] 10.0.1.11:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
> tcp: [8] 10.0.5.11:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>
> stack at xenial-devstack-master-master-20160914-092014:~$ cinder list
> +--------------------------------------+-----------+------+------+-------------+----------+-------------+
> | ID                                   | Status    | Name | Size | Volume Type | Bootable | Attached to |
> +--------------------------------------+-----------+------+------+-------------+----------+-------------+
> | d1d68e04-a217-44ea-bb74-65e0de73e5f8 | available | -    | 1    | pure-iscsi  | false    |             |
> +--------------------------------------+-----------+------+------+-------------+----------+-------------+
>
> stack at xenial-devstack-master-master-20160914-092014:~$ iscsiadm -m
> session
> tcp: [5] 10.0.1.10:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
> tcp: [6] 10.0.5.10:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
> tcp: [7] 10.0.1.11:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
> tcp: [8] 10.0.5.11:3260,1
> iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>
> stack at xenial-devstack-master-master-20160914-092014:~$ sudo tail -f
> /var/log/syslog
> Sep 14 22:48:10 xenial-qemu-tester kernel: [23257.736455]
> connection6:0: detected conn error (1020)
> Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742036]
> connection5:0: detected conn error (1020)
> Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742066]
> connection7:0: detected conn error (1020)
> Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742139]
> connection8:0: detected conn error (1020)
> Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742156]
> connection6:0: detected conn error (1020)
> Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747638]
> connection5:0: detected conn error (1020)
> Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747666]
> connection7:0: detected conn error (1020)
> Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747710]
> connection8:0: detected conn error (1020)
> Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747737]
> connection6:0: detected conn error (1020)
> Sep 14 22:48:16 xenial-qemu-tester iscsid: message repeated 67 times: [
> conn 0 login rejected: initiator failed authorization with target]
> Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.753999]
> connection6:0: detected conn error (1020)
> Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.754019]
> connection8:0: detected conn error (1020)
> Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.754105]
> connection5:0: detected conn error (1020)
> Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.754146]
> connection7:0: detected conn error (1020)
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/cloud-archive/+bug/1623700/+subscriptions
>
--
You received this bug notification because you are a member of Ubuntu
Sponsors Team, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1623700
Title:
[SRU] multipath iscsi does not logout of sessions on xenial
Status in Ubuntu Cloud Archive:
Fix Released
Status in Ubuntu Cloud Archive mitaka series:
Triaged
Status in Ubuntu Cloud Archive newton series:
Triaged
Status in os-brick:
Fix Released
Status in python-os-brick package in Ubuntu:
Fix Released
Status in python-os-brick source package in Xenial:
In Progress
Status in python-os-brick source package in Yakkety:
Triaged
Bug description:
[Impact]
* multipath-tools has a bug where 'multipath -r' can cause
/dev/mapper/<wwid> to be deleted and re-created momentarily (bug
#1621340 is used to track this problem). As a result, os.stat(mdev) right
after _rescan_multipath ('multipath -r') in os-brick can fail if it is
executed before the multipath dev is re-created. This in turn leads to
multipath iSCSI sessions not being logged out on xenial.
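The race described in [Impact] can be illustrated with a small sketch. It is not the fix tracked in this bug; it only shows why a single os.stat() issued right after a reload can fail, and how polling for the node would tolerate the gap. The helper name wait_for_path and its timeout values are assumptions made for illustration and are not part of os-brick.

```python
import os
import time


def wait_for_path(path, timeout=2.0, interval=0.1):
    """Poll until `path` exists, then return its os.stat() result.

    Hypothetical helper (not an os-brick API). A device node such as
    /dev/mapper/<wwid> may vanish briefly after 'multipath -r', so one
    immediate os.stat() can raise FileNotFoundError even though the
    node comes back a moment later.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return os.stat(path)
        except FileNotFoundError:
            # Give up only once the deadline has passed; otherwise retry.
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)
```

A caller would pass the mapper path it expects to see, for example wait_for_path("/dev/mapper/<wwid>") with the real WWID filled in. Polling like this only hides the symptom; it does not address the reload itself.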
[Test Case]
* Enable cinder multipath by adding iscsi_ip_address and iscsi_secondary_ip_addresses in cinder.conf
* Enable nova multipath by adding iscsi_use_multipath=True in the [libvirt] section of nova.conf
* Detach an iSCSI volume
* Check that the devices/symlinks do not get messed up as shown below, or check that the multipath device /dev/mapper/<wwid> is not deleted and re-created momentarily
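The last check of the test case could be scripted roughly as follows. This is a sketch under assumptions: it parses text in the 'iscsiadm -m session' layout shown in this report, and the function names parse_sessions and leftover_sessions are made up for illustration.

```python
import re

# One line of 'iscsiadm -m session' output, in the layout shown in this report:
#   tcp: [5] 10.0.1.10:3260,1 iqn.2010-06.com.purestorage:flasharray... (non-flash)
SESSION_RE = re.compile(
    r"^(?P<transport>\w+): \[(?P<sid>\d+)\] "
    r"(?P<portal>[^,\s]+),(?P<tpgt>-?\d+) (?P<iqn>\S+)"
)


def parse_sessions(output):
    """Return (session id, portal, target IQN) tuples parsed from `output`."""
    sessions = []
    for line in output.splitlines():
        match = SESSION_RE.match(line.strip())
        if match:
            sessions.append(
                (int(match.group("sid")), match.group("portal"), match.group("iqn"))
            )
    return sessions


def leftover_sessions(output, target_iqn):
    """Sessions still logged in to `target_iqn`; expected to be empty after detach."""
    return [s for s in parse_sessions(output) if s[2] == target_iqn]
```

Fed the 'sudo iscsiadm -m session' output captured after the detach in this report, leftover_sessions() would still return all four sessions for the Pure Storage target, which is the symptom being tracked here.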
[Regression Potential]
* multipath-tools loads devices on its own, so we shouldn't need to
force multipathd to reload; there is therefore no regression potential.
stack at xenial-devstack-master-master-20160914-092014:~$ sudo iscsiadm -m session
tcp: [5] 10.0.1.10:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
tcp: [6] 10.0.5.10:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
tcp: [7] 10.0.1.11:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
tcp: [8] 10.0.5.11:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
stack at xenial-devstack-master-master-20160914-092014:~$ sudo iscsiadm -m node
10.0.1.11:3260,-1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
10.0.5.11:3260,-1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
10.0.5.10:3260,-1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
10.0.1.10:3260,-1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
stack at xenial-devstack-master-master-20160914-092014:~$ sudo tail -f /var/log/syslog
Sep 14 22:33:14 xenial-qemu-tester multipath: dm-0: failed to get udev uid: Invalid argument
Sep 14 22:33:14 xenial-qemu-tester multipath: dm-0: failed to get sysfs uid: Invalid argument
Sep 14 22:33:14 xenial-qemu-tester multipath: dm-0: failed to get sgio uid: No such file or directory
Sep 14 22:33:14 xenial-qemu-tester systemd[1347]: dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device: Dev dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device appeared twice with different sysfs paths /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and /sys/devices/virtual/block/dm-0
Sep 14 22:33:14 xenial-qemu-tester systemd[1347]: dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device: Dev dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device appeared twice with different sysfs paths /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and /sys/devices/virtual/block/dm-0
Sep 14 22:33:14 xenial-qemu-tester systemd[1]: dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device: Dev dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device appeared twice with different sysfs paths /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and /sys/devices/virtual/block/dm-0
Sep 14 22:33:14 xenial-qemu-tester systemd[1]: dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device: Dev dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device appeared twice with different sysfs paths /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and /sys/devices/virtual/block/dm-0
Sep 14 22:33:14 xenial-qemu-tester kernel: [22362.163521] audit: type=1400 audit(1473892394.556:21): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-6e1017a7-6dea-418f-ad9b-879da085bd13" pid=32665 comm="apparmor_parser"
Sep 14 22:33:14 xenial-qemu-tester kernel: [22362.173614] audit: type=1400 audit(1473892394.568:22): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-6e1017a7-6dea-418f-ad9b-879da085bd13//qemu_bridge_helper" pid=32665 comm="apparmor_parser"
Sep 14 22:33:14 xenial-qemu-tester iscsid: Connection8:0 to [target: iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873, portal: 10.0.5.11,3260] through [iface: default] is operational now
stack at xenial-devstack-master-master-20160914-092014:~$ nova volume-detach 6e1017a7-6dea-418f-ad9b-879da085bd13 d1d68e04-a217-44ea-bb74-65e0de73e5f8
stack at xenial-devstack-master-master-20160914-092014:~$ sudo iscsiadm -m session
tcp: [5] 10.0.1.10:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
tcp: [6] 10.0.5.10:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
tcp: [7] 10.0.1.11:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
tcp: [8] 10.0.5.11:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
stack at xenial-devstack-master-master-20160914-092014:~$ sudo tail -f /var/log/syslog
Sep 14 22:48:10 xenial-qemu-tester kernel: [23257.736455] connection6:0: detected conn error (1020)
Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742036] connection5:0: detected conn error (1020)
Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742066] connection7:0: detected conn error (1020)
Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742139] connection8:0: detected conn error (1020)
Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742156] connection6:0: detected conn error (1020)
Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747638] connection5:0: detected conn error (1020)
Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747666] connection7:0: detected conn error (1020)
Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747710] connection8:0: detected conn error (1020)
Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747737] connection6:0: detected conn error (1020)
Sep 14 22:48:16 xenial-qemu-tester iscsid: message repeated 67 times: [ conn 0 login rejected: initiator failed authorization with target]
Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.753999] connection6:0: detected conn error (1020)
Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.754019] connection8:0: detected conn error (1020)
Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.754105] connection5:0: detected conn error (1020)
Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.754146] connection7:0: detected conn error (1020)