[Bug 1915678] Re: iSCSI+Multipath: Volume attachment hungs if sessiong scanning fails
Brett Milford
1915678 at bugs.launchpad.net
Fri Nov 5 05:43:12 UTC 2021
** Description changed:
+ [Impact]
+
+ * If a command such as "iscsiadm -m session" fails, the thread can abort immediately without updating counters such as failed_logins or stopped_threads, because there is no try-except block to catch the exception.
+ * The main thread keeps waiting until these counters are updated, so the volume attachment process hangs.
+
+ [Test Case]
+
+ * Deploy Cinder with an iSCSI driver and multipath enabled.
+
+ * TBC.
+
+ [Where problems could occur]
+
+ * The change primarily introduces error handling and does not change implementation details.
+ As such, we may see an error condition logged.
+
+ [Other Info]
+
+ * -
+
+ --- Original Description ---
+
Currently, we log in to iSCSI portals and run device discovery in
multiple threads concurrently when multipath is enabled.
However, if a command such as "iscsiadm -m session" fails, the thread can abort immediately without updating counters such as failed_logins or stopped_threads, because there is no try-except block to catch the exception.
The main thread keeps waiting until these counters are updated, so the volume attachment process hangs.
This issue was initially reported in downstream bug https://bugzilla.redhat.com/show_bug.cgi?id=1923975 and may be caused by a bug in the iscsiadm command.
Regardless, we should handle the error properly, because the current behavior requires operators to restart services such as cinder-volume to clear the hang.
** Also affects: python-os-brick (Ubuntu Hirsute)
Importance: Undecided
Status: New
** Also affects: python-os-brick (Ubuntu Impish)
Importance: Undecided
Status: New
** Also affects: python-os-brick (Ubuntu Jammy)
Importance: Undecided
Status: New
** Changed in: python-os-brick (Ubuntu Hirsute)
Status: New => Fix Released
** Changed in: python-os-brick (Ubuntu Impish)
Status: New => Fix Released
** Changed in: python-os-brick (Ubuntu Jammy)
Status: New => Fix Released
** Changed in: python-os-brick (Ubuntu Focal)
Status: New => In Progress
** Changed in: python-os-brick (Ubuntu Focal)
Assignee: (unassigned) => Brett Milford (brettmilford)
** Changed in: python-os-brick (Ubuntu Bionic)
Status: New => In Progress
** Changed in: python-os-brick (Ubuntu Bionic)
Assignee: (unassigned) => Brett Milford (brettmilford)
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to python-os-brick in Ubuntu.
https://bugs.launchpad.net/bugs/1915678
Title:
iSCSI+Multipath: Volume attachment hungs if sessiong scanning fails
Status in os-brick:
Fix Released
Status in python-os-brick package in Ubuntu:
Fix Released
Status in python-os-brick source package in Bionic:
In Progress
Status in python-os-brick source package in Focal:
In Progress
Status in python-os-brick source package in Hirsute:
Fix Released
Status in python-os-brick source package in Impish:
Fix Released
Status in python-os-brick source package in Jammy:
Fix Released
Bug description:
[Impact]
* If a command such as "iscsiadm -m session" fails, the thread can abort immediately without updating counters such as failed_logins or stopped_threads, because there is no try-except block to catch the exception.
* The main thread keeps waiting until these counters are updated, so the volume attachment process hangs.
[Test Case]
* Deploy Cinder with an iSCSI driver and multipath enabled.
* TBC.
[Where problems could occur]
* The change primarily introduces error handling and does not change implementation details.
As such, we may see an error condition logged.
[Other Info]
* -
--- Original Description ---
Currently, we log in to iSCSI portals and run device discovery in
multiple threads concurrently when multipath is enabled.
However, if a command such as "iscsiadm -m session" fails, the thread can abort immediately without updating counters such as failed_logins or stopped_threads, because there is no try-except block to catch the exception.
The main thread keeps waiting until these counters are updated, so the volume attachment process hangs.
This issue was initially reported in downstream bug https://bugzilla.redhat.com/show_bug.cgi?id=1923975 and may be caused by a bug in the iscsiadm command.
Regardless, we should handle the error properly, because the current behavior requires operators to restart services such as cinder-volume to clear the hang.
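
For illustration only, the failure mode and the general shape of a fix can be sketched in a few lines of Python. This is not the os-brick code: scan_sessions, connect_worker, connect_all, the portal names, and the timeout are hypothetical names made up for this sketch; only the counter names failed_logins and stopped_threads come from the description above. The point is that each worker updates the shared counters in except/finally blocks, so the main thread is never left waiting on a thread that died early.

```python
import threading
import time


def scan_sessions(portal):
    """Hypothetical stand-in for a command such as "iscsiadm -m session"."""
    if portal == "bad-portal":
        raise RuntimeError("session scan failed for %s" % portal)
    return ["session-for-%s" % portal]


def connect_worker(portal, data, lock):
    """Handle one portal and always update the shared counters."""
    try:
        sessions = scan_sessions(portal)
        with lock:
            data["found_sessions"].extend(sessions)
    except Exception:
        # Without this handler the thread would die here and the
        # counters would never change.
        with lock:
            data["failed_logins"] += 1
    finally:
        # Runs on success and on failure, so the waiter can finish.
        with lock:
            data["stopped_threads"] += 1


def connect_all(portals, timeout=5.0):
    """Start one worker per portal and wait on the counters."""
    data = {"found_sessions": [], "failed_logins": 0, "stopped_threads": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=connect_worker, args=(p, data, lock))
        for p in portals
    ]
    for t in threads:
        t.start()
    # The main thread polls the counters. The deadline is an extra
    # safety net added for this sketch only.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with lock:
            if data["stopped_threads"] == len(portals):
                break
        time.sleep(0.01)
    for t in threads:
        t.join(timeout=timeout)
    return data
```

Calling connect_all(["portal-a", "bad-portal", "portal-b"]) returns with stopped_threads equal to the number of portals and failed_logins counting the one failed scan, instead of blocking indefinitely.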
To manage notifications about this bug go to:
https://bugs.launchpad.net/os-brick/+bug/1915678/+subscriptions