[Bug 208469] Re: open-iscsi can fail to start if any portals are unavailable, even if other portals are fine
ubuntu at treblig.org
Sat Sep 1 23:24:25 UTC 2012
I'm going to mark this Fix Released, because my attempts on a couple of VMs seem to be successful. It does take a good minute or so for the login on the 2nd portal to time out, but it does eventually carry on.
I'll attach my configs in a sec - can you check that the configs/setup I
have makes sense to you; I don't iscsi much.
** Changed in: open-iscsi (Ubuntu)
Status: New => Fix Released
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to open-iscsi in Ubuntu.
open-iscsi can fail to start if any portals are unavailable, even if
other portals are fine
Status in “open-iscsi” package in Ubuntu:
Binary package hint: open-iscsi
Many enterprise-class iSCSI systems use multiple portals to access a
single target. For example you can have 2 or 3 different network
pathways to a target, each with its own portal IP. open-iscsi
supports this fine; however, the current startup scripts do not handle
pathway failure gracefully.
Imagine this scenario:
A target: iqn.2008-03.com.something.target.t
On startup, open-iscsi goes through the list of portals and brings
them up one by one. It seems to progress in alphabetical order, so in
the case above 10.0.10.1 will start up first. Once the session has
been established to that portal, open-iscsi will proceed to the next
portal, and so on. Once we are up and running, multipath handles
portal failures, load balancing, etc. After boot time we can safely
kill off 2 of the 3 portals and we still have a pathway to our iSCSI target.
At boot time however, open-iscsi will fail to start if it hits a
portal that cannot be reached. In the case above where we have a
triple-redundant portal we only need one of the 3 portals to be
"alive". However, as soon as open-iscsi hits a "dead" portal it will
exit with a failure and will not proceed to the next portal. This
actually provides significantly LOWER reliability as we could easily
have one portal offline for maintenance, etc., without impacting running
systems, but no other systems would be able to start up or be rebooted
even though the SAN itself is still operational!
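The behaviour asked for above (keep going past a dead portal, and only fail if no portal at all can be reached) can be sketched roughly as follows. This is an illustration only, not the actual open-iscsi startup script: the function names, the `login` callback, and the portal addresses other than 10.0.10.1 are made up for the example, and a real script would call the iSCSI login tooling where the fake login is used here.

```python
from typing import Callable, Iterable, List, Tuple


def login_to_portals(
    target: str,
    portals: Iterable[str],
    login: Callable[[str, str], bool],
) -> Tuple[List[str], List[str]]:
    """Try every portal in sorted order; return (reachable, unreachable).

    `login` is a hypothetical callback that returns True when a session
    to `portal` was established. A failure on one portal is recorded and
    the loop carries on to the next portal instead of aborting.
    """
    reachable: List[str] = []
    unreachable: List[str] = []
    for portal in sorted(portals):
        try:
            ok = login(target, portal)
        except OSError:
            # Treat network-level errors the same as a failed login.
            ok = False
        (reachable if ok else unreachable).append(portal)
    return reachable, unreachable


def startup_succeeded(reachable: List[str]) -> bool:
    """Startup is considered OK as long as at least one portal is up."""
    return len(reachable) > 0


if __name__ == "__main__":
    # Target name taken from the report; the 2nd and 3rd portal IPs are
    # assumptions made up for this example.
    target = "iqn.2008-03.com.something.target.t"
    portals = ["10.0.10.1", "10.0.10.2", "10.0.10.3"]
    dead = {"10.0.10.2"}  # pretend this portal is down for maintenance

    def fake_login(tgt: str, portal: str) -> bool:
        return portal not in dead

    up, down = login_to_portals(target, portals, fake_login)
    print("reachable:", up)
    print("unreachable:", down)
    print("startup ok:", startup_succeeded(up))
```

With one of three portals down, this sketch still reports a successful startup, leaving multipath to deal with the missing path afterwards. Only the case where every portal fails is treated as an error.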