API change: juju subcommands will wait for connection to Zookeeper
jim.baker at canonical.com
Wed Nov 16 22:35:06 UTC 2011
A forthcoming branch removes the need to wait before issuing juju
subcommands following a bootstrap. This is tracked by this bug:
That is, it should be possible to do the following:
$ juju bootstrap && juju status && juju whatever-else-i-want-to-do
and not see an error.
A problematic aspect of current Juju usage is that all subcommands
except bootstrap and destroy-environment require the user to wait some
arbitrary amount of time before they can be successfully run. This is
because these subcommands need to first connect to ZooKeeper. Although
ZK is an important implementation detail of Juju, it's normally not
visible to Juju users. Here it is, because bootstrap is responsible for
starting a machine to run ZK, and ZK needs to be running before anything
else can happen.
Because of this lack of waiting, the Juju user tutorial currently has
the following note:
If the bootstrapping node has not yet completed bootstrapping, the
status command may either mention the environment is not yet ready,
or may display a connection timeout such as:
INFO Connecting to environment.
ERROR Connection refused
ProviderError: Interaction with machine provider failed:
ConnectionTimeoutException('could not connect before timeout after 2
retries',)
ERROR ProviderError: Interaction with machine provider failed:
ConnectionTimeoutException('could not connect before timeout after 2
retries',)
This is simply an indication the environment needs more time to
complete initialization. It is recommended you retry every minute...
Consequently, Juju users have had to come up with various strategies to
automate the waiting in a script, such as Clint's wait4state script or
the similar mechanism seen on wtf.labix.org for running functional tests
(this log shows a good example of what the waiting can look like).
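For illustration, the kind of user-side workaround described above can be
sketched as a small polling helper. This is not Clint's wait4state script
or the wtf.labix.org mechanism; wait_for_command and its parameters are
hypothetical names, and it assumes the command being polled exits non-zero
while the environment is not yet reachable.

```python
import subprocess
import time


def wait_for_command(argv, attempts=30, interval=60.0):
    """Run argv repeatedly until it exits with status 0.

    Returns True on the first success, or False if every attempt failed.
    (Hypothetical helper, not part of Juju.)
    """
    for attempt in range(attempts):
        result = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode == 0:
            return True
        # Sleep between attempts, but not after the final one.
        if attempt < attempts - 1:
            time.sleep(interval)
    return False


# Possible use in a script (assumes a non-zero exit status on failure):
#     if wait_for_command(["juju", "status"]):
#         ...continue with the remaining subcommands...
```

With the change proposed here, a loop like this should no longer be
necessary after a bootstrap.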
The change is essentially just the removal of a wart: a correct set of
Juju commands can be run as a script, from juju bootstrap on, and be
expected to complete as a whole, assuming the bootstrap itself succeeds
and there are no other issues (bad API keys for EC2, etc).
One alternative was to put the waiting in the bootstrap subcommand, but
this approach feels much cleaner. All subcommands that depend on ZK will
simply be gated in this fashion.
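As a rough sketch of what such gating could look like (not the code in the
branch): a decorator keeps retrying the connection and only then runs the
subcommand. The names gated and connect are hypothetical, and the
EnvironmentPending class below is a local stand-in defined for the
example, not an import from Juju.

```python
import functools
import time


class EnvironmentPending(Exception):
    """Stand-in for 'the environment is not ready yet' (defined locally)."""


def gated(connect, retry_delay=1.0, sleep=time.sleep):
    """Return a decorator that runs a subcommand only once connect() succeeds.

    connect is any callable that returns a client object, or raises
    EnvironmentPending while the environment is still coming up.
    """
    def decorator(subcommand):
        @functools.wraps(subcommand)
        def wrapper(*args, **kwargs):
            while True:
                try:
                    client = connect()
                    break
                except EnvironmentPending:
                    # Not ready yet: wait and try again, with no deadline.
                    sleep(retry_delay)
            return subcommand(client, *args, **kwargs)
        return wrapper
    return decorator
```

In this sketch, bootstrap and destroy-environment would simply be left
undecorated, since they do not need a ZK connection.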
Lastly, one aspect that may be left for a later phase, if done at all,
is a visible API change: adding a --timeout parameter to the juju
command. The flip side to robust waiting is that the wait is indefinite.
(Internally in the Python code, so long as the connection logic is
reporting that the situation is `EnvironmentPending`, the wait
continues.) Users who require a timeout on a Juju command need to kill
the process (such as with SIGTERM). Although modifications to ZK are
careful, it's certainly possible that this could leave ZK with some
garbage (or possibly in a bad state). So a timeout on the ZK connection
phase could be desirable.
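To make the idea concrete, here is a minimal sketch of a connection wait
with an optional deadline. It is only an illustration under stated
assumptions: connect_with_timeout, ConnectTimeout, and the connect
callable are hypothetical, EnvironmentPending is a local stand-in for the
condition mentioned above, and no --timeout option exists yet.

```python
import time


class EnvironmentPending(Exception):
    """Stand-in for 'the environment is not ready yet' (defined locally)."""


class ConnectTimeout(Exception):
    """Raised by this sketch when the deadline passes (hypothetical)."""


def connect_with_timeout(connect, timeout=None, retry_delay=1.0,
                         clock=time.monotonic, sleep=time.sleep):
    """Retry connect() while it raises EnvironmentPending.

    timeout=None keeps the indefinite wait; otherwise give up once
    roughly `timeout` seconds have elapsed.
    """
    deadline = None if timeout is None else clock() + timeout
    while True:
        try:
            return connect()
        except EnvironmentPending:
            if deadline is not None and clock() >= deadline:
                raise ConnectTimeout(
                    "environment still pending after %s seconds" % timeout)
            sleep(retry_delay)
```

Giving up before the connection is established means no ZK modification
has started, which is the point of putting the timeout on this phase
rather than killing the process later.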