feedback about juju after using it for a few months

Marco Ceppi marco at ondina.co
Wed Dec 17 23:41:18 UTC 2014


On Wed Dec 17 2014 at 6:27:25 PM Caio Begotti <caio1982 at gmail.com> wrote:

> On Wed, Dec 17, 2014 at 9:02 PM, Marco Ceppi <marco at ondina.co> wrote:
>>
>> I'm curious, what version of Juju are you currently using?
>>
>
> LTS, so Trusty :-)
>

That version of Juju is a bit behind (sadly). We're working to get an
update into the trusty-updates archive so the latest stable is available to
LTS users, but in the meantime you may be interested in using Juju 1.20.14
(the latest stable), available from ppa:juju/stable. You can run `juju
version` to print the version of Juju you currently have; it is likely
1.18.x. This PPA also houses several other tools which are considered stable
(charm-tools, amulet, juju-deployer, etc.).
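
For reference, upgrading on an Ubuntu machine typically looks something like
this (a rough sketch; package names may differ slightly on your setup):

    sudo add-apt-repository ppa:juju/stable
    sudo apt-get update
    sudo apt-get install juju-core charm-tools
    juju version    # should now report 1.20.x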


>> This is something I'm actually working on addressing by adding `juju local
>> suspend` and `juju local resume` commands via a `juju-local` plugin:
>> https://github.com/juju-solutions/juju-local. I hope to have this out for
>> the new year. I'll also be adding more functionality to make the local
>> provider more reliable and easier to use.
>>
>
> I basically use my laptop for all work stuff so I see that a lot (and
> learned how to ignore this issue), so I think I can be your guinea pig once
> guinea pigs are needed.
>

I will make sure to CC you when I announce the first alpha/beta :D


>> So, shell charms are fine, and we have quite a few that are written
>> well. We can discourage people from using them, but Juju and charms are
>> about choice and freedom. If an author wants to write charms in bash that's
>> fine - we will just hold them to the same standard as all other charms.
>> Something we've been diligently working on is charm testing. We're nearing
>> the conclusion of the effort to add some semblance of testing to each charm
>> and run those charms against all substrates and architectures we support.
>> In doing so we can find poorly written charms and well-written charms
>> (regardless of the charm's language).
>>
>
> Is this testing infra already ready to use and being enforced?
>
> I saw two of my charms (one is a new subordinate charm for OpenID support
> in Apache, another is a new API relation to Jenkins) failing because of
> automatic tests and I thought "hmm ok, that wasn't here before", especially
> because the failures didn't seem to be related to the actual code I was
> introducing.
>

Yes, we're still ironing out some details, but the charm testing is
occurring and working to varying degrees. The charm-testing infrastructure
will comment on new charm submissions as well as merge requests with the
testing results and a link to investigate further. We still have some
infrastructure bugs but we're very close to resolving these issues.

>> This does severely affect performance on the local provider, but juju is
>> designed to run events asynchronously in an environment. File a bug/feature
>> request for this at http://bugs.launchpad.net/juju-core/+filebug to
>> request that LXC deployments be done serially.
>>
>
> In a rush now so please excuse the brevity in the description:
> https://bugs.launchpad.net/juju-core/+bug/1403674
>

Great, thanks! The core team will use the bug to discuss this idea further.

>>> 7. When a hook fails (usually while relations are being set) I have to
>>> manually run `juju resolved unit/0` multiple times. It's not enough to call it
>>> once and wait for Juju to get it straight. I have to babysit the unit and
>>> keep running `juju resolved unit/0`, while I imagined this should be automatic
>>> because I wanted it resolved for real anyway. If the failed hook was the
>>> first in a chain, you'll have to re-run this for every other hook in the
>>> sequence. Once for a relation, another for config-changed, then perhaps
>>> another for the stop hook and another for the start hook, depending on your
>>> setup.
>>>
>>
>> What charm is causing this issue? This shouldn't happen, but presumably
>> the failure is due to data or something else not being ready, which is why
>> it's erroring. It sounds like the charm doesn't properly guard against data
>> not being ready, which I'll cover again below.
>>
>
> IIRC I think I saw that with Apache's and Jenkins' (and also my own charm
> for an application).
>

I'd check for idempotency guarding in the charm and verify that the data you
need is actually set before proceeding. If you'd like to email me a link to
the charm, I'd be happy to take a cursory look.
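
To illustrate what I mean by idempotency guarding, here is a rough sketch of a
hypothetical config-changed hook in bash (the package and file names are just
placeholders) that is safe to run repeatedly:

    #!/bin/bash
    set -e

    # Only install the package if it isn't already present.
    if ! dpkg -s apache2 >/dev/null 2>&1; then
        apt-get install -y apache2
    fi

    # Regenerate the config, but only touch disk and restart when it changed.
    new_conf=$(mktemp)
    echo "ServerName $(unit-get public-address)" > "$new_conf"
    if ! cmp -s "$new_conf" /etc/apache2/conf-available/juju.conf; then
        install -m 0644 "$new_conf" /etc/apache2/conf-available/juju.conf
        service apache2 restart
    fi
    rm -f "$new_conf"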


>> Instead, check for the variables you need from the relation, and if they
>> don't exist yet simply `exit 0`. Juju will re-queue the hook to execute when
>> data on the wire is changed, i.e. when the remote unit finally runs the
>> appropriate `relation-set` line.
>>
>
> Does it wait for that data to be available, or does it keep executing the
> rest of the charm's hook chain and then come back to the first hook, which
> exited 0 and still needs data?
>

No, the idea is this: since the data available will never change during the
execution of a hook (to ensure consistency across all calls), you check
whether you have what you need and exit 0 if you don't. Juju will treat the
hook as having exited cleanly and proceed with the rest of the event queue.
If at any time the remote unit changes any of the data on the relation wire,
the relation-changed event will be queued again and executed when its turn
comes in the queue.
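
As a rough sketch of that pattern (a hypothetical db-relation-changed hook in
bash; the relation keys are just placeholders):

    #!/bin/bash
    set -e

    host=$(relation-get host)
    database=$(relation-get database)

    # The remote unit hasn't run relation-set yet: exit cleanly and wait.
    # Juju will queue relation-changed again when the data on the wire changes.
    if [ -z "$host" ] || [ -z "$database" ]; then
        juju-log "relation data not ready yet, deferring"
        exit 0
    fi

    # Data is present, safe to configure the service here.
    juju-log "configuring against $host/$database"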

I realize now we do a very poor job of explaining this in the docs, I'll
open a bug against the docs to have this pattern better explained.

>>> 9. If you want to cancel a deployment that just started you need to keep
>>> running remove-service forever. Juju will simply ignore you if it's still
>>> running some special bits of the charm or if you have previously asked it
>>> to cancel the deployment while it was setting up. No errors, no other
>>> messages are printed. You need to actually open its log to see that it's
>>> still stuck in a long apt-get installation, and you have to wait until the
>>> right moment to run remove-service again. And if your connection is slow,
>>> that takes time; you'll have to babysit Juju here because it doesn't really
>>> control its services as I imagined. Somehow apt-get gets what it wants :-)
>>>
>>
>> You can now force-kill a machine. So you can run `juju destroy-service
>> $service` then `juju terminate-machine --force #machine_number`. Just make
>> sure that nothing else exists on that machine! I'll raise an issue for
>> having a way to add a --force flag to destroying a service so you can just
>> say "kill this with fire, now plz"
>>
>
> I understand that, but I discovered it's way faster and less typing if I
> simply destroy-environment and bootstrap it again. If you need to force
> kill something every time you need to kill it, then perhaps something is
> wrong?
>

I agree, something is wrong with the UX here. We need to figure out (and
would love your feedback on) what should happen here. The idea is: if a
service experiences a hook failure, all events are halted, including the
destroy event. So the service is marked as dying, but it can't die until the
error is resolved. There are cases where, during unit termination, you may
wish to inspect an error. I think adding a `--force` flag to destroy-service
would satisfy what you've outlined, where --force would ignore hook errors
during the destruction of a service.
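
Until such a flag exists, the workaround today looks roughly like this (the
service, unit, and machine names are just placeholders):

    juju destroy-service apache2
    juju resolved apache2/0              # clear any hook error blocking the dying unit
    juju terminate-machine --force 3     # only if nothing else lives on machine 3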

Thanks,
Marco Ceppi
