API for Upgrader

roger peppe rogpeppe at gmail.com
Sun Jun 23 09:31:35 UTC 2013


On 23 June 2013 07:04, John Arbash Meinel <john at arbash-meinel.com> wrote:
>>> type AgentToolsChange struct { Id string
>>
>> I'm not sure it's necessary to send the id with every watcher
>> response, as it's already known by the watcher.
>
> It depends whether you go with 1 watcher per agent, or 1 watcher
> across all agents. If you go with 1 watcher per agent it is hard to
> select across a dynamic number of channels. Go 1.1 introduces
> something in reflect to do this, but it is still clumsy because you
> have to reflect all of your channels into the right types.
>
> The Go 1.0 pattern I've seen is to fire off a goroutine for each
> channel you want to select on, and then have them put everything onto
> one channel, which leaves you with 1 channel in the end anyway.
>
> Note that LifecycleWatcher watches multiple things and returns a list
> of things that have changed. So while EntityWatcher does 1-by-1,
> LifecycleWatcher does bulk watches.
>
> It would make it easier to write code that could handle more than 1
> thing-being-watched without having to change it later.

As discussed online, I have great difficulty fighting my YAGNI
instincts here. We're writing more code now to make it possible
to write less code for a future that will almost certainly never
happen. The whole point of this code is to support the upgrader - when
would an upgrader worker (inherently running in a single agent)
ever care about some *other* set of agents being upgraded?
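
For the record, reflect.Select is the Go 1.1 addition mentioned above,
and the Go 1.0-style fan-in is only a few lines anyway. This is just a
sketch with a made-up Change type, not code we have anywhere:

    // Change is a hypothetical notification type - the real watcher
    // would carry whatever information actually changed.
    type Change struct {
            Id string
    }

    // fanIn merges notifications from several per-agent channels onto
    // a single channel, tagging each one with the id it came from.
    func fanIn(chans map[string]<-chan Change) <-chan Change {
            out := make(chan Change)
            for id, ch := range chans {
                    id, ch := id, ch // capture the loop variables for the goroutine
                    go func() {
                            for c := range ch {
                                    c.Id = id
                                    out <- c
                            }
                    }()
            }
            return out
    }

So even if we did one day need to select across a dynamic set of
watchers, it wouldn't be much code to write then.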

>> I think this might be difficult to achieve. In a HA scenario, how
>> do the various API servers coordinate this information? If we want
>> to do this, it's probably easier and more efficient to coordinate
>> locally.
>
> I'm told we already have a database field for Machine.AgentVersion.
> Which means the logic is:
>
> Machine watchers watch the global AgentVersion (in EnvironmentConfig).
> When that gets updated, it updates the desired Machine versions. When
> the Machine reports back an updated AgentVersion, that triggers the
> Units on that machine to upgrade their agents.

That's not quite right: currently the agent version on a Machine
is used to report the actual running agent version, not the
proposed version. I think that's still useful information, so
we'd need another field, say proposed-agent-version, to do this.
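
In state terms that would mean something like this (field names are
made up for illustration, not the current schema):

    // machineDoc is a sketch of the machine document with both
    // fields; in practice the versions would probably be
    // version.Binary values rather than strings.
    type machineDoc struct {
            // AgentVersion is the version the agent reports it is
            // actually running.
            AgentVersion string
            // ProposedAgentVersion is the version we would like the
            // agent to upgrade to.
            ProposedAgentVersion string
    }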

I like the idea of having a separate agent version per entity,
as it makes it possible to canary test new versions, but I'm
not entirely sure about the version cascading (from machine to units etc) idea.

If our objective is to reduce download traffic, a simple filesystem-based
lock around the download would achieve the same thing, I think.
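
Something along these lines would do. It's a sketch only - the lock
handling is deliberately naive and fetchTools is a made-up helper:

    import (
            "os"
            "path/filepath"
    )

    // downloadToolsOnce downloads the tools for vers into dir unless
    // another agent on the same machine already has. A real version
    // would need to deal with stale locks and with waiting for the
    // other agent's download to finish.
    func downloadToolsOnce(dir, vers, url string) error {
            target := filepath.Join(dir, vers+".tgz")
            if _, err := os.Stat(target); err == nil {
                    // Another agent has already downloaded these tools.
                    return nil
            }
            lockPath := filepath.Join(dir, ".download-lock")
            lockFile, err := os.OpenFile(lockPath, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0644)
            if err != nil {
                    // Someone else holds the lock; in a real
                    // implementation we would wait for the target
                    // file to appear instead of just returning.
                    return err
            }
            defer os.Remove(lockPath)
            defer lockFile.Close()
            return fetchTools(url, target) // fetchTools is hypothetical
    }

That gets us the "download once per machine" behaviour without any
extra state fields or watcher plumbing.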

>>> This helps with the "thundering herd" case, where someone
>>> requests an upgrade and then every agent wakes up and tries
>>> to download new tools. (So if you have a Machine with a Unit and
>>> a subordinate Unit, you would download the tools 3 times, only to
>>> have 2 of them get thrown away.)
>>>
>>> 5) While it doesn't make a lot of sense today to have a Watcher
>>> across multiple agents, it is possible we will change
>>> responsibilities (say one upgrader per container).
>>
>> I'm not sure what you mean by "having a Watcher across multiple
>> agents" here.
>
> 1 watcher watching for changes to multiple agents. An example was an
> Upgrader that ran separately from the individual agents. So the
> upgrader could notice that the Machine needed upgrading as well as the
> 3 Uniters running on that machine (and presumably itself as well).

The whole point of the upgrader is that it runs inside
the agent itself, so that the agent can gracefully shut
itself down when upgrading. I'm not sure what we'd gain from choosing
to do otherwise. And even if we do, we're not stuck - we're
free to change API signatures in the above case.
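
To be concrete about why running inside the agent is nice, the shape
is roughly this (all names invented for the sketch, not our actual
worker code):

    // upgradeRequiredError is returned by the upgrader when new tools
    // are in place; the agent's runner treats it as "stop all workers
    // cleanly, then re-exec with the new tools".
    type upgradeRequiredError struct {
            newVersion string
    }

    func (e *upgradeRequiredError) Error() string {
            return "agent must restart to run tools " + e.newVersion
    }

    // runUpgrader is the loop the agent runs alongside its other
    // workers. The changes channel and fetchAndUnpack are hypothetical.
    func runUpgrader(changes <-chan string) error {
            for vers := range changes {
                    if err := fetchAndUnpack(vers); err != nil {
                            return err
                    }
                    return &upgradeRequiredError{newVersion: vers}
            }
            return nil
    }

An upgrader living outside the agent couldn't signal the shutdown this
directly - it would have to tell the agent to stop by some other means.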

>> I think we should use the existing watcher architecture where we
>> have one watcher resource for each thing being watched and the
>> client calls Next on any given watcher to retrieve the next change.
>> The "bulk" interface simply returns one watcher for each id in the
>> request.
>
> As mentioned earlier, the Go rules about 'select' make this harder to
> actually handle watching more than one thing. And LifecycleWatcher
> already watches multiple things. So it isn't like the existing Watcher
> architecture only ever watches 1 thing.

Agreed. But currently we only have a watcher that watches more than one
thing where we actually want to watch more than one thing.
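
To be concrete, the bulk-but-one-watcher-per-entity shape I was
suggesting above is roughly this (illustrative names only, not an
existing API):

    // WatchAgentTools starts one watcher per requested agent and
    // returns their ids; the client then calls Next on an individual
    // watcher id to block until that agent's tools have changed.
    type WatchAgentToolsArgs struct {
            AgentIds []string
    }

    type WatchAgentToolsResults struct {
            // WatcherIds holds one watcher id per requested agent,
            // in the same order as AgentIds.
            WatcherIds []string
    }

The bulk call keeps the API uniform with our other entity calls, while
each returned watcher still notifies about exactly one agent.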


