Schema migration process

Fri May 16 05:29:49 UTC 2014

So I'm pretty sure the ability to do schema upgrades was scoped in the HA
work that Nate is working on. I believe the idea is roughly:

1) State machines get restarted first (there is code today to not let other
agents notice that there is an upgrade ready until the state machine itself
has been upgraded)

2) When the state machine is restarted, don't allow other agents to Login
until the API Worker has connected to the API Server. This should mean that
we are in a relatively quiet mode.

2a) For reasons I'll describe later, still allow Client connections.

3) When the APIWorker comes up and sees that there is an Upgrade that just
happened and Upgrade actions are pending, wait to start the upgrade steps
until we observe that all other machines with JobManageEnviron have been
upgraded to the new version. Continue blocking Login requests.

4) Once all JobManageEnviron machine agent docs report the correct Agent
Version, the Singleton Upgrader (the one running on the machine with the
master Mongo DB) applies the DB Schema Upgrade steps.
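
As a rough illustration of steps (3) and (4), the check could look
something like the Go sketch below. The Agent type, its fields, and the
version strings are made up for the example and are not taken from the
juju code base. The only point is the shape of the condition: every
JobManageEnviron agent must report the target version, and only the
upgrader on the Mongo master acts.

```go
package main

import "fmt"

// Agent is a hypothetical view of a machine agent doc. The field names are
// illustrative only and do not match juju's real state documents.
type Agent struct {
	ID             string
	ManagesEnviron bool   // machine has JobManageEnviron
	Version        string // agent version recorded in the doc
	IsMongoMaster  bool   // machine hosts the master Mongo DB
}

// readyToMigrate reports whether every JobManageEnviron agent has reached
// the target version. Other agents are ignored, since they are not allowed
// to log in while the upgrade is pending.
func readyToMigrate(agents []Agent, target string) bool {
	for _, a := range agents {
		if a.ManagesEnviron && a.Version != target {
			return false
		}
	}
	return true
}

// shouldRunSchemaSteps is true only for the single upgrader on the Mongo
// master, and only once all state machines have been upgraded.
func shouldRunSchemaSteps(self Agent, agents []Agent, target string) bool {
	return self.IsMongoMaster && readyToMigrate(agents, target)
}

func main() {
	const target = "1.20.0" // illustrative version string
	agents := []Agent{
		{ID: "0", ManagesEnviron: true, Version: target, IsMongoMaster: true},
		{ID: "1", ManagesEnviron: true, Version: "1.18.4"},
		{ID: "2", ManagesEnviron: false, Version: "1.18.4"},
	}
	// Machine 1 is still on the old version, so the master keeps waiting.
	fmt.Println(shouldRunSchemaSteps(agents[0], agents, target))

	// Once machine 1 reports the new version, the master may proceed.
	agents[1].Version = target
	fmt.Println(shouldRunSchemaSteps(agents[0], agents, target))
}
```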

There is a potential failure mode where some of the API Servers either
don't upgrade themselves, or stop themselves and never come back up
again, and the system as a whole can get wedged. That is why we still
allow Client connections.
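
A minimal sketch of the Login gating from steps (2), (2a) and (3),
assuming the API server can tell the three kinds of connection apart.
ConnKind, checkLogin and ErrUpgradeInProgress are hypothetical names for
this example, not existing juju identifiers.

```go
package main

import (
	"errors"
	"fmt"
)

// ConnKind is a hypothetical classification of who is logging in.
type ConnKind int

const (
	LocalAgentConn ConnKind = iota // the state machine's own API Worker
	OtherAgentConn                 // any other machine or unit agent
	ClientConn                     // a user running the command line client
)

// ErrUpgradeInProgress is the (assumed) error returned to blocked agents.
var ErrUpgradeInProgress = errors.New("upgrade in progress")

// checkLogin blocks other agents while an upgrade is pending. The local
// API Worker and Clients are always let through, so that a wedged upgrade
// can still be inspected and repaired.
func checkLogin(kind ConnKind, upgradePending bool) error {
	if upgradePending && kind == OtherAgentConn {
		return ErrUpgradeInProgress
	}
	return nil
}

func main() {
	fmt.Println(checkLogin(OtherAgentConn, true))
	fmt.Println(checkLogin(ClientConn, true))
	fmt.Println(checkLogin(OtherAgentConn, false))
}
```

Restricting what a Client may request during the upgrade would be a
further check on top of this one, and is not shown.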

It means that changes to the DB *could* be occurring, but I think we need
that ability. Arguably we'd love to restrict what could possibly be
requested to only those things that could fix the bugs in Upgrade. (If we
ever get nice Facades in place for subsets of Client functionality, this
might actually be possible.) However, I believe we won't be running the
Provisioning agent and other such workers (as they should all be
startAfterUpgrade). So the changes to the DB shouldn't actually trigger
anything until the schema has been migrated.

Misc:
I believe the discussion for actually applying the schema update is that we
should clone all of mongo (possibly stopping it), and then apply the schema
changes. We could apply them in the duplicate and then switch over, but I
think that is going to be a lot trickier with how we do things (we normally
have exactly one set of named collections to think about, and only one DB
port that we can connect to, etc).

So I think it is still reasonable to run a "backup" set of tasks once we
actually have a reliable built-in backup mechanism, and then just upgrade
in place. We have a copy on the side if we need it.

I could even go as far as having the DB Schema explicitly versioned, with
agents knowing what version of the schema they want to use, and using that
information to trigger step (3) above. If you make the last step of the
DB Schema Upgrades be to update the version, then even if you get
interrupted during upgrades, you should be able to restart.
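
To make the "update the version last" idea concrete, here is a small Go
sketch. Store, Step, Migrate and demoSteps are hypothetical names, and an
in-memory struct stands in for Mongo. It also assumes each step is
idempotent, since a restart re-runs every step from the beginning.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a hypothetical stand-in for the database: a schema version plus
// flags recording which changes have been made.
type Store struct {
	SchemaVersion int
	Changed       map[string]bool
}

// Step is one schema change. Run must be idempotent, because an
// interrupted upgrade re-runs every step from the start.
type Step struct {
	Name string
	Run  func(*Store) error
}

// Migrate applies all steps and records the new version only after every
// step has succeeded, so a half-finished upgrade is still seen as "old".
func Migrate(s *Store, target int, steps []Step) error {
	if s.SchemaVersion >= target {
		return nil // already migrated, nothing to do
	}
	for _, st := range steps {
		if err := st.Run(s); err != nil {
			return fmt.Errorf("schema step %q: %w", st.Name, err)
		}
	}
	s.SchemaVersion = target // the version bump is the last write
	return nil
}

// demoSteps returns two illustrative steps. The second one fails for as
// long as *interrupt is true, to simulate an interrupted upgrade.
func demoSteps(interrupt *bool) []Step {
	mark := func(name string) func(*Store) error {
		return func(s *Store) error { s.Changed[name] = true; return nil }
	}
	return []Step{
		{Name: "step-a", Run: mark("step-a")},
		{Name: "step-b", Run: func(s *Store) error {
			if *interrupt {
				return errors.New("interrupted")
			}
			return mark("step-b")(s)
		}},
	}
}

func main() {
	s := &Store{SchemaVersion: 1, Changed: map[string]bool{}}
	interrupt := true
	steps := demoSteps(&interrupt)

	// First attempt is interrupted: the version stays at 1.
	fmt.Println(Migrate(s, 2, steps), s.SchemaVersion)

	// Restart: all steps run again and the version is bumped at the end.
	interrupt = false
	fmt.Println(Migrate(s, 2, steps), s.SchemaVersion)
}
```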

Thoughts?
John
=:->

On Fri, May 16, 2014 at 7:19 AM, Tim Penhey <tim.penhey at canonical.com> wrote:

> Hi folks,
>
> We have been talking about how to handle database schema updates for a
> long time, and many times things were deferred as we had clients
> connecting directly to the database.
>
> Due to the hard work of the team, we are now in a place where this is no
> longer the case as all external connections come through the API server.
>
> We also now have a central place in the code where upgrade steps should
> live (the upgrades package).
>
> For some upcoming work we have around identity, we need to update the
> schema of existing documents. A big change is changing the identity
> field (_id).  This change is going to need to be done for almost all the
> documents in the database as we move to work on the multi-environment
> state server task.  Many of the existing documents, like machine,
> service and unit use the id or name of the entity in the environment as
> the primary (_id) key.  This isn't going to work in a state server that
> has multiple environments.
>
> So... handling schema upgrades is a very urgent need for the juju team
> this cycle.  I thought it best to reach out prior to putting too much
> work into it to solicit ideas that people may already have had, or work
> that has been done elsewhere.
>
> Cheers,
> Tim