Schema migration process

Thu May 29 13:47:49 UTC 2014

On Thu, May 29, 2014 at 4:00 PM, Menno Smits <menno.smits at canonical.com>
wrote:

> Team Onyx is accumulating tasks that can't be completed until the schema
> migrations infrastructure is in place so I've started digging in to the
> details to figure out what exactly needs to be done. I'm aiming to come up
> with an approach and some of the work items for the initial incarnation of
> schema migrations.
>
> *Overall Approach*
>
> Building on John's thoughts, and adding Tim's and mine, here's what I've
> got so far:
>
> - Introduce a "database-version" key into the EnvironConfig document which
> tracks the Juju version that the database schema matches. More on this
> later.
>

For clarity, I would probably avoid putting this key into EnvironConfig,
but instead have it in a separate document. That also makes it easy to
watch for just this value changing.

Potentially, I would decouple the value in this key from the actual agent
versions. Otherwise you do null DB schema upgrades on every minor release.
Maybe that's sane, but it *feels* like they are two separate issues. (The
version of the DB schema is orthogonal to the version of the code I'm
running.) It may be that the clarity and simplification of just one
version wins out.
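
As a rough illustration of the separate-document idea, here is a minimal Go
sketch. The names schemaVersionDoc, codeSchemaVersion and the
"schema-version" key are invented for this example and are not existing Juju
code; the integer version is just one possible way to decouple the schema
version from the agent version.

```go
package main

import "fmt"

// schemaVersionDoc is a hypothetical stand-alone document that records the
// database schema version, kept apart from the environment config so that a
// watcher can observe just this value.
type schemaVersionDoc struct {
	ID      string // fixed key; "schema-version" is an assumed name
	Version int    // bumped only when a migration really changes the schema
}

// codeSchemaVersion is the schema version this build of the code expects.
// It is deliberately independent of the agent version string.
const codeSchemaVersion = 3

// needsMigration reports whether the stored schema is older than the one
// the running code expects.
func needsMigration(doc schemaVersionDoc) bool {
	return doc.Version < codeSchemaVersion
}

func main() {
	stored := schemaVersionDoc{ID: "schema-version", Version: 2}
	fmt.Println("migration needed:", needsMigration(stored))
}
```

With this shape, a minor release that doesn't touch the schema leaves
codeSchemaVersion alone and no null upgrade is run.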

>
> - Introduce a MasterStateServer upgrade target which marks upgrade steps
> which are only to run on the master state server. Also more below.
>

This is just a compiled-in list of steps to apply, right?
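
To check my understanding, this is roughly what I picture, as a Go sketch.
The type and constant names below are illustrative only and are not taken
from the upgrades package; the point is just that the steps are a static
list and each agent filters it by the targets it holds.

```go
package main

import "fmt"

// Target names the kind of agent an upgrade step applies to.
// These names are assumptions made for the example.
type Target string

const (
	AllMachines       Target = "allMachines"
	StateServer       Target = "stateServer"
	MasterStateServer Target = "masterStateServer" // the proposed new target
)

// Step is one compiled-in upgrade step.
type Step struct {
	Description string
	Targets     []Target
	Run         func() error
}

// stepsFor returns, in their original order, the steps that apply to an
// agent holding the given targets. A step is included at most once.
func stepsFor(all []Step, agentTargets ...Target) []Step {
	has := make(map[Target]bool, len(agentTargets))
	for _, t := range agentTargets {
		has[t] = true
	}
	var out []Step
	for _, s := range all {
		for _, t := range s.Targets {
			if has[t] {
				out = append(out, s)
				break
			}
		}
	}
	return out
}

func main() {
	noop := func() error { return nil }
	all := []Step{
		{Description: "fix up local config", Targets: []Target{AllMachines}, Run: noop},
		{Description: "migrate schema", Targets: []Target{MasterStateServer}, Run: noop},
	}
	// A non-master state server never sees the schema step.
	for _, s := range stepsFor(all, AllMachines, StateServer) {
		fmt.Println("non-master runs:", s.Description)
	}
	for _, s := range stepsFor(all, AllMachines, StateServer, MasterStateServer) {
		fmt.Println("master runs:", s.Description)
	}
}
```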

>
> During an upgrade:
>
> - State machine agents get upgraded first - other agents don't see the new
> agent version until the state machine agents are running the new version (I
> have double-checked that this is already in place)
>
> - When state machines are restarted following an upgrade, don't allow
> other agents to Login until the upgrade is finished. Allow Client
> connections however (for monitoring during upgrades as discussed by John).
>
> - Non-master JobManageEnviron machine agents run their upgrade steps as
> usual and then watch for EnvironConfig changes. They don't consider the
> upgrade to be complete (and therefore let their other workers start) until
> database-version matches agent-version. This prevents the new version of
> the state server agents from running before the schema migrations for the
> new software version have run.
>

I'm not sure if schema should be done before or after other upgrade steps.
Given we're really stopping the world here, it might be prudent to just
wait to do your upgrade steps until you know that the DB upgrade has been
done.
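
A small Go sketch of that ordering, under the assumption that a non-master
agent can watch the schema version as a stream of values. The channel is a
stand-in for whatever watcher we end up with, and waitForSchema is a made-up
name.

```go
package main

import (
	"errors"
	"fmt"
)

// waitForSchema blocks until the watched schema version reaches want and
// only then runs the agent's own upgrade steps. The changes channel is a
// hypothetical stand-in for a watcher on the version document.
func waitForSchema(changes <-chan int, want int, runSteps func() error) error {
	for v := range changes {
		if v >= want {
			return runSteps()
		}
	}
	return errors.New("watcher closed before schema reached the wanted version")
}

func main() {
	changes := make(chan int, 2)
	changes <- 1 // old schema, keep waiting
	changes <- 2 // master has finished the schema migration
	close(changes)

	err := waitForSchema(changes, 2, func() error {
		fmt.Println("running local upgrade steps")
		return nil
	})
	fmt.Println("err:", err)
}
```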

>
> - The master machine agent waits for the other JobManageEnviron agents to
> have upgraded to the new version and then runs the upgrade steps including
> those for the MasterStateServer upgrade target (these steps will be the
> schema upgrades). The wait ensures that the database schema isn't upgraded
> before a state server is ready for the changes (this could happen if a
> non-master state server is slow or the master is especially fast).
>
> - Once the master machine agent has completed its upgrade steps it updates
> database-version in EnvironConfig to the new Juju version. This signals to
> the other state machine agents that they can complete their upgrades,
> allowing their workers to start.
>
> - At this point, all state servers have completed their upgrades and allow
> agent logins.
>
> - Now that agents can connect to the upgraded state servers, they see the
> new agent version and upgrade themselves (this functionality already exists)
>
>
> *Observations/Questions/Issues*
>
> - There are a lot of moving parts here. What could be made simpler?
>
> - What do we do if the master mongo database or host fails during the
> upgrade? Is it a goal for one of the other state servers to take over and run
> the schema upgrades itself and let the upgrade finish? If so, is this a
> must-have up-front requirement or a nice-to-have?
>

Some thoughts:

   1. If the actual master mongo DB fails, that will cause reelection,
   which should cause all of the servers to get their connections to Mongo
   bounced, and then they'll notice that there is a new master who is
   responsible for applying the database changes.
   2. If it is just the master Juju process that fails, I don't think there
   is any great expectation that a different process running the same code is
   going to succeed, is there?
   3. There is also a fair possibility that the schema migration we've
   written won't work with real data in the wild. (we assumed this field was
   never written, but suddenly it is, etc). We've talked about the ability to
   have Upgrade roll back, and maybe we could consider that here. Some
   possible steps are:
      1. Copy the db to another location
      2. Try to apply the schema updates (either in place or only to the
      backup)
      3. If upgrade fails, roll back to the old version, and update the
      AgentVersion in environ config so that the other agents will try to
      "upgrade" themselves back to the old version. This would also be a reason
      to do the DB schema before actually applying any other upgrade steps. We
      probably want some sort of "could not upgrade because of" tracking here,
      so that it can be reported to the user
   4. As long as we do some sort of "backup before applying the change" we
   allow users a way to recover the system if something failed. If we have
   proper Backup support integrated into core, one option is that we just
   trigger a backup and then upgrade in place, if stuff breaks, we at least
   have *something* that should be recoverable.
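
The copy / apply / roll back steps above could look something like this in
Go. It is only a sketch: the db type is an in-memory stand-in rather than
mongo, it uses the "apply only to the copy" variant, and every name in it is
invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// db is an in-memory stand-in for the database: collection name -> documents.
type db map[string][]map[string]interface{}

// errBadData is an example failure a migration might hit with real data.
var errBadData = errors.New("field we assumed was never written is present")

// clone copies every document so the copy can be changed safely.
// Nested values inside a document are not deep-copied in this sketch.
func (d db) clone() db {
	out := make(db, len(d))
	for name, docs := range d {
		copied := make([]map[string]interface{}, len(docs))
		for i, doc := range docs {
			c := make(map[string]interface{}, len(doc))
			for k, v := range doc {
				c[k] = v
			}
			copied[i] = c
		}
		out[name] = copied
	}
	return out
}

// migrateWithRollback applies migrate to a copy of d. On success the
// migrated copy is returned. On failure the untouched original is returned
// together with a reason that can be reported to the user.
func migrateWithRollback(d db, migrate func(db) error) (db, error) {
	work := d.clone()
	if err := migrate(work); err != nil {
		return d, fmt.Errorf("could not upgrade because of: %w", err)
	}
	return work, nil
}

func main() {
	orig := db{"machines": {{"_id": "0", "series": "trusty"}}}
	_, err := migrateWithRollback(orig, func(d db) error {
		d["machines"][0]["series"] = "half-migrated"
		return errBadData
	})
	fmt.Println(err)
	fmt.Println("original still intact:", orig["machines"][0]["series"])
}
```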

> - Upgrade steps currently have access to State but I think this probably
> won't be sufficient to perform many types of schema migrations (i.e.
> accessing defunct fields, removing fields, adding indexes etc). Do we want
> to extend State to provide a number of schema migration helpers or do we
> expose mongo connections directly to the upgrade steps?
>

I believe the existing Upgrade logic actually has access to the API not to
State itself, so we'll need something there. The State object has raw mongo
collections on it (environs, charms, etc).
DB Schema (IMO) inherently is going to be at the raw DB level, vs changes
in the abstract objects. (I expect that it will be defined in terms of
"apply this function to all entities in this collection", rather than
"iterate over Machine objects and set data on them".)
I could be wrong, but it does seem like we'll want the syntax of db schema
changes to be on mgo.Collection objects, and not on State objects.
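
Something like the following Go sketch is what I have in mind for "apply
this function to all entities in this collection". To keep it
self-contained it uses an in-memory rawCollection instead of a real
mgo.Collection, and rawDoc, applyToAll and dropField are all hypothetical
helpers, not anything that exists today.

```go
package main

import "fmt"

// rawDoc is an untyped document, as it would come back from the database
// without going through the State entity types.
type rawDoc map[string]interface{}

// rawCollection is an in-memory stand-in for a database collection.
type rawCollection struct {
	docs []rawDoc
}

// applyToAll calls fn on every raw document, stopping at the first error.
// It returns how many documents fn reported as changed.
func applyToAll(c *rawCollection, fn func(rawDoc) (bool, error)) (int, error) {
	changed := 0
	for _, d := range c.docs {
		ok, err := fn(d)
		if err != nil {
			return changed, err
		}
		if ok {
			changed++
		}
	}
	return changed, nil
}

// dropField returns a migration function that removes a defunct field.
// Documents that never had the field are left alone, so the migration can
// be re-run safely.
func dropField(name string) func(rawDoc) (bool, error) {
	return func(d rawDoc) (bool, error) {
		if _, ok := d[name]; !ok {
			return false, nil
		}
		delete(d, name)
		return true, nil
	}
}

func main() {
	machines := &rawCollection{docs: []rawDoc{
		{"_id": "0", "defunct": true},
		{"_id": "1"},
	}}
	n, err := applyToAll(machines, dropField("defunct"))
	fmt.Println("changed:", n, "err:", err)
}
```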

>
> - There is a possibility that a non-master state server won't upgrade,
> blocking the master from completing the upgrade. Should there be a timeout
> before the master gives up on state servers upgrading themselves and
> performs its own upgrade steps anyway?
>

Arguably this is a better case for "rollback" than "just move forward".
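
In Go terms, a sketch of the decision might be the following. The ready
channel, the awaitPeers name and the two outcomes are assumptions made for
the example; how the master really learns that a peer has upgraded is not
shown.

```go
package main

import (
	"fmt"
	"time"
)

type outcome string

const (
	proceed  outcome = "proceed"
	rollback outcome = "rollback"
)

// awaitPeers waits until want distinct state servers have reported the new
// version on ready. If the timeout elapses first, or the channel is closed,
// it asks for a rollback instead of pressing on with the schema upgrade.
func awaitPeers(ready <-chan string, want int, timeout time.Duration) outcome {
	deadline := time.After(timeout)
	seen := make(map[string]bool)
	for len(seen) < want {
		select {
		case id, ok := <-ready:
			if !ok {
				return rollback
			}
			seen[id] = true
		case <-deadline:
			return rollback
		}
	}
	return proceed
}

func main() {
	ready := make(chan string, 1)
	ready <- "machine-1" // machine-2 never reports in
	fmt.Println(awaitPeers(ready, 2, 50*time.Millisecond))
}
```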

>
> - Given the order of documents a juju system stores, it's likely that the
> schema migration steps will be quite quick, even for a large installation.
>
>
"order of magnitude" right?
Yeah, we're talking megabytes, with a GB being really large, not many GB of data.

>
> I'm new to all this so please chime in with your suggestions and
> corrections. Team Onyx (well, me at least) is likely to start on this next
> week.
>
> - Menno
>
> p.s. I'm out until Tuesday so it's unlikely I'll see (or at least respond
> to) any replies until then.
>
>
John
=:->

>
>
>
>
>
>
> On 16 May 2014 17:29, John Meinel <john at arbash-meinel.com> wrote:
>
>> So I'm pretty sure the ability to do schema upgrades was scoped in the HA
>> work that Nate is working on. I believe the idea is roughly:
>>
>> 1) State machines get restarted first (there is code today to not let
>> other agents notice that there is an upgrade ready until the state machine
>> itself has been upgraded)
>>
>> 2) When the state machine is restarted, don't allow other agents to Login
>> until the API Worker has connected to the API Server. This should mean that
>> we are in a relatively quiet mode.
>>
>> 2a) For reasons I'll describe later, still allow Client connections.
>>
>> 3) When the APIWorker comes up and sees that there is an Upgrade that
>> just happened and Upgrade actions are pending, wait to start the upgrade
>> steps until we observe that all other machines with JobManageEnviron have
>> been upgraded to the new version. Continue blocking Login requests.
>>
>> 4) Once all JobManageEnviron machine agent docs report the correct Agent
>> Version, then the Singleton Upgrader (so the one running on the machine
>> with the master Mongo DB), applies the DB Schema Upgrade steps.
>>
>> There is a potential failure mode where some of the API Servers
>> either don't upgrade themselves, or they stop themselves and never come back
>> up again, and the system as a whole can get wedged. Which is why we still
>> allow Client connections.
>>
>> It means that changes to the DB *could* be occurring, but I think we need
>> that ability. Arguably we'd love to restrict what could possibly be
>> requested to only those things that could fix the bugs in Upgrade. (If we
>> ever get nice Facades in place for subsets of Client functionality, this
>> might actually be possible.) However, I believe we won't be running the
>> Provisioning agent and other such workers (as they should all be
>> startAfterUpgrade). So the changes to the DB shouldn't actually trigger
>> anything until the schema has been migrated.
>>
>> Misc:
>> I believe the discussion for actually applying the schema update is that
>> we should clone all of mongo (possibly stopping it), and then apply the
>> schema changes. We could apply them in the duplicate and then switch over,
>> but I think that is going to be a lot trickier with how we do things (we
>> normally have exactly one set of named collections to think about, and only
>> one DB port that we can connect to, etc).
>>
>> So I think it is still reasonable to run a "backup" set of tasks once we
>> actually have a reliable built-in backup mechanism, and then just upgrade
>> in place. We have a copy on the side if we need it.
>>
>> I could even go as far as having the DB Schema explicitly versioned and
>> agents knowing what version of the schema they want to use. And use that
>> information to trigger the step (3) above. And if you make the last step of
>> DB Schema Upgrades be to update the version, even if you get interrupted
>> during upgrades, you should be able to restart.
>>
>> Thoughts?
>> John
>> =:->
>>
>>
>> On Fri, May 16, 2014 at 7:19 AM, Tim Penhey <tim.penhey at canonical.com>
>> wrote:
>>
>>> Hi folks,
>>>
>>> We have been talking about how to handle database schema updates for a
>>> long time, and many times things were deferred as we had clients
>>> connecting directly to the database.
>>>
>>> Due to the hard work of the team, we are now in a place where this is no
>>> longer the case as all external connections come through the API server.
>>>
>>> We also now have a central place in the code where upgrade steps should
>>> live (the upgrades package).
>>>
>>> For some upcoming work we have around identity, we need to update the
>>> schema of existing documents. A big change is changing the identity
>>> field (_id).  This change is going to need to be done for almost all the
>>> documents in the database as we move to work on the multi-environment
>>> state server task.  Many of the existing documents, like machine,
>>> service and unit use the id or name of the entity in the environment as
>>> the primary (_id) key.  This isn't going to work in a state server that
>>> has multiple environments.
>>>
>>> So... handling schema upgrades is a very urgent need for the juju team
>>> this cycle.  I thought it best to reach out prior to putting too much
>>> work into it to solicit ideas that people may already have had, or work
>>> that has been done elsewhere.
>>>
>>> Cheers,
>>> Tim
>>>
>>> --
>>> Juju-dev mailing list
>>> Juju-dev at lists.ubuntu.com
>>> Modify settings or unsubscribe at:
>>> https://lists.ubuntu.com/mailman/listinfo/juju-dev
>>>
>>
>>
>>
>>
>