opaque ids vs. natural keys

Wed May 29 05:15:18 UTC 2013

Hey folks,

I have been having a bit of an internal war (with myself) around
document ids and business model keys.  I want to bring this up and
offer some potential solutions.

With the work going on around containers, it was suggested (and
generally agreed upon) that we have machine ids that can be used to
infer container type as well as parent child relationships.

So machine ids:

  "0" is a machine created by the environment provider
  "0/lxc/0" is the first lxc container on machine "0"

  "3/lxc/2" has a parent of "3" and a container type of "lxc"

  The children of "3" are those whose machine IDs match "^3/\w+/\d+$"
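
To make the rules above concrete, here is a minimal sketch of how the
parent, container type and children could be derived from an id string.
The function names (parent_id, container_type, child_pattern) are made up
for illustration and are not taken from the codebase; the sketch also
assumes nested ids such as "3/lxc/2/lxc/0" are possible, which the
examples above do not confirm.

```python
import re


def parent_id(machine_id):
    """Return the parent machine id, or None for a top-level machine."""
    parts = machine_id.split("/")
    if len(parts) < 3:
        return None
    # Drop the trailing "<container type>/<number>" pair.
    return "/".join(parts[:-2])


def container_type(machine_id):
    """Return the container type, or None for a top-level machine."""
    parts = machine_id.split("/")
    if len(parts) < 3:
        return None
    return parts[-2]


def child_pattern(parent):
    """Build the regex that matches direct children of the given machine."""
    return re.compile("^" + re.escape(parent) + r"/\w+/\d+$")


# Hypothetical set of machine ids, including a nested container and an
# unrelated machine whose id merely starts with the same digit.
ids = ["0", "0/lxc/0", "3", "3/lxc/2", "3/lxc/2/lxc/0", "30/lxc/1"]
children_of_3 = [i for i in ids if child_pattern("3").match(i)]
```

Note that the pattern only picks up direct children: neither the nested
container nor machine "30" matches.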

I'm fine with all of this, provided that the machine ID has nice quick
indices created for it.

However, what concerns me more is the overloading of semantics onto
the primary key of the machine document in mongo.  To me it just feels
wrong, and Ian shares this feeling, although more strongly than I do.

We have special logic that has to deal with not reusing machine
numbers, when in reality it would be much simpler to have a unique
document id provided by mongo itself (are the default IDs guaranteed to
be unique?).

If we had some opaque *real* unique primary key type fields, this would
also get around the problem identified by these steps:
 * create a new service called blog, and deploy a few units
 * destroy the blog service
 * create a different service called blog, and deploy a few units
 * blog/0 now refers to a different unit than blog/0 did before

If we had a unique opaque key that isn't overloaded with business
meaning, then this difference would be easily apparent.
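
As a rough illustration of the blog/0 scenario, here is a small in-memory
sketch.  It is not how state is actually stored: the Store class is
hypothetical, and a random UUID stands in for whatever id mongo would
generate for the document.  The point is only that the natural key
("blog/0") can repeat while the opaque "_id" tells the two units apart.

```python
import uuid


class Store:
    """Toy document store keyed by an opaque id (assumption: UUID stand-in)."""

    def __init__(self):
        self.units = {}  # opaque id -> unit document

    def add_unit(self, name):
        # The opaque id carries no business meaning; the user-facing
        # name lives in an ordinary field.
        doc_id = uuid.uuid4().hex
        self.units[doc_id] = {"_id": doc_id, "name": name}
        return doc_id

    def remove_unit(self, doc_id):
        del self.units[doc_id]


store = Store()
first = store.add_unit("blog/0")   # unit of the original blog service
store.remove_unit(first)           # the service is destroyed
second = store.add_unit("blog/0")  # unit of a different service, same name

same_name = store.units[second]["name"] == "blog/0"
same_identity = first == second
```

Here same_name is true but same_identity is false, so anything that
recorded the first id can tell it is no longer looking at the same unit.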

I can see the desire for having natural keys that we show the user, but
history has shown (over and over) that having opaque unique ids provides
a better identity story.

Is it time to change the documents so we have opaque id values for
machines, services and units?

Tim