opaque ids vs. natural keys
Ian Booth
ian.booth at canonical.com
Thu May 30 06:58:18 UTC 2013
I think we can add a natural key to Machine without introducing any schema
incompatibilities. From a business perspective, we really just want to store the
namespaced nested "path" to the machine. We can retain the current id as the
surrogate key, since we never reuse a machine id (if I understand correctly).
We can then provide a mechanism to get the natural key that, if the key is empty,
just returns the existing id, which is what the natural key of a non-nested
machine would be anyway.
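
To make the fallback concrete, here is a minimal sketch in Go. It is only an
illustration under assumptions: the Machine type is a simplified stand-in for
the real state document, and the ContainerPath field name and the "3/lxc/0"
path format are hypothetical, not anything that exists in the schema today.

```go
package main

import "fmt"

// Machine is a simplified, hypothetical stand-in for the state document.
type Machine struct {
	// Id is the existing surrogate key; it is assumed never to be reused.
	Id string
	// ContainerPath is an assumed optional field holding the namespaced
	// nested "path". It is empty for non-nested machines.
	ContainerPath string
}

// NaturalKey returns the nested path when one is set, and otherwise
// falls back to the existing Id.
func (m Machine) NaturalKey() string {
	if m.ContainerPath != "" {
		return m.ContainerPath
	}
	return m.Id
}

func main() {
	host := Machine{Id: "3"}
	nested := Machine{Id: "7", ContainerPath: "3/lxc/0"} // path format is illustrative

	fmt.Println(host.NaturalKey())   // falls back to the Id
	fmt.Println(nested.NaturalKey()) // uses the nested path
}
```

Because the new field is optional and empty for existing documents, old records
would keep working unchanged under this sketch, which is the sense in which no
schema incompatibility is introduced.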
On 29/05/13 18:48, William Reade wrote:
> FWIW, we do currently have opaque, unique (in-environment, anyway) keys for
> one entity: the relation. That has Key and Id fields (mapped in mongo, ever
> so helpfully, to "_id" and "id" respectively) where Key is the
> human-readable one (space-separated endpoint names in canonical order) and
> Id is just an environment-unique int [0]. It should be noted that we're
> probably *not* free to change the form of the "id" field, because it's
> exposed to charms; I think that I do favour env-unique entity ids in
> general, (that is, plain old ints rather than uuids), because we can always
> combine an env uuid with an env-unique entity id to get a universally
> unique entity id.
>
> Service just has Name -> _id, as does Unit, and those are problematic
> because while they're env-unique at any given *point* in time that property
> is not guaranteed over any given time *period*, and this is problematic for
> the GUI [1]. Machine has Id -> _id, and extending the semantics of Id
> simply puts it in the same situation as Service and Unit.
>
> So, basically, I'm making a consistency argument for extending machine _id
> in this way: that we currently use poor "primary keys", that encode
> important information primarily for human benefit, for every other entity.
> I don't think there's any argument *against* also providing parallel
> "primary keys" that are actually opaque and unique... *except* that it's a
> schema change for which we are not currently prepared (it's high on the
> list but not being actively developed). So, pragmatically, we can make
> valid progress on containerization without blocking on MV upgrades, and the
> only cost we thereby take on is to slightly increase the size of the
> "fix-the-ids" task we know we'll have to undertake before too long anyway.
>
> I'm still open to arguments that it's *fundamentally* bad to encode this
> information in a string rather than to break it out into separate fields --
> and if you can convince me of that I concede we have no option but to
> develop this in a parallel branch and suck up a merge once we have MV
> upgrades on trunk -- but I think that the PK argument is misplaced in our
> current situation.
>
> Thoughts?
>
> Cheers
> William
>
> [0] gaah: it's not indexed, so uniqueness is not actually enforced, but
> that at least is an easy fix (and the generation of unique values appears
> solid -- we can skip, but we can't dupe, AFAICT -- so that's probably ok).
>
> [1] and, indeed, for anything that watches service/unit deltas. It's not
> really a problem *internally*, because we can't open ourselves up to unit
> confusion by removing a service while any of its units still exist; and we
> don't have anything that watches all services... except the AllWatcher, I
> suspect, which was developed purely for the use of the GUI and is thus, in
> a sense, a protrusion of the GUI into core.
>
>
> On Wed, May 29, 2013 at 8:08 AM, John Arbash Meinel
> <john at arbash-meinel.com> wrote:
>
> ...
>>>>> I can see the desire for having natural keys that we show the
>>>>> user, but history has shown (over and over) that having opaque
>>>>> unique ids provides a better identity story.
>>>>>
>>>>
>>>> Having natural keys that the user sees and opaque, surrogate keys
>>>> for DB identity are not mutually exclusive :-)
>>>>
>>>>
>
> I view it a bit like the revno vs. revision_id split that bzr went with:
> revno: a context-sensitive 'natural' id that is reasonably easy for a
> human to grasp
> revision_id: the actual unique identifier for the object
>
> I especially like having UUIDs for machines when you start talking
> about cross-environment relations, since at that point you can start
> reasoning about any object in any environment without having to also
> pass around all the context for the identifier all the time.
>
> However, these all look very much like state-breaking changes, which
> we've been explicitly cautioned to be careful with.
>
> John
> =:->