RFC: mongo "_id" fields in the multi-environment juju server world
John Meinel
john at arbash-meinel.com
Fri Jul 4 05:42:57 UTC 2014
According to the mongo docs:
http://docs.mongodb.org/manual/core/document/#record-documents
"The field name _id is reserved for use as a primary key; its value must be
unique in the collection, is immutable, and may be of any type other than
an array."
That makes it sound like we *could* use an object for the _id field and do
_id = {env_uuid:, name:}
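To make the uniqueness point concrete, here is a minimal sketch that uses a plain Python dict as a stand-in for a collection. No mongo driver is involved, and `make_id`, `insert`, and `DuplicateKeyError` are hypothetical names invented for the illustration, so it only shows the idea of keying on the (env_uuid, name) pair, not how mongo itself stores or compares such an _id.

```python
class DuplicateKeyError(Exception):
    """Raised when an _id is already present in the stand-in collection."""


def make_id(env_uuid, name):
    # Composite _id: a sub-document rather than a plain string.
    return {"env_uuid": env_uuid, "name": name}


def insert(collection, doc):
    # Freeze the _id sub-document into a tuple so it can be a dict key.
    # (An assumption of this sketch; it says nothing about mongo internals.)
    key = tuple(sorted(doc["_id"].items()))
    if key in collection:
        raise DuplicateKeyError(key)
    collection[key] = doc


machines = {}
insert(machines, {"_id": make_id("env-a", "0"), "series": "trusty"})
# Same machine name, different environment: accepted.
insert(machines, {"_id": make_id("env-b", "0"), "series": "precise"})
```

Inserting a second document with `make_id("env-a", "0")` raises `DuplicateKeyError`, which is the behaviour we would want from the primary key.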
Though I thought the purpose of doing something like that was to allow
efficient sharding in a multi-environment world.
Looking here: http://docs.mongodb.org/manual/core/sharding-shard-key/
The shard key must be indexed (which is just fine for us w/ the primary _id
field or with any other field on the documents), and "The index on the
shard key *cannot* be a multikey index"
<http://docs.mongodb.org/manual/core/index-multikey/#index-type-multikey>.
I don't really know what that means in the case of wanting to shard based
on an object instead of a simple string, but it does sound like it might be
a problem.
Anyway, for purposes of being *unique* we may need to put environ uuid in
there, but for the purposes of sharding we could just put it on another
field and index that field.
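A rough sketch of that document shape, again in plain Python with no mongo driver: the _id carries the environ uuid for uniqueness, and the same uuid is repeated in its own field that could be indexed or used as a shard key. The ":" separator, the field names, and the `machine_doc` helper are assumptions made for the illustration, and the `by_env` dict only imitates what an index on that field would let us look up.

```python
def machine_doc(env_uuid, name):
    # _id is unique across environments; env_uuid is duplicated in its
    # own field so it can be indexed independently of _id.
    return {
        "_id": f"{env_uuid}:{name}",  # separator is an assumption
        "env_uuid": env_uuid,
        "name": name,
    }


docs = [
    machine_doc("env-a", "0"),
    machine_doc("env-a", "1"),
    machine_doc("env-b", "0"),
]

# Stand-in for an index on env_uuid: group document ids by environment.
by_env = {}
for doc in docs:
    by_env.setdefault(doc["env_uuid"], []).append(doc["_id"])
```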
John
=:->
On Fri, Jul 4, 2014 at 5:01 AM, Tim Penhey <tim.penhey at canonical.com> wrote:
> Hi folks,
>
> Very shortly we are going to start on the work to be able to store
> multiple environments within a single mongo database.
>
> Most of our current entities are stored in the database with their name
> or id fields serialized to bson as the _id field.
>
> As far as I know (and I may be wrong), if you are adding a document to
> the mongo collection, and you do not specify an _id field, mongo will
> create a unique value for you.
>
> In our new world, things that used to be unique, like machines,
> services, units etc, are now only unique when paired with the
> environment id.
>
> It seems we have a number of options here.
>
> 1. change the _id field to be a "composed" field where it is the
> concatenation of the environment id and the existing id or name field.
> If we do take this approach, I strongly recommend having the fields that
> make up the key be available by themselves elsewhere in the document
> structure.
>
> 2. let mongo create the _id field, and we ensure uniqueness over the
> pair of values with a unique index. One thing I am unsure about with
> this approach is how we currently do our insertion checks, where we do a
> "document does not exist" check. We wouldn't be able to do this as a
> transaction assertion as it can only check for _id values. How fast are
> the indices updated? Can having a unique index for a document work for
> us? I'm hoping it can if this is the way to go.
>
> 3. use a composite _id field such that the document may start like this:
> { _id: { env_uuid: "blah", name: "foo"}, ...
> This gives the benefit of existence checks, and real names for the _id
> parts.
>
> Thoughts? Opinions? Recommendations?
>
> BTW, I think that if we can make 3 work, then it is the best approach.
>
> Tim