Automatic environment filtering for DB queries

Menno Smits menno.smits at canonical.com
Tue Nov 18 21:44:14 UTC 2014


Thanks John and Tim. I really like these ideas, especially because it means
the team doesn't need to learn a new way of working (and remember to keep
using the new way). In most cases, the thing returned by getCollection()
will be able to be used in the same way as before, even though it'll
actually be a different type of thing. I'll try this approach out in the
next day or so.

I have thought about what can be done about ensuring env UUIDs are
correctly added or updated during inserts and updates but I don't think
there's anything practical we can do there. I think we need to rely on
diligence and testing to ensure that writes to the DB correctly handle
environment UUIDs.



On 18 November 2014 17:03, Tim Penhey <tim.penhey at canonical.com> wrote:

> Wrapping the collection objects that are returned from the
> `getCollection` method shouldn't be too hard and it could wrap the Find
> and FindId calls.  It would have fixed this missed call below:
>
> // aliveUnitsCount returns the number a alive units for the service.
> func aliveUnitsCount(service *Service) (int, error) {
>         units, closer := service.st.getCollection(unitsC)
>         defer closer()
>
>         query := bson.D{{"service", service.doc.Name}, {"life", Alive}}
>         return units.Find(query).Count()
> }
>
> However it is not just finding that we need to care about, but setting
> and updating the collections.  Hopefully testing would cover the cases
> where we aren't setting what we think we are setting, but that is much
> harder to catch as the main execution flow is to just "run these
> transaction operations".
>
> Tim
>
>
> On 18/11/14 16:45, John Meinel wrote:
> > I've had this around to think about for a bit. I think it is ok, though
> > sniffing an input parameter to change behavior seems brittle. Mostly
> > because the object isn't really design to work that way. Could we wrap
> > the objects so that we just have Find/FindId do the right thing to start?
> >
> > I suppose that is a fair amount of more work. I certainly do like the
> > idea of having a common pattern rather than requiring everyone to know
> > exactly whether this object is different than the rest.
> >
> > John
> > =:->
> >
> >
> > On Thu, Nov 13, 2014 at 8:10 AM, Menno Smits <menno.smits at canonical.com
> > <mailto:menno.smits at canonical.com>> wrote:
> >
> >     Team Onyx has been busy preparing the MongoDB collections used by
> >     Juju to support data for multiple environments. For each collection
> >     that needs to store data for multiple environments we have been
> >     adding the a "env-uuid" field to each document and prefixing the
> >     environment UUID to the document id, with the previous document id
> >     being moved to a new field. Apart from the document changes
> >     themselves, we've been adjusting the state package implementation to
> >     match the document changes.
> >
> >     Part of this task is ensuring that all DB queries correctly filter
> >     by environment so that we don't end up unintentionally leaking data
> >     across environments. To avoid opportunities for us to forget to add
> >     environment filtering to the DB queries using by Juju, sabdfl would
> >     like us to consider ways to make this filtering happen
> >     automatically. To this end, I propose we add the following methods:
> >
> >         func (s *State) Find(coll *mgo.Collection, sel bson.D) *mgo.Query
> >         func (s *State) FindId(coll *mgo.Collection, id string)
> *mgo.Query
> >
> >
> >     The idea is that almost all MongoDB queries performed by Juju would
> >     use these 2 methods. They would become the default way that we do
> >     queries, used even on collections that don't contain data for
> >     multiple environments.
> >
> >     Both methods would check the collection passed in against a fixed
> >     set of collections that use environment UUIDs. If the collection
> >     doesn't use environment UUIDs then the lookup is passed through to
> >     mgo unmodified. If the collection /does/ use environment UUIDs then
> >     Find() would automatically add the appropriate "env-uuid" field to
> >     the query selector.  Similarly, FindId() would automatically call
> >     docID() on the supplied id. Pretty simple.
> >
> >     If use of these methods becomes default way the team does DB queries
> >     in Juju code, then we greatly reduce the risk of leaking data
> >     between environments. They also allows us to remove one concern from
> >     each Find/FindId call site - as environment filtering is taken care
> >     of by these methods, it does not having to be repeated all
> >     throughout the codebase. To get us started, I intend to perform a
> >     mass-refactoring of all existing Find and FindId calls to use these
> >     new methods.
> >
> >     To make the proposal clearer, here's some examples:
> >
> >     Find call:   err := units.Find(bson.D{{"env-uuid": ...}, {"service":
> >     service}}).All{&docs)
> >       becomes:   err := st.Find(units, bson.D{{"service":
> >     service}}).All{&docs)
> >
> >     FindId call: err = units.FindId(w.st.docID(unitName)).One(&doc)
> >         becomes: err = w.st.FindId(units, unitName).One(&doc)
> >
> >     Does this sound reasonable? Is there another approach I should be
> >     considering?
> >
> >     - Menno
> >
> >
> >
> >     --
> >     Juju-dev mailing list
> >     Juju-dev at lists.ubuntu.com <mailto:Juju-dev at lists.ubuntu.com>
> >     Modify settings or unsubscribe at:
> >     https://lists.ubuntu.com/mailman/listinfo/juju-dev
> >
> >
> >
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.ubuntu.com/archives/juju-dev/attachments/20141119/73789c15/attachment.html>


More information about the Juju-dev mailing list