Zookeeper to MongoDB transition

Tue Jun 12 17:36:13 UTC 2012

Hello team,

One of our goals is to replace Zookeeper. We want to use MongoDB to
store charms, so using MongoDB for the things we'd normally use
Zookeeper for follows naturally. We would not introduce an additional
dependency, and storing all state in a single data store might allow
some potentially useful tighter coupling between pieces of data.

The initial idea was to implement a drop-in package that provides the
Zookeeper API on top of MongoDB, switch everything to it, and, once
things have stabilized, start trimming unneeded features and Zookeeper
idiosyncrasies, of which there are many. We have not abandoned this
plan, but we are exploring alternatives.

The problem with the above approach is that we introduce a new, relatively
complex layer for dubious benefit. Yes, we remove a dependency, but our
higher goal is to reduce overall complexity. First of all, we don't even
use Zookeeper that much:

  white:juju$ find . -name '*.go' | grep -v test |
  xargs -n1 9 grep '((zk)|(zookeeper)).*\(' | grep -v func |
  9 sed -e 's/^.*=//g' -e 's/zkConn/zk/g' -e 's/\.conn\./.zk./g' \
  -e 's/zookeeper/zk/g' | 9 grep '[^a-zA-Z]zk\.[A-Z]' |
  9 sed -e 's/^.*zk\.([A-Z].*)\(.*$/\1/g' -e 's/\(.*$//g' |
  sort | uniq -c | sort -nr
       18 IsError
       10 Create
        6 Get
        4 RetryChange
        4 Delete
        2 WorldACL
        2 ExistsW
        2 Exists
        2 Dial
        1 GetW
        1 Close
        1 ChildrenW
        1 Children

The internals of the state package are modeled after Zookeeper, for
example the whole topology node business. This is costing us some
complexity. If we do things slightly differently, for example if state
uses MongoDB directly, we might reduce that complexity and we would not
need another complex layer doing translation. Zookeeper forces us to
manage the topology manually, but if we choose the data types right, we
could make MongoDB do all this bookkeeping for free. There is no need to
maintain the topology when MongoDB could construct it on demand with
simple queries.
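
To make "construct it on demand" concrete, here is a minimal in-memory
sketch in Go. It is not a schema proposal: the Unit, Relation and Store
types, their fields and the sample data are all hypothetical, and plain
slices stand in for MongoDB collections so the example runs without a
server. The point is only that when each record carries its own
references, the topology is the result of a filter rather than a
separate node we have to keep in sync.

```go
package main

import (
	"fmt"
	"sort"
)

// Unit is a hypothetical record; the real state schema is undecided.
type Unit struct {
	Name    string // e.g. "wordpress/0"
	Service string // name of the owning service
	Machine int    // id of the machine the unit is assigned to
}

// Relation is a hypothetical record linking services by name.
type Relation struct {
	Key       string
	Endpoints []string
}

// Store stands in for two MongoDB collections (an assumption made so
// that the sketch is self-contained).
type Store struct {
	Units     []Unit
	Relations []Relation
}

// UnitsOf derives the units of a service by filtering on the Service
// field, the in-memory analogue of a simple query on that field.
func (s *Store) UnitsOf(service string) []string {
	var names []string
	for _, u := range s.Units {
		if u.Service == service {
			names = append(names, u.Name)
		}
	}
	sort.Strings(names)
	return names
}

// RelatedTo derives the services that share a relation with service.
func (s *Store) RelatedTo(service string) []string {
	seen := map[string]bool{}
	for _, r := range s.Relations {
		if !contains(r.Endpoints, service) {
			continue
		}
		for _, e := range r.Endpoints {
			if e != service {
				seen[e] = true
			}
		}
	}
	var out []string
	for name := range seen {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

func contains(list []string, want string) bool {
	for _, v := range list {
		if v == want {
			return true
		}
	}
	return false
}

// sampleStore returns made-up data for illustration.
func sampleStore() *Store {
	return &Store{
		Units: []Unit{
			{Name: "wordpress/1", Service: "wordpress", Machine: 2},
			{Name: "mysql/0", Service: "mysql", Machine: 3},
			{Name: "wordpress/0", Service: "wordpress", Machine: 1},
		},
		Relations: []Relation{
			{Key: "wordpress:db mysql:server", Endpoints: []string{"wordpress", "mysql"}},
		},
	}
}

func main() {
	s := sampleStore()
	fmt.Println(s.UnitsOf("wordpress"))
	fmt.Println(s.RelatedTo("wordpress"))
}
```

Nothing here maintains a topology: adding or removing a unit is a
single record change, and every view of the topology is recomputed from
the records when asked for.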

I have studied the state package today and I believe a transition to
the MongoDB API would make it significantly less complex. The functions
exported by the package map well to MongoDB operations, whereas today
the high-level exported functions dive into internals that perform
redundant namespace manipulation, which mgokeeper would then need to
undo.

I have not yet come up with MongoDB schemas for the state data, mainly
because the code is still foreign to me. I need to write a prototype
and see how it feels before I can come up with something sensible, but
I am open to suggestions on how to organize the types.

-- 
Aram Hăvărneanu