API Bulk Operations

Wed May 29 16:12:17 UTC 2013

Hi guys,

I decided to summarize what we discussed lately about API bulk
operations support.

1. Why do we need this?

We'll need bulk operations mostly for clients like the GUI or CLI
commands. A good example is "juju status": having to issue one or more
requests per entity is a bad design and won't be usable with large
environments. Currently we get all machines, and for each one we get
the status, whether its agent is alive, constraints, etc. Then we get
all services and each unit, along with their status, etc. That is a
lot of round-trips, and in a large environment (100 or more machines,
services, units, relations) it will take a very long time.

Currently all agents are written with the presumption they'll operate
on single entities (e.g. machine agent, machiner) or they get a list
of entities and operate on each one separately (e.g. provisioner,
firewaller). Each state call translated to an equivalent API call
requires a round-trip. This is inefficient and the agents can be
refactored to handle entities in bulk operations, like "get all
machines", "get the status of all these machines", "get the
constraints for these machines", as 3 separate operations.

2. Implementing support for bulk operations

Currently there are ways to support bulk operations on the
server-side, without changing the client-side of the API. So instead
of having entity-level entry points, like Unit, which needs an id
(name) and proxies state.Unit operations, we can have top-level
"services", like Units, which proxies state.Unit operations in bulk,
taking a list of ids (names) for each operation. In turn the
client-side will still have entity-level entry points, like Unit,
with more or less the same interface as state.Unit, but internally
they will call the server-side Units "service", passing a single id
(name) and so effectively operating on a single entity.
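
A rough sketch of that split, under assumptions: unitsService stands in
for the server-side Units "service", and the Life operation, LifeResult
type and in-memory map are made up for illustration. The real calls
would go over the websocket rather than a direct method call.

```go
package main

import (
	"errors"
	"fmt"
)

// LifeResult is a hypothetical per-entity result of a bulk Life call.
type LifeResult struct {
	Life  string
	Error error
}

// unitsService is a hypothetical server-side "Units" service. It always
// takes a list of unit names and returns one result per name, in order.
type unitsService struct {
	lives map[string]string
}

func (s *unitsService) Life(names []string) []LifeResult {
	results := make([]LifeResult, len(names))
	for i, name := range names {
		life, ok := s.lives[name]
		if !ok {
			results[i].Error = errors.New("unit " + name + " not found")
			continue
		}
		results[i].Life = life
	}
	return results
}

// Unit is the client-side entity-level entry point. Its interface looks
// like a single-entity one, but it calls the bulk service with one name.
type Unit struct {
	name string
	srv  *unitsService
}

func (u *Unit) Life() (string, error) {
	results := u.srv.Life([]string{u.name})
	if len(results) != 1 {
		return "", fmt.Errorf("expected 1 result, got %d", len(results))
	}
	return results[0].Life, results[0].Error
}

func main() {
	srv := &unitsService{lives: map[string]string{"wordpress/0": "alive"}}
	u := &Unit{name: "wordpress/0", srv: srv}
	life, err := u.Life()
	fmt.Println(life, err)
}
```

The caller of Unit.Life never sees the list-based protocol, which is
what lets the wire format be bulk-only while the client-side API stays
the same.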

3. What about error handling with bulk operations?

This is a non-trivial issue, so it needs to be defined. Consider the
following operations:

A. Get([id1, id2, ...]) -> []Result, error
B. Set([[id1, params1], [id2, params2], ...]) -> []error
C. Create([params1, params2, ...]) -> []Result, error

A. Covers most read operations, like Machines.Get(ids),
Machines.Status(ids), etc. We might get an error from state while
getting some or all entities. Or we might have a more serious issue
with the operation itself (e.g. the state connection dropped at the
beginning or half way through the operation). In the first case the
final error is nil, because the operation succeeded, even if only
partially, and returned results. In the second case there's a non-nil
error and no results. The results themselves have to be defined like
this:

type Result struct {
    Error error // non-nil only if this specific entity failed
    // Other fields
}

Each result can contain an error for this specific result.
The results have to be in the same order as the passed ids, so the
client can easily and quickly look up the result by index.
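
A minimal sketch of these semantics for a read operation, assuming a
Status call on a hypothetical Machines service. The connected flag,
the status map and the Status field on Result are invented for the
example; only the error convention comes from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// Result carries either the data for one entity or the error for it.
type Result struct {
	Error  error
	Status string // stands in for "other fields"
}

// machines is a hypothetical server-side Machines service.
type machines struct {
	connected bool              // simulates the state connection
	status    map[string]string // simulates what is stored in state
}

// Status returns one Result per id, in the same order as ids. The final
// error is non-nil only when the operation as a whole cannot run.
func (m *machines) Status(ids []string) ([]Result, error) {
	if !m.connected {
		return nil, errors.New("state connection dropped")
	}
	results := make([]Result, len(ids))
	for i, id := range ids {
		s, ok := m.status[id]
		if !ok {
			// A per-entity failure does not fail the whole call.
			results[i].Error = fmt.Errorf("machine %s not found", id)
			continue
		}
		results[i].Status = s
	}
	return results, nil
}

func main() {
	m := &machines{connected: true, status: map[string]string{"0": "started", "2": "pending"}}
	ids := []string{"0", "1", "2"}
	results, err := m.Status(ids)
	if err != nil {
		fmt.Println("bulk operation failed:", err)
		return
	}
	for i, r := range results {
		// results[i] always corresponds to ids[i].
		if r.Error != nil {
			fmt.Printf("machine %s: error: %v\n", ids[i], r.Error)
			continue
		}
		fmt.Printf("machine %s: %s\n", ids[i], r.Status)
	}
}
```

Because results are positional, the client needs no id field in each
Result and no map lookup to match answers to requests.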

B. Covers most update operations: "do this on a bunch of things and
tell me if each one was successful". Again, the returned errors match
the passed id/params pairs in both order and count.
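
As an illustration only, an update operation with this shape could look
like the sketch below. SetStatusArg, the machines type and SetStatus
are hypothetical names; the point is just that errs[i] always refers to
args[i], and a nil entry means that update succeeded.

```go
package main

import "fmt"

// SetStatusArg pairs an id with the parameters to apply to it
// (a hypothetical shape for one [id, params] pair).
type SetStatusArg struct {
	ID     string
	Status string
}

// machines is a hypothetical server-side Machines service.
type machines struct {
	status map[string]string
}

// SetStatus applies each update independently and returns one error slot
// per argument, in the same order. A nil slot means success.
func (m *machines) SetStatus(args []SetStatusArg) []error {
	errs := make([]error, len(args))
	for i, arg := range args {
		if _, ok := m.status[arg.ID]; !ok {
			errs[i] = fmt.Errorf("machine %s not found", arg.ID)
			continue
		}
		m.status[arg.ID] = arg.Status
	}
	return errs
}

func main() {
	m := &machines{status: map[string]string{"0": "pending", "1": "pending"}}
	args := []SetStatusArg{{"0", "started"}, {"42", "started"}, {"1", "stopped"}}
	for i, err := range m.SetStatus(args) {
		if err != nil {
			fmt.Printf("machine %s: failed: %v\n", args[i].ID, err)
			continue
		}
		fmt.Printf("machine %s: ok\n", args[i].ID)
	}
}
```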

C. Covers a few operations, like AddMachine, AddUnit,
unit.AssignToMachine. Same semantics as for A: each Result contains
both a potential error and any other fields as needed; the final error
is non-nil only when the bulk operation cannot be performed as a whole.

4. If the agents are not using bulk operations, why should we care?

Because supporting bulk operations on the API server-side requires
changes to the protocol exposed over the websocket, and even though we
can mask this for juju-core by keeping the client-side API the same,
we can't for other potential clients, like the GUI.
So we need to think about how we are going to implement it now, even
if we won't need it immediately.

5. Can we have both bulk and single operations in the API?

I don't see why all the operations *have* to be bulk. There are lots
of examples of APIs out there that mix bulk and single operations;
they are widely used and their users are happy. OpenStack comes to
mind, also AWS.
But having bulk operations in the first place brings up all the
aforementioned things, which need to be resolved.

Sorry this came out a lot longer than I expected, but I think I
covered all the important bits.

Let's have the discussion and decide which implementation to use now,
before the API is used everywhere and it's hard to go back.

Cheers,
Dimiter