API Bulk Operations

Thu May 30 07:58:02 UTC 2013

On Wed, May 29, 2013 at 6:12 PM, Dimiter Naydenov <
dimiter.naydenov at canonical.com> wrote:

> 3. What about error handling with bulk operations?
>
> This is a non-trivial issue, so needs to be defined. Consider the
> following operations:
>
> A. Get([id1, id2, ...]) -> []Result, error
> B. Set([[id1, params1], [id2, params2], ...]) -> []error
> C. Create([params1, params2, ...]) -> []Result, error
>
> A. Covers most read operations, like Machines.Get(ids),
> Machines.Status(ids), etc. We might get an error from state while
> getting some or all entities. We might have some more serious issue
> with the operation itself (i.e. state connection dropped at the
> beginning or half way in the operation). In the first case the final
> error is nil, because the operation succeeded, if partially, and
> returned a result. In the second case there's a non-nil error and no
> results. The results themselves have to be defined like this:
>
> type Result struct {
>     Error error // per-entity error; nil on success
>     // Other fields
> }
> Each result can contain an error for this specific result.
> The results have to be in the same order as the passed ids, so the
> client can easily and quickly lookup the result by index.
>

For consideration: I think there are several cases (e.g. all the ones
we've mentioned so far) where a bulk Get has only one sane per-entity
error. In those cases a Result-with-Error is more heavyweight than we
need: if we're getting, say, units, we could plausibly just return a
[]*Unit, with nil indicating NotFound. Any other error encountered in this
context is, I think, a signal of a bulk-level problem (say the API server
loses its state connection halfway through: that's analogous to an HTTP
500, and can kill the whole operation).

Not every case will follow this pattern, but I don't think we should be
dogmatic about requiring result/error in every bulk-get result.
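
To make the nil-means-NotFound idea concrete, here is a rough sketch. The
Unit type, the in-memory store and errConnLost are hypothetical stand-ins
for illustration only, not the real state or API types:

```go
package main

import (
	"errors"
	"fmt"
)

// Unit is a hypothetical stand-in for whatever the API returns for a unit.
type Unit struct {
	Name string
}

// errConnLost is a hypothetical request-level failure.
var errConnLost = errors.New("state connection lost")

// store is a hypothetical in-memory stand-in for state.
type store struct {
	units map[string]*Unit
	down  bool
}

// GetUnits returns one entry per requested name, in the same order.
// A nil entry means "not found"; a non-nil error means the whole
// request failed and no results are returned.
func (s *store) GetUnits(names []string) ([]*Unit, error) {
	if s.down {
		return nil, errConnLost
	}
	results := make([]*Unit, len(names))
	for i, name := range names {
		results[i] = s.units[name] // a missing key yields nil
	}
	return results, nil
}

func main() {
	s := &store{units: map[string]*Unit{"wordpress/0": {Name: "wordpress/0"}}}
	units, err := s.GetUnits([]string{"wordpress/0", "mysql/7"})
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	for i, u := range units {
		if u == nil {
			fmt.Printf("%d: not found\n", i)
			continue
		}
		fmt.Printf("%d: %s\n", i, u.Name)
	}
}
```

The client still looks results up by index, as in the Result proposal, but
without a wrapper struct per entity.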

> B. Covers most update operations: "do this on a bunch of things and
> tell me if each one was successful". Again, the returned errors match
> in order and count the passed id/params pairs.
>

I think the signature should be -> []error, error, so that per-entity
errors are differentiated from request-level errors. Sane?
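
Roughly what I have in mind, as a sketch only. The setArg and settings
types below are made up for the example and are not proposed API names:

```go
package main

import (
	"errors"
	"fmt"
)

// setArg pairs an entity id with a new value (hypothetical).
type setArg struct {
	ID    string
	Value string
}

// settings is a hypothetical in-memory stand-in for state.
type settings struct {
	values map[string]string
	down   bool
}

// Set applies each update independently. The []error matches args in
// order and count; the final error is non-nil only when the request
// as a whole could not be performed.
func (s *settings) Set(args []setArg) ([]error, error) {
	if s.down {
		return nil, errors.New("state connection lost")
	}
	errs := make([]error, len(args))
	for i, arg := range args {
		if _, ok := s.values[arg.ID]; !ok {
			errs[i] = fmt.Errorf("entity %q not found", arg.ID)
			continue
		}
		s.values[arg.ID] = arg.Value
	}
	return errs, nil
}

func main() {
	s := &settings{values: map[string]string{"machine-0": "old"}}
	errs, err := s.Set([]setArg{
		{ID: "machine-0", Value: "new"},
		{ID: "machine-9", Value: "new"},
	})
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	for i, e := range errs {
		fmt.Printf("%d: %v\n", i, e)
	}
}
```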

> C. Covers a few operations, like AddMachine, AddUnit,
> unit.AssignToMachine. Same semantics as for A: Result contains both a
> potential error and any other fields as needed; error is not nil only
> when the bulk operation cannot be performed as a whole.
>

I don't believe any of these cases are API-relevant. We never use the
results, and anyone who's interested in them should/will be running
appropriate watches. Right?

The Pinger case is a red herring because we shouldn't have exposed it in
the first place ;p.

> 4. If the agents are not using bulk operations, why should we care?
>
> Because supporting bulk operations on the API server-side requires
> changes to the protocol exposed over the websocket, and even though we
> can mask this for juju-core by keeping the client-side API the same,
> we can't for other potential clients, like the GUI.
> So we need to think about how we are going to implement it now, even if
> we won't need it immediately.
>

Apart from anything else, it's about expectation and consistency. There may
always be individual cases where all we ever need are single operations;
but by implementing everything as bulk by default, we maintain a sane
mindset (and pay a *very* low practical cost by passing a single-element
[]params rather than a single params over the wire). Conversely, if we make
a habit of implementing single-element calls, we'll end up writing them
for operations that really should be bulk; people will start using them
anyway, and we'll end up carrying two APIs for the same purpose to little
benefit.

> 5. Can we have both bulk and single operations in the API?
>
> I don't see why all the operations *have* to be bulk. There are lots
> of examples of APIs like that out there, used a lot and their users
> are happy. Openstack comes to mind, also AWS.
> But having bulk operations in the first place brings up all the
> aforementioned things, which need to be resolved.
>

I think the cost of making everything consistently bulk is negligible
compared to the costs of having a mixture, and running the risk of getting
it wrong sometimes. A bulk op will always work with a single element, but
the converse does not hold.
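
As an illustration of how cheap the single-element case is, a client can
wrap the bulk call in a few lines. This is only a sketch: bulkLife, life
and errNotFound are hypothetical names, and the bulk call is faked with a
map rather than going over the wire:

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("not found")

// bulkLife is a hypothetical bulk call: one result per id, in order.
// An empty string stands for "not found".
func bulkLife(ids []string) ([]string, error) {
	known := map[string]string{"machine-0": "alive", "machine-1": "dying"}
	out := make([]string, len(ids))
	for i, id := range ids {
		out[i] = known[id]
	}
	return out, nil
}

// life is a client-side convenience wrapper: it sends a single-element
// bulk request and unpacks the single result.
func life(id string) (string, error) {
	results, err := bulkLife([]string{id})
	if err != nil {
		return "", err
	}
	if len(results) != 1 {
		return "", fmt.Errorf("expected 1 result, got %d", len(results))
	}
	if results[0] == "" {
		return "", errNotFound
	}
	return results[0], nil
}

func main() {
	for _, id := range []string{"machine-0", "machine-9"} {
		v, err := life(id)
		fmt.Println(id, v, err)
	}
}
```

Going the other way (building a bulk call out of single-element calls)
costs one round trip per entity, which is the cost we want to avoid.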

Cheers
William