API Bulk Operations

Tue Jun 4 16:14:20 UTC 2013

On Wed, May 29, 2013 at 5:12 PM, Dimiter Naydenov
<dimiter.naydenov at canonical.com> wrote:
>
> Hi guys,
>
> I decided to summarize what we discussed lately about API bulk
> operations support.
>
> 1. Why do we need this?
>
> We'll need bulk operations mostly for clients like the GUI or CLI
> commands. A good example is "juju status": having to issue one or more
> requests per entity is a bad design and won't be usable with large
> environments.

This is a strong assertion and I'd like to have seen it backed up with some
numbers. My concern about the outlined approach is
that it will be a considerable amount of work, resulting in a
more complex and less coherent API, for a gain
that we haven't actually measured. I realise that things are already in motion
to do this, so my concerns are probably voiced too late.
I'll say this for the record though.

I can see that making all API operations act on a vector
of objects rather than individual objects will reduce the
overall number of requests to the API server, but I wonder
if the much simpler approach of simply sending many
requests concurrently might actually be fine in practice.

- the only agents that could currently benefit from bulk operations
are the provisioner, the firewaller and juju status, I think.
AFAICS even with the existing design, query speed is
not the bottleneck - transaction execution is, and I'm
not sure I see how adding bulk operations to the API
can help there.

- the current design does not by any means preclude
bulk operations. We already have AllMachines, for example.
We could easily add bulk operations as necessary
(as an optimisation step) rather than complicating the
entire API by fitting it to everything.

- vectorised operations are not necessarily sufficient - they
assume a request is parallelised across a single axis.

- concurrent request issuing is more general than vectorised
requests and can actually enable workflows that are difficult
to attain with vectorised requests (for example pipelining
of heterogeneous requests).
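
A minimal Go sketch of the concurrent alternative described in the points above. Everything in it is hypothetical and for illustration only: the Request and Response types, the Caller function type, CallAll, and the method names are assumptions, not part of the juju API. It assumes a connection that can serve several requests in flight at once, and shows that the requests in one batch need not share a method or entity type.

```go
package main

import (
	"fmt"
	"sync"
)

// Request is a hypothetical single-entity API call. The names are
// illustrative only and are not taken from the juju API.
type Request struct {
	Method string // e.g. "Machine.Life" or "Unit.Status"
	ID     string // the entity the call applies to
}

// Response pairs a result with the request that produced it.
type Response struct {
	Req    Request
	Result string
	Err    error
}

// Caller stands in for an API connection that can serve many
// requests in flight at the same time (an assumption of this sketch).
type Caller func(Request) (string, error)

// CallAll issues every request concurrently and returns the responses
// in request order. The requests may mix methods and entity types.
func CallAll(call Caller, reqs []Request) []Response {
	out := make([]Response, len(reqs))
	var wg sync.WaitGroup
	for i, r := range reqs {
		wg.Add(1)
		go func(i int, r Request) {
			defer wg.Done()
			res, err := call(r)
			// Each goroutine writes only to its own slot.
			out[i] = Response{Req: r, Result: res, Err: err}
		}(i, r)
	}
	wg.Wait()
	return out
}

// fakeCall is a stand-in server used only to make the sketch runnable.
func fakeCall(r Request) (string, error) {
	if r.ID == "" {
		return "", fmt.Errorf("%s: empty id", r.Method)
	}
	return r.Method + "(" + r.ID + ")", nil
}

func main() {
	reqs := []Request{
		{Method: "Machine.Life", ID: "0"},
		{Method: "Unit.Status", ID: "wordpress/0"},
		{Method: "Machine.InstanceId", ID: "1"},
	}
	for _, resp := range CallAll(fakeCall, reqs) {
		fmt.Println(resp.Result, resp.Err)
	}
}
```

Whether this is fast enough in practice depends on the numbers asked for earlier; the sketch only shows that the client side of the approach is small.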

I like Ian's "coarse grained, business operations (defined as
verbs)" remark, but this leads to the question "how high level
do we want to go?". Why not have State.Provisioner and State.Firewaller
as API entry points, for example?

I'm afraid this feels like considerable churn for not much benefit.
We won't see any gain until the agents and the state
code have both been changed to take advantage of the new
bulk operations. Once we've done that, I think a legitimate question
might be "why didn't we just add extra API calls tailored
to the operations we're trying to perform rather than make
*everything* bulk and have an extra layer that pretends
we're operating on single objects?".
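
To make the "extra layer" in that question concrete, here is a minimal Go sketch under stated assumptions. The LifeResult type, the BulkLife function type, the Machine wrapper and fakeBulkLife are all hypothetical names invented for illustration; they are not the juju API. The sketch shows a bulk call that takes a vector of ids, and a client-side wrapper that hides the vector behind a single-object method.

```go
package main

import (
	"errors"
	"fmt"
)

// LifeResult is a hypothetical per-entity result of a bulk call.
type LifeResult struct {
	Life string
	Err  error
}

// BulkLife stands in for a vectorised API call: one request carrying
// many ids, answered with one result per id in the same order.
type BulkLife func(ids []string) ([]LifeResult, error)

// Machine is the client-side wrapper that presents a single object.
type Machine struct {
	id   string
	bulk BulkLife
}

// Life looks like a single-object call but sends a one-element vector
// and unpacks the one-element reply.
func (m *Machine) Life() (string, error) {
	results, err := m.bulk([]string{m.id})
	if err != nil {
		return "", err
	}
	if len(results) != 1 {
		return "", fmt.Errorf("expected 1 result, got %d", len(results))
	}
	return results[0].Life, results[0].Err
}

// fakeBulkLife is a stand-in server that knows only machine "0".
func fakeBulkLife(ids []string) ([]LifeResult, error) {
	out := make([]LifeResult, len(ids))
	for i, id := range ids {
		if id == "0" {
			out[i] = LifeResult{Life: "alive"}
		} else {
			out[i] = LifeResult{Err: errors.New("machine " + id + " not found")}
		}
	}
	return out, nil
}

func main() {
	m := &Machine{id: "0", bulk: fakeBulkLife}
	life, err := m.Life()
	fmt.Println(life, err)
}
```

Every single-object method needs the same pack, call and unpack steps, which is the layer the question refers to.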

My KISS sense is kicking in strongly on this one,
but since all is already agreed, I guess I can live with it.

  cheers,
    rog.