[RFC] strawman explicit object-tracking API

Martin Pool mbp at sourcefrog.net
Tue Apr 17 07:11:57 BST 2007


On 4/17/07, Robert Collins <robertc at robertcollins.net> wrote:
> So we have a connection cache in sftp which is undesirable because of
> its lack of a guarantee that we won't prompt users for credentials
> multiple times.

These changes would give us more explicit control over when connections
are opened and kept alive, which I agree is good, especially for ssh.
Should this also encompass making sure connections get closed?

> We have a related issue with repositories: we often want to share a
> repository between two branches - e.g. when doing 'update' in a branch,
> where the master is in the same shared repo, or 'pull' likewise. There's
> absolutely no need for us to have two Repository objects, nor to pay the
> performance cost of opening the Repository twice etc.

And having two objects makes it easy to trip over ourselves trying to
lock both...

> We could put a list of registered objects into each domain - e.g. have
> 'bzrlib.transport._open_transports', but I think this is pure ugly: It's
> not discoverable to users of the api, it's not good for threaded apps
> like servers, it makes testing complex.

I like the general idea of both allowing things to be recycled/reused,
and of making it not totally implicit/automatic.  It looks like in
normal use people would want things to always be reused within a
particular context - the whole process for the cli, one test, one
client connection in the server, something like that.

> Instead, I propose that we create an OpenFoo series of classes that track
> open objects for when we are using multiple objects.
>
> e.g. get_transport becomes
> def get_transport(url, open_transports=None):
>   if open_transports and is_absolute(url):
>     # give each open transport the chance to own 'url' if it can.
>     for transport in open_transports:
>         try:
>             return transport.clone(url)
>         except InvalidURLJoin:
>             pass
>   # existing get_transport code
>
> and similar for:
> Branch.open
> Repository.open
> WorkingTree.open
> BzrDir.open*
>
> Finally apis like 'Branch.update' which will open another branch
> *sometimes* will also accept such an object. Possibly we should allow
> Branch objects to carry a reference to OpenBranches for the user's state,
> which would reduce the number of apis that need changing to benefit,
> without making it magic and interfering with test isolation, servers,
> etc.
>
> I think we can do a reasonable job of this api using just dictionaries,
> but because there will likely be common code we want to factor out we
> should use a userdict from the get-go.

I basically like the idea, but I have some thoughts:

Why a dictionary rather than just a list?  What would the key be?

I'm not so sure this thing wants to support the whole dictionary
protocol, or that it should be a userdict.

It seems to me it would be a better fit to actually go to the 'recycler'
object and say "please get me a transport", and then it knows how to
ask each of the objects that are available for reuse.

Possibly this should be split out into a more general Session object
that holds all this per-context state.  get_transport and similar
top-level factories would then go off that object.  That may be an
overgeneralization.  I'm not sure you would want the lifecycle of
reuse of all these objects to be the same.
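
As a very rough sketch of the Session idea, assuming Python and using
made-up names throughout (Session, StubTransport and close() are all
hypothetical, not existing bzrlib API), something like this would give
one context its own set of reusable transports and a single place to
close them:

```python
class StubTransport(object):
    """Hypothetical stand-in for a real transport; not bzrlib code."""

    def __init__(self, url):
        self.url = url
        self.closed = False

    def close(self):
        self.closed = True


class Session(object):
    """Holds per-context state: one cli process, one test, one client."""

    def __init__(self):
        self._transports = []

    def get_transport(self, url):
        # Reuse a transport this session already opened for the same url.
        for transport in self._transports:
            if transport.url == url:
                return transport
        transport = StubTransport(url)
        self._transports.append(transport)
        return transport

    def close(self):
        # Assumption: ending the session closes everything it opened.
        for transport in self._transports:
            transport.close()
        self._transports = []
```

Two Session objects share nothing, so tests and server connections stay
isolated from each other.  Whether branches, repositories and
transports should all live and die with the same object is exactly the
open question.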

I'm not so keen on "OpenBranch" because "open" is ambiguous as an
adjective or verb, and it seems to describe the state of the branch
itself, and because these things may be useful even when the actual
connection is no longer open.

To implement these generically we seem to need two interfaces, something like

Reusable:

  reuse(args, kwargs):
     Try to reuse this object instead of constructing a new object
     with the given constructor arguments.  This can return self, or
     a new object based on self, or None if it's not possible.

ReusePool:

  create_or_reuse(args, kwargs):
     Used in place of constructing a new object with the given parameters.
     Either return an appropriate reused object, or construct and
     remember a new one.

  register(obj):
     Remember an object as a possible source of reuse.

  deregister(obj):
     ...
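
To make the two interfaces concrete, here is a minimal sketch in
Python.  It is only one possible reading of the outline above: the
factory argument to ReusePool, the FakeTransport class and its shared
"connection" attribute are assumptions made for illustration, and the
pool keeps a plain list because no obvious key suggests itself.

```python
class Reusable(object):
    """Interface for objects that can stand in for a new one."""

    def reuse(self, *args, **kwargs):
        """Return self, a new object based on self, or None."""
        return None


class ReusePool(object):
    """Remembers objects within one context and offers them for reuse."""

    def __init__(self, factory):
        self._factory = factory  # assumption: called when nothing fits
        self._objects = []

    def register(self, obj):
        if obj not in self._objects:
            self._objects.append(obj)

    def deregister(self, obj):
        self._objects.remove(obj)

    def create_or_reuse(self, *args, **kwargs):
        # Ask each remembered object whether it can satisfy the request.
        for candidate in self._objects:
            found = candidate.reuse(*args, **kwargs)
            if found is not None:
                return found
        # Nothing could be reused: construct and remember a new one.
        obj = self._factory(*args, **kwargs)
        self.register(obj)
        return obj


class FakeTransport(Reusable):
    """Hypothetical transport; 'connection' stands for a live socket."""

    def __init__(self, url, connection=None):
        self.url = url
        self.connection = connection if connection is not None else object()

    def reuse(self, url):
        if url == self.url:
            return self
        if url.startswith(self.url.rstrip('/') + '/'):
            # A new object based on self, sharing the connection.
            # It is not registered with the pool in this sketch.
            return FakeTransport(url, self.connection)
        return None
```

With pool = ReusePool(FakeTransport), calling
pool.create_or_reuse('sftp://host/repo') twice hands back the same
object, and asking for 'sftp://host/repo/branch' gives a new transport
that shares the first one's connection.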

Maybe there's an existing pattern name for this?

This could perhaps be used by transports to cache network connections,
and maybe also to remember credentials (eg for http).

-- 
Martin


