any more sftp fixes?
John A Meinel
john at arbash-meinel.com
Fri Dec 2 14:34:16 GMT 2005
Jan Hudec wrote:
> On Thu, Dec 01, 2005 at 20:53:24 -0600, John A Meinel wrote:
>> Robey Pointer wrote:
>>> On 1 Dec 2005, at 5:46, John A Meinel wrote:
>>>
>>>> Robey Pointer wrote:
>>>>> I'm not sure the weakref system will work, though -- I'd pondered that
>>>>> before, but in tests with the python CLI, the weak values were going
>>>>> away as soon as the last "real" reference to them died. If that's true
>>>>> generally, then the cache has an automatic expiration of zero seconds,
>>>>> making it a lot less useful. :)
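[A quick interpreter check of that behavior. The `Connection` class here is a hypothetical stand-in for a transport connection, not bzrlib code; under CPython's reference counting the cache entry does vanish the instant the last strong reference dies:]

```python
import weakref

class Connection(object):
    """Hypothetical stand-in for a transport connection object."""
    pass

cache = weakref.WeakValueDictionary()
conn = Connection()
cache['sftp://user@host/'] = conn
print('sftp://user@host/' in cache)  # True -- a strong reference still exists
del conn  # drop the last "real" reference
print('sftp://user@host/' in cache)  # False -- the entry expired immediately
```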
>>>>>
>>>> The weakref was mostly a starting point. We had some problems where
>>>> there were 2 paths to create a connection, and they weren't getting
>>>> shared.
>>>>
>>>> Long term, I wanted bzrlib to keep some sort of reference, so that a
>>>> front end could implement a longer-term caching policy. Such as keeping
>>>> a timeout list so that after a period of inactivity (say 1min/60min
>>>> whatever), they would be closed.
>>>>
>>>> Adding the weakref dictionary just means that bzrlib won't try to hold
>>>> onto them forever, but gives a place for other front-ends to acquire
>>>> them if it decides to create a different policy.
>>> I wonder if this maybe should be implemented at the Transport level,
>>> instead of per-transport (especially after hearing that FTPTransport is
>>> doing something similar). Is there any transport that would be hurt by
>>> such caching?
>> Because I want to cache at the connection level, not at the URL level.
>
> Right. So what about having a base class, say ConnectionRegistry, that
> all connection objects would inherit and that would cache them by their
> __init__ argument tuple? The transports would then have to explicitly
> use it, but the rest would be taken care of for them.
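[A minimal sketch of that idea in modern Python. The class names and the keying-by-constructor-arguments scheme are assumptions about the proposal, not existing bzrlib API; note that `__init__` still re-runs on a cache hit, which a real implementation would want to guard against:]

```python
class ConnectionRegistry(object):
    """Hypothetical base class that caches instances by __init__ args."""
    _instances = {}

    def __new__(cls, *args):
        key = (cls, args)
        if key not in ConnectionRegistry._instances:
            ConnectionRegistry._instances[key] = super().__new__(cls)
        return ConnectionRegistry._instances[key]

class SFTPConnection(ConnectionRegistry):
    """Hypothetical connection keyed by (host, user)."""
    def __init__(self, host, user):
        self.host, self.user = host, user

a = SFTPConnection('host', 'user')
b = SFTPConnection('host', 'user')   # same args -> same cached object
c = SFTPConnection('other', 'user')  # different args -> new object
print(a is b)  # True
print(a is c)  # False
```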
Something like this might be useful. I don't know that it has to be a
full object. Even just a blind:
def cache_connection(connection, info):
def get_connection(info):
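[Filled out, that "blind" pair of functions could be as small as a module-level dict. This is a sketch, not bzrlib code; the `info` key is assumed to be something hashable like a `(host, user)` tuple:]

```python
# Hypothetical module-level cache, keyed by connection info.
_connections = {}

def cache_connection(connection, info):
    """Register a live connection under its identifying info."""
    _connections[info] = connection

def get_connection(info):
    """Return a previously cached connection, or None if absent."""
    return _connections.get(info)

# Usage: two URLs on the same host/user share one connection object.
cache_connection('fake-sftp-connection', ('host', 'user'))
print(get_connection(('host', 'user')))   # 'fake-sftp-connection'
print(get_connection(('other', 'user')))  # None
```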
>
>> So if I connect to "sftp://user@host/something" it should reuse that
>> connection when I connect to "sftp://user@host/somethingelse/"
>
> But there is a connection object that is passed only the hostname and
> username. That object would be cached based on its constructor
> arguments.
>
>> So unless you require that all Transports fit exactly the URI spec with
>> user@host, then it needs to be done by each individual transport.
>>
>> That, and LocalTransport doesn't really benefit. And I'm not sure how
>
> Since it does not have a connection, it wouldn't use the cache. Right.
>
>> HttpTransport would benefit.
>
> It sure could, as HTTP/1.1 supports keepalive. Server-side timeouts have
> to be taken care of, though.
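[A self-contained demonstration of that keepalive reuse, using only the standard library. The local server and handler are illustration scaffolding, not anything from bzrlib; with `protocol_version = "HTTP/1.1"` and a `Content-Length` header, `http.client` sends both requests over the same socket:]

```python
import http.client
import http.server
import threading

class EchoHandler(http.server.BaseHTTPRequestHandler):
    protocol_version = 'HTTP/1.1'  # enables keep-alive by default
    def do_GET(self):
        body = self.path.encode()
        self.send_response(200)
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):
        pass  # keep the demo quiet

server = http.server.HTTPServer(('127.0.0.1', 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection('127.0.0.1', server.server_port)
bodies = []
for path in ('/a', '/b'):  # two requests, one reused connection
    conn.request('GET', path)
    resp = conn.getresponse()
    bodies.append(resp.read().decode())  # drain body before reusing socket
conn.close()
server.shutdown()
print(bodies)  # ['/a', '/b']
```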
>
> By the way, it's quite common for http transport to connect to a proxy.
> In which case it has only one connection even when accessing multiple
> servers.
>
> Well, there is another level. While you are not likely to want multiple
> connections to the same URL, multiple connections to the same server and
> user MIGHT be desired. You may want to parallelize access to two
> different repositories even when they are on the same server.
>
Yes, but do you want parallel, or just interwoven? Certainly for
something like SSH, which can handle pipelining (and HTTP, when it is in
the right mood), going parallel doesn't help if you are already maxing
out your bandwidth. And if latency is the problem, a pipelined
connection should already be interweaving the requests. (Obviously, if
each request blocks waiting for its response, two threads would do
better than one, but currently we are still in a single-threaded model.)
John
=:->