bzr log http+urllib does not work, http+pycurl is too slow

Wed Dec 12 14:48:26 GMT 2007

Vincent Ladeuil wrote:
>>>>>> "bialix" == Alexander Belchenko <bialix at ukr.net> writes:
> 
>     bialix> Vincent Ladeuil wrote:
>     >>>>>>> "vila" == Vincent Ladeuil <v.ladeuil+lp at free.fr> writes:
>     >> Of course I meant:
>     >> 
>     >> === modified file 'bzrlib/transport/http/_urllib2_wrappers.py'
>     >> --- bzrlib/transport/http/_urllib2_wrappers.py	2007-12-06 22:46:16 +0000
>     >> +++ bzrlib/transport/http/_urllib2_wrappers.py	2007-12-11 20:29:31 +0000
>     >> @@ -134,7 +134,15 @@
>     >> """
>     >> if not self.isclosed():
>     >> # Make sure nothing was left to be read on the socket
>     >> -            data = self.read(self.length)
>     >> +            pending = 0
>     >> +            while self.length and self.length > 1024:
>     >> +                data = self.read(1024)
>     >> +                pending += len(data)
>     >> +            if self.length:
>     >> +                data = self.read(self.length)
>     >> +                pending += len(data)
>     >> +            if pending:
>     >> +                trace.mutter('%s bytes left on the socket', pending)
> 
>     bialix> Yes, this helps. Now urllib works in the same manner as
>     bialix> pycurl. And shows how many bytes are left on the socket :-)
> 
> :-/
> 
> That, my friend, is some very bad news.... Well, the good news is
> that the patch fixes the bug at least...

I prefer to pick the good news from the cake.
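
For reference, the idea behind the patch (read whatever is left on the socket in small chunks instead of one large read, and count what was thrown away) can be sketched on its own. This is only an illustration, not the bzrlib code: the function name drain, its parameters, and the use of io.BytesIO as a stand-in for the HTTP response are all assumptions made for the example.

```python
import io


def drain(response, remaining, chunk_size=1024):
    """Read and discard up to `remaining` bytes from `response` in chunks.

    `response` is any object with a file-like read(n) method (hypothetical
    stand-in for an HTTP response). Returns the number of bytes discarded.
    """
    pending = 0
    while remaining > 0:
        # Never ask for more than one chunk, nor more than what is left.
        data = response.read(min(chunk_size, remaining))
        if not data:
            # The peer closed early; nothing more to discard.
            break
        pending += len(data)
        remaining -= len(data)
    return pending


# Example: 2500 unread bytes are discarded in three reads.
leftover = drain(io.BytesIO(b"x" * 2500), 2500)
```

Counting the discarded bytes is what makes the wasted bandwidth visible in the log.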

> And sorry for asking again, but can you do that one more time
> with -Dhttp so that I can diagnose more easily. There may be
> several GET requests for one readv (that should not be the case
> here, but seeing the ranges requested may help evaluate the
> wasted bandwidth :/ ).

Just sent. Do you need a similar log for the unpatched bzr version, to see 
where the traceback occurs?
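
To make "the ranges requested" concrete: a readv asks for a list of (offset, length) pairs, and an HTTP client can express them as a single Range header. The sketch below is a generic illustration of that mapping, not what bzr's http transport actually does; the function name range_header and the merging policy are assumptions, and it expects every length to be positive.

```python
def range_header(offsets):
    """Build an HTTP Range header value from (start, length) pairs.

    Adjacent or overlapping ranges are merged first, so fewer ranges
    are sent to the server. Hypothetical helper for illustration only.
    """
    merged = []
    for start, length in sorted(offsets):
        end = start + length - 1  # HTTP byte ranges are inclusive
        if merged and start <= merged[-1][1] + 1:
            # Touches or overlaps the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return "bytes=" + ",".join("%d-%d" % (s, e) for s, e in merged)


# Example: the first two pairs are adjacent and collapse into one range.
header = range_header([(0, 10), (10, 5), (100, 20)])
# header == "bytes=0-14,100-119"
```

A server that ignores the Range header and returns the whole file is the case where the bytes outside these ranges are wasted.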

> We have a bad situation here, because even if the http transport
> can reuse the whole file transferred inside one readv, it will
> not be able to reuse that file *across* several readv if we don't
> add a local cache (which we want to avoid for several reasons).

I understand the situation, but I don't know all the reasons why we avoid 
caching. I have no experience in this area, so I rely completely on your 
expertise here.

In my case the dummy server is a bad server. It's a limitation of the 
server, not of bzr itself. I can live with this bad situation and my dummy 
Trac installation because I have the option to run a smart server and 
because I'm inside my own local Windows network.

We need to file a bug, though, so that the problem is at least recorded. 
How to fix it is another story, IIUC.