bzr log http+urllib does not work, http+pycurl is too slow

Vincent Ladeuil v.ladeuil+lp at free.fr
Wed Dec 12 14:41:15 GMT 2007


>>>>> "bialix" == Alexander Belchenko <bialix at ukr.net> writes:

    bialix> Vincent Ladeuil writes:
    >>>>>>> "vila" == Vincent Ladeuil <v.ladeuil+lp at free.fr> writes:
    >> Of course I meant:
    >> 
    >> === modified file 'bzrlib/transport/http/_urllib2_wrappers.py'
    >> --- bzrlib/transport/http/_urllib2_wrappers.py	2007-12-06 22:46:16 +0000
    >> +++ bzrlib/transport/http/_urllib2_wrappers.py	2007-12-11 20:29:31 +0000
    >> @@ -134,7 +134,15 @@
    >> """
    >> if not self.isclosed():
    >> # Make sure nothing was left to be read on the socket
    >> -            data = self.read(self.length)
    >> +            pending = 0
    >> +            while self.length and self.length > 1024:
    >> +                data = self.read(1024)
    >> +                pending += len(data)
    >> +            if self.length:
    >> +                data = self.read(self.length)
    >> +                pending += len(data)
    >> +            if pending:
    >> +                trace.mutter('%s bytes left on the socket', pending)

    bialix> Yes, this helps. Now urllib works in the same manner as
    bialix> pycurl. And it shows how many bytes were left on the socket :-)

:-/

That, my friend, is some very bad news... Well, the good news is
that the patch at least fixes the bug...
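
For reference, the cleanup in the patch boils down to something
like this standalone sketch (assuming an httplib-style response
whose read() decrements .length, as in the diff above; draining
matters because with a keep-alive connection, any bytes left
unread on the socket would be mistaken for the start of the next
response):

    def drain_response(resp, chunk_size=1024):
        # Read whatever the caller left behind, in bounded chunks.
        pending = 0
        while resp.length and resp.length > chunk_size:
            data = resp.read(chunk_size)
            pending += len(data)
        if resp.length:
            data = resp.read(resp.length)
            pending += len(data)
        return pending  # bytes that were still sitting on the socket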

And sorry for asking again, but can you do that one more time
with -Dhttp so that I can diagnose more easily? There may be
several GET requests for one readv (that should not be the case
here, but seeing the ranges requested may help evaluate the
wasted bandwidth :/ ).
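
For example (the branch URL below is just a placeholder; the
muttered debug output goes to ~/.bzr.log by default):

    $ bzr -Dhttp log http+urllib://example.com/some/branch
    $ grep -i range ~/.bzr.log

The ranges requested for each GET should make it easy to see how
much of the file gets re-fetched.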

We have a bad situation here, because even if the http transport
can reuse the whole file transferred inside one readv, it will
not be able to reuse that file *across* several readv calls
unless we add a local cache (which we want to avoid for several
reasons).
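
To make that concrete, the kind of cache we want to avoid would
look roughly like the sketch below (names are hypothetical, and
it assumes a bzrlib-style transport whose readv(path, offsets)
yields (offset, data) pairs):

    class CachingTransport(object):
        """Hypothetical transport decorator caching readv results."""

        def __init__(self, transport):
            self._transport = transport
            # (path, start, length) -> data; note the unbounded growth,
            # and nothing invalidates entries if the file changes on
            # the server.
            self._cache = {}

        def readv(self, path, offsets):
            for start, length in offsets:
                key = (path, start, length)
                if key not in self._cache:
                    # Cache miss: this still costs one more GET.
                    for offset, data in self._transport.readv(
                            path, [(start, length)]):
                        self._cache[key] = data
                yield start, self._cache[key]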

    Vincent


