[MERGE] A hack to make urllib not call recv(1) lots and lots.

Robert Collins robertc at robertcollins.net
Mon Mar 17 04:55:53 GMT 2008


On Sun, 2008-03-16 at 23:26 -0500, Andrew Bennetts wrote:
> SuperMMX wrote:
> > Hi, Andrew Bennetts <andrew at canonical.com> :
> > 
> > On Sun, 16 Mar 2008 14:56:58 -0500
> > Andrew Bennetts <andrew at canonical.com> wrote:
> > 
> > > Pavel Pergamenshchik at PyCon noticed that "bzr branch http://..." on a large
> > > branch was using lots of CPU, and strace showed that it was making a huge number
> > > of recv(1) calls, reading the socket one byte at a time.  This seems to happen
> > > because httplib's HTTPResponse doesn't specify a buffer size for the
> > > fileobject it constructs from its socket, and it doesn't provide a way to
> > > override this.
> > > 
> > > This patch is a hack to fix this.  It's a bit dirty, but it massively reduces
> > > the number of recv calls bzr makes with urllib.
> > 
> > Here are rough numbers on Windows: without the patch, CPU usage is about
> > 45%; with the patch, it never exceeds 10%.
> > 
> > Is there any way to get a precise number?
> 
> Interesting.  In fact that's pretty surprising now that I've timed the results
> on my laptop.
> 
> FWIW, I just did a timing locally on my laptop.  Without the hack, it takes me
> 3m 15s to branch Twisted trunk out of my bzr-svn import of it (78M of history,
> using Apache as the server).  With the hack, it takes 3m 9s.  This is the branch
> that Pavel originally noticed the excessive recv calls on.
> 
> I did take some care to make sure the cache was hot for both runs, but this is
> still a pretty small difference, probably still within the natural variation on
> a not totally quiescent laptop.  In fact, branching direct from disk took 4m 22s
> (much longer!), so something is definitely weird.  Anyway, the timings don't
> suggest to me that the risk of a nasty hack is worth it for such an uncertain reward.
> 
> So,
> bb:reject
> 
> But if it turns out to make a big improvement on Windows, I'd be tempted to
> reconsider.

I would say a 35-point drop in CPU usage (45% to 10%) is significant :)

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
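
For readers who want to see the effect discussed in this thread, here is a
minimal sketch of why an unbuffered file object ends up issuing one low-level
read per byte. It is not the patch under review and does not touch httplib or
urllib. It uses Python 3's io module as a stand-in, and the names CountingRaw,
count_reads and PAYLOAD are hypothetical, made up for this illustration. The
raw stream plays the role of the socket, and each readinto() call corresponds
roughly to one recv() call.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical stand-in for a socket: counts low-level read requests."""

    def __init__(self, payload):
        self._src = io.BytesIO(payload)
        self.calls = 0  # number of readinto() calls, analogous to recv() calls

    def readable(self):
        return True

    def readinto(self, buf):
        self.calls += 1
        return self._src.readinto(buf)


def count_reads(payload, buffered):
    """Read payload line by line; return (lines read, low-level read calls)."""
    raw = CountingRaw(payload)
    # Without a buffer, readline() has to fetch one byte at a time because it
    # cannot look ahead for the newline. With a buffer, it fetches big chunks.
    stream = io.BufferedReader(raw, buffer_size=8192) if buffered else raw
    lines = 0
    while stream.readline():
        lines += 1
    return lines, raw.calls


PAYLOAD = b"header: value\r\n" * 200  # 3000 bytes of header-like lines

unbuffered = count_reads(PAYLOAD, buffered=False)
buffered = count_reads(PAYLOAD, buffered=True)

print("unbuffered: %d lines, %d low-level reads" % unbuffered)
print("buffered:   %d lines, %d low-level reads" % buffered)
```

Both runs read the same lines, but the unbuffered one makes at least one
low-level read per byte of payload while the buffered one needs only a
handful. Whether that translates into a noticeable wall-clock or CPU
difference depends on the platform, as the timings quoted above suggest.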