[MERGE] A hack to make urllib not call recv(1) lots and lots.
Robert Collins
robertc at robertcollins.net
Mon Mar 17 04:55:53 GMT 2008
On Sun, 2008-03-16 at 23:26 -0500, Andrew Bennetts wrote:
> SuperMMX wrote:
> > Hi, Andrew Bennetts <andrew at canonical.com> :
> >
> > On Sun, 16 Mar 2008 14:56:58 -0500
> > Andrew Bennetts <andrew at canonical.com> wrote:
> >
> > > Pavel Pergamenshchik at PyCon noticed that "bzr branch http://..." on a large
> > > branch was using lots of CPU, and strace showed that it was making a huge number
> > > of recv(1) calls, reading the socket one byte at a time. The cause appears to be
> > > that httplib's HTTPResponse doesn't specify a buffer size for the file object
> > > it constructs from its socket, and it doesn't provide a way to override this.
> > >
> > > This patch is a hack to fix this. It's a bit dirty, but it massively reduces
> > > the number of recv calls bzr makes with urllib.
> >
> > Here are rough numbers on Windows: without the patch, CPU usage is about
> > 45%; with the patch, it never exceeds 10%.
> >
> > Is there any way to get precise numbers?
>
> Interesting. In fact that's pretty surprising now that I've timed the results
> on my laptop.
>
> FWIW, I just did a timing locally on my laptop. Without the hack, it takes me
> 3m 15s to branch Twisted trunk out of my bzr-svn import of it (78M of history,
> using Apache as the server). With the hack, it takes 3m 9s. This is the branch
> that Pavel originally noticed the excessive recv calls on.
>
> I did take some care to make sure the cache was hot for both runs, but this is
> still a pretty small difference, probably still within the natural variation on
> a not totally quiescent laptop. In fact, branching direct from disk took 4m 22s
> (much longer!), so something is definitely weird. Anyway, the timings don't
> suggest to me that the risk of a nasty hack is worth it for such an uncertain reward.
>
> So,
> bb:reject
>
> But if it turns out to make a big improvement on Windows, I'd be tempted to
> reconsider.
I would say a drop of 35 percentage points in CPU usage (about 45% down to
under 10%) is significant :)
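
For anyone who wants to see the recv(1) effect described in the quoted report
without a network, here is a minimal sketch. It is not the patch under review,
and it uses Python 3's io module rather than httplib's internals; CountingRaw
and count_recv_calls are names made up for this illustration. It reads the same
header-like payload line by line from a socket, once through an unbuffered raw
reader and once through io.BufferedReader, and counts the recv calls each makes.

```python
import io
import socket


class CountingRaw(io.RawIOBase):
    """Hypothetical raw reader over a socket that counts its recv calls."""

    def __init__(self, sock):
        super().__init__()
        self._sock = sock
        self.recv_calls = 0

    def readable(self):
        return True

    def readinto(self, buf):
        # Every read on the raw stream turns into exactly one recv on the socket.
        self.recv_calls += 1
        return self._sock.recv_into(buf)


def count_recv_calls(payload, buffered):
    """Read payload line by line and return how many recv calls were needed."""
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(payload)
        sender.shutdown(socket.SHUT_WR)  # signal EOF to the reader
        raw = CountingRaw(receiver)
        # Assumption: 8192 is just an illustrative buffer size.
        reader = io.BufferedReader(raw, buffer_size=8192) if buffered else raw
        for _line in iter(reader.readline, b""):
            pass  # consume every line until EOF
        return raw.recv_calls
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    payload = b"".join(b"X-Header-%d: value\r\n" % i for i in range(20))
    print("payload bytes:   ", len(payload))
    print("unbuffered recvs:", count_recv_calls(payload, buffered=False))
    print("buffered recvs:  ", count_recv_calls(payload, buffered=True))
```

The unbuffered reader should need roughly one recv per byte, and the buffered
one only a handful; the exact counts depend on how the OS delivers the data.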
-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.