[MERGE] A hack to make urllib not call recv(1) lots and lots.

John Arbash Meinel john at arbash-meinel.com
Sun Mar 16 15:14:21 GMT 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andrew Bennetts wrote:
> Pavel Pergamenshchik at PyCon noticed that "bzr branch http://..." on a large
> branch was using lots of CPU, and strace showed that it was making a huge number
> of recv(1) calls to read bytes off the socket one byte at a time.  It seems this
> is caused because httplib's HTTPResponse doesn't specify a buffer size for the
> fileobject it constructs from its socket, and it doesn't provide a way to
> override this.
> 
> This patch is a hack to fix this.  It's a bit dirty, but it massively reduces
> the number of recv calls bzr makes with urllib.
> 
> -Andrew.
> 
> 

Are we sure that this is safe to do on all platforms?

Also, why do you have to do both:
+            self.fp._rbufsize = 8192

and
-            fp = socket._fileobject(r)
+            fp = socket._fileobject(r, bufsize=8192)

For example, on Windows I don't think it is safe to create the socket
in non-blocking mode and then do a read with a timeout. (I forget the
specific combination, but it was something like: in non-blocking mode you
get an immediate exception saying the read would block, so you have to
try again later.)
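
To make the "would block" behaviour concrete, here is a minimal sketch in
modern Python 3 (not the Python 2 code the patch touches). It uses a local
socket pair rather than a real HTTP connection, and the helper name
recv_or_none is made up for illustration. It only shows the general
non-blocking pattern; it does not reproduce whatever Windows-specific
combination I am half remembering.

```python
import select
import socket


def recv_or_none(sock, size):
    """Return received bytes, or None if a non-blocking read would block."""
    try:
        return sock.recv(size)
    except BlockingIOError:
        # No data is available yet; the caller has to try again later.
        return None


left, right = socket.socketpair()
right.setblocking(False)

# Nothing has been sent yet, so the read cannot be satisfied immediately.
first_try = recv_or_none(right, 8192)

left.sendall(b"ready")
# Wait (up to one second) until the socket is readable, then retry.
select.select([right], [], [], 1.0)
second_try = recv_or_none(right, 8192)

left.close()
right.close()
```

Any buffered file object layered on top of such a socket has to cope with
that exception somewhere, which is why I am asking about the platforms.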

Anyway, I like the idea, and I would be very curious to see the actual
effect of this.
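
A rough way to see the effect on a small scale, as a sketch only: this is
Python 3, not the socket._fileobject code from the patch, and the names
CountingSocketReader and count_recv_calls are hypothetical. It counts how
many times the socket is asked for data when reading lines through an
unbuffered reader versus one with an 8192-byte buffer.

```python
import io
import socket


class CountingSocketReader(io.RawIOBase):
    """Raw reader over a socket that counts calls to recv_into."""

    def __init__(self, sock):
        super().__init__()
        self._sock = sock
        self.recv_calls = 0

    def readable(self):
        return True

    def readinto(self, buf):
        # Every request for more bytes ends up as one recv on the socket.
        self.recv_calls += 1
        return self._sock.recv_into(buf)


def count_recv_calls(payload, line_count, buffer_size):
    """Read line_count lines; return (lines, number of recv calls)."""
    writer, reader_sock = socket.socketpair()
    try:
        writer.sendall(payload)
        raw = CountingSocketReader(reader_sock)
        # buffer_size == 0 means "read straight from the raw socket".
        if buffer_size == 0:
            stream = raw
        else:
            stream = io.BufferedReader(raw, buffer_size)
        lines = [stream.readline() for _ in range(line_count)]
        return lines, raw.recv_calls
    finally:
        writer.close()
        reader_sock.close()


PAYLOAD = b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"
unbuffered_lines, unbuffered_calls = count_recv_calls(PAYLOAD, 3, 0)
buffered_lines, buffered_calls = count_recv_calls(PAYLOAD, 3, 8192)
print("unbuffered recv calls:", unbuffered_calls)
print("buffered recv calls:  ", buffered_calls)
```

Without a buffer, readline has to pull one byte per call because it cannot
look ahead for the newline; with a buffer, one larger read serves several
lines. Numbers from the real bzr code path over a real network would still
be the thing to measure.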

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFH3TlNJdeBCYSNAAMRAn9lAKC5X48oP7PFvPC0OVE8rtuVJCvUsACfXGsU
CVPWWD+vjPKXNxffQ5Of5d0=
=cz9T
-----END PGP SIGNATURE-----


