1.6 fetch regression

John Arbash Meinel john at arbash-meinel.com
Thu Aug 28 01:38:53 BST 2008





...

>> Over the local network I also see:
>> $ time bzr.dev branch http://
>>   35.41s user 3.40s system 69% cpu 55.926 total
>> $ time bzr1.5 branch http://
>>   38.61s user 3.30s system 69% cpu 1:00.54 total
>>
>> (note that this repo is slightly different, but still a packed repo) I'm
>> actually quite surprised to see that bzr-1.5 branching over http:// is
>> *faster* than branching locally. (1m versus 1m15s).
> 
> async behaviour in pycurl, I'd guess.
> 
> -Rob

I should also mention that I did a bit of lsprof testing, and I see a rather
large amount of time spent in:

_StatefulDecoder.accept_bytes

And I was able to track it down to the line:

  self._in_buffer += bytes

My best guess is that the buffer is growing quite large, and the cost of
reallocating and copying it on every += is what hurts us.
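To illustrate the difference (this is a minimal sketch with made-up function
names, not bzrlib's actual decoder code): repeated concatenation can copy the
whole buffer on each call, so feeding N chunks costs O(N^2) byte copies in the
worst case, while accumulating in a list defers everything to a single O(N)
join.

```python
def accept_bytes_concat(chunks):
    # Accumulate via repeated concatenation, as the current decoder
    # does: each += may reallocate and copy the entire buffer so far.
    buf = ''
    for chunk in chunks:
        buf += chunk
    return buf

def accept_bytes_join(chunks):
    # Accumulate in a list and collapse once: appends are O(1)
    # amortized, and ''.join() makes a single pass over the data.
    parts = []
    for chunk in chunks:
        parts.append(chunk)
    return ''.join(parts)

# Both strategies produce the same buffer; only the copy cost differs.
chunks = ['x' * 1000] * 100
assert accept_bytes_concat(chunks) == accept_bytes_join(chunks)
```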

I've been playing with refactoring it into a
"self._in_buffer_list.append(bytes)" and then collapsing on demand. But it
turns out that ''.join(self._in_buffer_list) then shows up as the hot spot
instead.

Right now I'm trying to decide whether this is just the way it has to be, or
whether the calling code is only consuming a portion of this buffer. Or
whether we are appending 2 bytes to a 10MB buffer, etc.

And when I say "hot spot" it is something like 13% of the total runtime is in
2000 calls to ''.join().

37% of the total runtime is spent in ReadVReader._next(), which amounts to 15%
spent in read_body_bytes (and probably the other ~30ish percent is breaking
that string back up into its constituent parts).

John
=:->



More information about the bazaar mailing list