[MERGE][1.6.1] 1.6 fetch regression

Matt Nordhoff mnordhoff at mattnordhoff.com
Fri Aug 29 08:47:56 BST 2008


John Arbash Meinel wrote:
> John Arbash Meinel wrote:
>> Sending again because I forgot to say "merge".
> 
>> John Arbash Meinel wrote:
>>> John Arbash Meinel wrote:
>>> ...
>>>> My best guess is that we are getting some really long buffered bytes, and the
>>>> cost of reallocating that is costing us a lot.
>> This ended up being a (the?) major factor.
>> Specifically, in the StatefulDecoder we would buffer up the bytes by doing:
> 
>> self._in_bytes += bytes
> 
>> Which would cause us to reallocate large (MB+) strings.
> 
>> And then in the readv parser, we would do:
> 
>>  data = data[consumed:]
> 
>> Which would cause us to downsize the strings. Notice, though, that it requires
>> 2x memory while doing it, because it has to create a new string, copy the
>> bytes over, and then destroy the original.
> 
>> My initial work on fixing the "input" side failed, because we peek many times
>> at the string while waiting for more bytes to come. But I've since worked
>> around that.
> 
>> The attached patch drops the time for "bzr branch bzr+ssh://localhost/bzr.dev"
>> from 1m26s to 39s.
> 
>> $ old
>> VmPeak:	  202736 kB
>> bzr -Dmemory branch  xxx  54.60s user 30.11s system 96% cpu 1:27.58 total
>> $ new
>> VmPeak:	  181136 kB
>>  33.64s user 0.84s system 91% cpu 37.640 total
> 
>> Notice the *huge* difference in "system" time.
> 
> 
> Attached is a slight update. I had left some "assert" statements in the code,
> and one other small test was failing.
> 
> I would also mention that for "bzr branch lp:mysql-server" it changed the time
> from 80+ minutes down to ~23minutes.
> 
> We are only really seeing this now because Robert's work made it easier for us
> to talk about more data at a time, and this code path is probably at least
> O(N^2) by length. (If you read data at fixed size pages you have to read L/C
> pages, and you reallocate the buffer each time.)
> 
> I've decided to go ahead and do a few bug fixes and get them into a 1.6.1
> release. There are 2 "critical" bugs that will also make it into 1.6.1.
> 
> https://bugs.edge.launchpad.net/bzr/+bug/261339
> and
> https://bugs.edge.launchpad.net/bzr/+bug/262333
> 
> I'm hoping to get some patches reviewed and 1.6.1rc1 out tomorrow.
> 
> John
> =:->

Trivial (as always), but:

> === modified file 'bzrlib/transport/remote.py'
> --- bzrlib/transport/remote.py	2008-07-25 03:12:11 +0000
> +++ bzrlib/transport/remote.py	2008-08-28 21:27:48 +0000

> @@ -312,14 +312,13 @@

>                                 fudge_factor=self._bytes_to_read_before_seek))
>  
>          try:
> +            # if relpath.endswith('.pack'):
> +            #     import pdb; pdb.set_trace()
>              result = self._client.call_with_body_readv_array(
>                  ('readv', self._remote_path(relpath),),
>                  [(c.start, c.length) for c in coalesced])
> @@ -332,18 +331,38 @@

Should the pdb bit have been left in?
-- 



More information about the bazaar mailing list