[MERGE][1.6.1] 1.6 fetch regression
Matt Nordhoff
mnordhoff at mattnordhoff.com
Fri Aug 29 08:47:56 BST 2008
John Arbash Meinel wrote:
> John Arbash Meinel wrote:
>> Sending again because I forgot to say "merge".
>
>> John Arbash Meinel wrote:
>>> John Arbash Meinel wrote:
>>> ...
>>>> My best guess is that we are getting some really long buffered bytes, and the
>>>> cost of reallocating that is costing us a lot.
>> This ended up being a (the?) major factor.
>> Specifically, in the StatefulDecoder we would buffer up the bytes by doing:
>
>> self._in_bytes += bytes
>
>> Which would cause us to reallocate large (MB+) strings.
>
>> And then in the readv parser, we would do:
>
>> data = data[consumed:]
>
>> Which would cause us to downsize the strings. Notice, though, that it requires
>> 2x memory while doing it, because it has to create a new string, copy the
>> bytes over, and then destroy the original.
>
>> My initial work on fixing the "input" side failed, because we peek many times
>> at the string while waiting for more bytes to come. But I've since worked
>> around that.
>
>> The attached patch drops the time for "bzr branch bzr+ssh://localhost/bzr.dev"
>> from 1m26s to 39s.
>
>> $ old
>> VmPeak: 202736 kB
>> bzr -Dmemory branch xxx 54.60s user 30.11s system 96% cpu 1:27.58 total
>> $ new
>> VmPeak: 181136 kB
>> 33.64s user 0.84s system 91% cpu 37.640 total
>
>> Notice the *huge* difference in "system" time.
>
>
> Attached is a slight update. I had left some "assert" statements in the code,
> and one other small test was failing.
>
> I would also mention that for "bzr branch lp:mysql-server" it changed the time
> from 80+ minutes down to ~23minutes.
>
> We are only really seeing this now because Robert's work made it easier for us
> to talk about more data at a time, and this code path is probably at least
> O(N^2) by length. (If you read data at fixed size pages you have to read L/C
> pages, and you reallocate the buffer each time.)
>
> I've decided to go ahead and do a few bug fixes and get them into a 1.6.1
> release. There are 2 "critical" bugs that will also make it into 1.6.1.
>
> https://bugs.edge.launchpad.net/bzr/+bug/261339
> and
> https://bugs.edge.launchpad.net/bzr/+bug/262333
>
> I'm hoping to get some patches reviewed and 1.6.1rc1 out tomorrow.
>
> John
> =:->
Trivial (as always), but:
> === modified file 'bzrlib/transport/remote.py'
> --- bzrlib/transport/remote.py 2008-07-25 03:12:11 +0000
> +++ bzrlib/transport/remote.py 2008-08-28 21:27:48 +0000
> @@ -312,14 +312,13 @@
> fudge_factor=self._bytes_to_read_before_seek))
>
> try:
> + # if relpath.endswith('.pack'):
> + # import pdb; pdb.set_trace()
> result = self._client.call_with_body_readv_array(
> ('readv', self._remote_path(relpath),),
> [(c.start, c.length) for c in coalesced])
> @@ -332,18 +331,38 @@
Should the pdb bit have been left in?
--
More information about the bazaar
mailing list