fast delta generation in brisbane-core - advice/direction needed
Ian Clatworthy
ian.clatworthy at internode.on.net
Sun Mar 8 02:54:10 GMT 2009
John Arbash Meinel wrote:
>> But I fear that delta generation will always be slow on large
>> trees like OOo if the algorithm remains:
>
>> 1. get inventory
>> 2. get previous inventory
>> 3. calculate changes.
>
>> Thoughts?
>
>> Ian C.
>
>
>
> So, I don't know what numbers you were seeing, but I'm seeing a rather
> good improvement. Perhaps the bigger issues were the recent bugs that we
> fixed with _iter_nodes and the hash layouts?
I hope so. I'll retest now that some of the bugs have been removed.
> 'time bzr log -v --short -r -10..-1 mysql-525'
>
> For a 1.9 format repo, that was taking 2.0s, for a gc-chk255-bigpage
> repository, I'm seeing it as 0.7-0.8s. And keep in mind that is with a
> 'bzr rocks' time of 0.35-0.4s.
>
> Changing that to "log -v --no-aliases" was 15s for the chk branch, and
> 2m15s for the 1.9 format.
Awesome! That's very promising.
> I'm a bit concerned that the numbers are too high for LeafNodes, so we
> may need to investigate if iter_changes() is doing a good job of not
> descending into nodes it doesn't need to. I know the iter_changes()
> code is a bit complex, and I wonder if we could use the hash properties
> to simplify it a bit. (The tree is going to be a lot flatter, so we
> don't need to try hard to match up sha1's at different depths.)
I don't understand when exclude_keys needs to be set. Otherwise, the
code *looks* correct, though it's non-trivial and could easily be
hiding subtle issues.
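To make the pruning idea concrete, here is a minimal sketch of a hash-based tree diff. It is not bzrlib's iter_changes(): the Leaf and Internal classes, flatten(), iter_changes_sketch() and the "visited" counter are all hypothetical names invented for this illustration. It assumes every node carries a sha1 of its content, and that the two trees are shallow enough that matching children by prefix at the same depth is sufficient.

```python
import hashlib


class Leaf:
    """Hypothetical leaf node: maps keys (e.g. file ids) to values."""

    def __init__(self, items):
        self.items = dict(items)
        payload = repr(sorted(self.items.items())).encode("utf-8")
        self.sha1 = hashlib.sha1(payload).hexdigest()


class Internal:
    """Hypothetical internal node: maps a key prefix to a child node."""

    def __init__(self, children):
        self.children = dict(children)
        payload = "".join(
            "%s:%s;" % (prefix, child.sha1)
            for prefix, child in sorted(self.children.items())
        ).encode("utf-8")
        self.sha1 = hashlib.sha1(payload).hexdigest()


def flatten(node):
    """Return every key/value pair stored at or below node."""
    if node is None:
        return {}
    if isinstance(node, Leaf):
        return dict(node.items)
    items = {}
    for child in node.children.values():
        items.update(flatten(child))
    return items


def iter_changes_sketch(old, new, stats):
    """Yield (key, old_value, new_value) for entries that differ.

    stats["visited"] counts the node pairs that had to be opened, i.e.
    the pairs whose sha1s did not match.
    """
    if old is not None and new is not None and old.sha1 == new.sha1:
        return  # identical subtree: prune without descending
    stats["visited"] += 1
    if isinstance(old, Internal) and isinstance(new, Internal):
        # Same depth on both sides: match children by prefix only.
        for prefix in sorted(set(old.children) | set(new.children)):
            for change in iter_changes_sketch(
                    old.children.get(prefix), new.children.get(prefix), stats):
                yield change
        return
    # Shapes differ, or we reached leaves: compare the flattened contents.
    old_items, new_items = flatten(old), flatten(new)
    for key in sorted(set(old_items) | set(new_items)):
        if old_items.get(key) != new_items.get(key):
            yield (key, old_items.get(key), new_items.get(key))
```

If the visited count stays close to the number of leaves that really changed, the pruning is working. A count near the total number of LeafNodes would suggest we're descending where we don't need to.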
Robert and I chatted on the phone the other day about this, and the
outcome was that I'd extend repodetails to collect data on the
number of uncommon nodes found per revision while walking history.
That way, you'd have more data to help select the optimal format.
The only catch is that I need to effectively write a clone of
iter_changes to get those numbers. Given the performance you're
seeing, though, I'm much less concerned than I once was about the
algorithm being broken.
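For what it's worth, the statistic itself is simple once the node keys are in hand. The sketch below is not repodetails code, and uncommon_node_counts() is a hypothetical name. It assumes each revision has already been reduced to the set of node keys (sha1s) reachable from its inventory root, which sidesteps the hard part of walking the trees. It also assumes "uncommon" means keys present on one side of a revision/parent pair but not the other.

```python
def uncommon_node_counts(history):
    """Count uncommon node keys for each revision against its predecessor.

    history is a list of (revision_id, set_of_node_keys), oldest first.
    Returns a list of (revision_id, only_in_new, only_in_old) tuples.
    The first revision is compared against an empty tree.
    """
    counts = []
    prev_keys = set()
    for rev_id, keys in history:
        only_in_new = len(keys - prev_keys)  # nodes this revision introduced
        only_in_old = len(prev_keys - keys)  # nodes this revision dropped
        counts.append((rev_id, only_in_new, only_in_old))
        prev_keys = keys
    return counts
```

Running this over the same history for each candidate format would give comparable per-revision numbers.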
Ian C.