[MERGE] fetch to --dev6-rr should not use deltas
John Arbash Meinel
john at arbash-meinel.com
Wed Apr 8 23:27:28 BST 2009
...
>
> Ok. The comment was pretty much undecipherable, which is why I changed
> the line and noted that gc doesn't use it. There are tests that test it
> is set to True. And setting it to False will cause a lot more network
> traffic on large files than we really need.
>
> We still need to do skinny deltas in groupcompress, and when we do, this
> will become more relevant still. The behaviour we want then is:
> - deltas are only sent when the fulltext was sent earlier.
>
> And that really has nothing to do with using deltas per se.
>
> We can set topological here. Or we can even set groupcompress ordering.
>
Or we can just tell InterDifferingSerializer to *always* use topological
ordering. I'm not completely convinced that always adapting in the
target is the best answer.

Consider that pack-0.92 and knit repositories already have a lot of code
to do efficient extraction of multiple fulltexts from a given
KnitDeltaClosure. That code even saves memory by sharing the lines
between different texts (at least until they are joined together to
create a single fulltext string).
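To illustrate the line sharing, here is a minimal sketch. It is not the
bzrlib code; the hunk format (start, end, new_lines) and the function
names are made up for the example. The point is only that each text
built from a delta chain can reuse the unchanged line objects of its
basis instead of copying them.

```python
def apply_delta(basis_lines, hunks):
    """Build a new list of lines from basis_lines plus a line delta.

    hunks is a hypothetical format: a list of (start, end, new_lines)
    tuples, sorted by start and non-overlapping, each replacing
    basis_lines[start:end] with new_lines.
    """
    result = []
    pos = 0
    for start, end, new_lines in hunks:
        # Slicing copies references, so unchanged lines are shared.
        result.extend(basis_lines[pos:start])
        result.extend(new_lines)
        pos = end
    result.extend(basis_lines[pos:])
    return result


def extract_fulltexts(fulltext_lines, deltas):
    """Return {key: lines} for every text in one delta chain.

    deltas is a list of (key, hunks); each delta is relative to the
    text produced by the previous entry (or to fulltext_lines).
    """
    texts = {}
    lines = fulltext_lines
    for key, hunks in deltas:
        lines = apply_delta(lines, hunks)
        texts[key] = lines
    return texts


base = ['a\n', 'b\n', 'c\n']
texts = extract_fulltexts(base, [
    ('rev-2', [(1, 2, ['B\n'])]),   # replace the second line
    ('rev-3', [(3, 3, ['d\n'])]),   # append a line
])
```

The memory is only duplicated once a caller does ''.join(lines) to get
a single fulltext string.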
> I suggest that what we should do is:
> - set groupcompress ordering
> - set use_deltas to True (which means that we will be sent deltas the
> source believes we can use).
> - change the rule for groupcompress ordering to include the
> requirement that deltas come after their basis always.
>
We can do that, though it is going to cost a lot more during the
'sort_groupcompress' stage. For a start, the groupcompress sort works
purely on the parent_map, and doesn't know about the build chain at that
time. We can teach it; I'm mostly just pointing out that the scope of
this change is more than a simple "change the sort algorithm".
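As a rough sketch of what "teaching it" would involve: given an order
that was computed from the parent_map alone, plus a second mapping from
each key to its delta basis, we can pull each basis forward so it is
emitted before anything built on it. The names here (basis_map,
order_with_basis_first) are hypothetical, not existing bzrlib API, and
the real sort would have to get the build-chain information from the
source index.

```python
def order_with_basis_first(ordered_keys, basis_map):
    """Reorder keys so every delta comes after its basis.

    ordered_keys: keys in the order the existing sort produced.
    basis_map: hypothetical {key: basis_key or None}; None means the
        text is stored as a fulltext.
    Keys keep their original relative order except where a basis has
    to be moved ahead of a delta that depends on it.
    """
    wanted = set(ordered_keys)
    emitted = set()
    result = []
    for key in ordered_keys:
        # Walk the build chain back until we reach a fulltext, a basis
        # that is not part of this fetch, or something already emitted.
        chain = []
        cur = key
        while cur is not None and cur in wanted and cur not in emitted:
            chain.append(cur)
            emitted.add(cur)
            cur = basis_map.get(cur)
        # Emit the chain basis-first.
        result.extend(reversed(chain))
    return result


# Newest-first order, as a reverse topological sort would give.
keys = ['rev-4', 'rev-3', 'rev-2', 'rev-1']
basis = {'rev-4': None, 'rev-3': 'rev-2', 'rev-2': None, 'rev-1': None}
new_order = order_with_basis_first(keys, basis)
```

Note the extra input: the sort can no longer be driven by the
parent_map alone, which is where the additional cost comes from.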
> The net effect of this will be:
> - fetching from pack repositories will be somewhat topological - we'll
> get long runs (delta length) in topological order, but these runs
> themselves will be grouped by fileid and triggered by traversing
> the revision ids in reverse topological order.
> - fetching from groupcompress repositories will be in optimal
> groupcompress order
>
> We can set use_deltas to False for 1.14, but we shouldn't leave it that
> way in bzr.dev.
>
> -Rob
My other big concern for groupcompress ordering is that it isn't
'stable', as it depends on what order things are yielded from a dict(),
which will depend on how big the dict is, what python version is used,
etc. So just because the source is in groupcompress order doesn't mean
that it will fetch in purely optimal order.
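For comparison, this is roughly what a sort that does not depend on
dict iteration order looks like. It is a plain parents-first
topological sort with a heap as the tie-breaker, not the groupcompress
sort itself, and stable_topo_order is a name invented for the example.
Whatever order the dict yields its items in, the result is the same.

```python
import heapq


def stable_topo_order(parent_map):
    """Parents-first order that is independent of dict iteration order.

    parent_map: {key: tuple of parent keys}. Parents that are not
    themselves keys of parent_map are ignored.
    """
    children = {key: [] for key in parent_map}
    pending = {}
    for key, parents in parent_map.items():
        present = [p for p in parents if p in parent_map]
        pending[key] = len(present)
        for parent in present:
            children[parent].append(key)
    # Keys with no pending parents are ready; the heap always hands
    # back the smallest one, which is what makes the result stable.
    ready = [key for key, count in pending.items() if count == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        key = heapq.heappop(ready)
        order.append(key)
        for child in children[key]:
            pending[child] -= 1
            if pending[child] == 0:
                heapq.heappush(ready, child)
    return order


graph_a = {'c': ('a', 'b'), 'b': ('a',), 'a': (), 'x': ()}
graph_b = {'x': (), 'a': (), 'b': ('a',), 'c': ('a', 'b')}
order_a = stable_topo_order(graph_a)
order_b = stable_topo_order(graph_b)
```

The price is the extra comparisons for the tie-break, on top of
whatever the sort already costs.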
The nice thing about using 'unordered' is that after running 'bzr pack',
or even just an auto-pack, you are guaranteed to get the results on a
group-by-group basis, which should be fairly optimal for both sides.
John
=:->