[MERGE] fetch to --dev6-rr should not use deltas
Robert Collins
robert.collins at canonical.com
Wed Apr 8 23:32:14 BST 2009
On Wed, 2009-04-08 at 17:27 -0500, John Arbash Meinel wrote:
> ...
> >
> > Ok. The comment was pretty much undecipherable, which is why I changed
> > the line and noted that gc doesn't use it. There are tests that test it
> > is set to True. And setting it to False will cause a lot more network
> > traffic on large files than we really need.
> >
> > We still need to do skinny deltas in groupcompress, and when we do this
> > will become more relevant still. The behaviour we want then, is:
> > - deltas are only sent when the fulltext was sent earlier.
> >
> > And it really has nothing to do with using deltas per se.
> >
> > We can set topological here. Or we can even set groupcompress ordering.
> >
>
> Or we can just tell InterDifferingSerializer to *always* use topological
> ordering. I'm not completely convinced that always adapting in the
> target is the best answer.
IDS won't trigger over the network; I think using IDS for this is a bug,
and want us to be working to eliminate it.
> Consider that when extracting from pack-0.92 and knits, they already
> have a lot of code to do efficient extraction of multiple fulltexts from
> a given KnitDeltaClosure. Such that even memory is saved by sharing the
> lines between different texts (at least until they are joined together
> to create a single fulltext string).
Sure, but it will still grab all of them, as I noted.
>
> > I suggest that what we should do is:
> > - set groupcompress ordering
> > - set use_deltas to True (which means that we will be sent deltas the
> > source believes we can use).
> > - change the rule for groupcompress ordering to include the
> > requirement that deltas come after their basis always.
> >
>
> We can do that, though it is going to cost a lot more during the
> 'sort_groupcompress' stage. As a start, the groupcompress sort works
> purely on the parent_map, and doesn't know about the build chain at that
> time. We can teach it, I'm mostly just mentioning that the scope of this
> change is not a simple "change the sort algorithm".
Right, I know, which is why I suggest we don't do this for 1.14.
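To illustrate why the sort needs more than the parent_map, here is a minimal sketch of the extra constraint. It is not bzrlib code: `basis_before_delta`, `order`, and `basis_map` are hypothetical names, and `basis_map` is assumed to map each key to its compression parent (or None for a fulltext). Given any existing ordering, it moves each basis ahead of the deltas built on it.

```python
def basis_before_delta(order, basis_map):
    """Reorder keys so that every delta comes after its basis.

    order: list of keys in the ordering we would otherwise use.
    basis_map: hypothetical dict of key -> compression parent, or None
        when the key is stored as a fulltext.
    """
    wanted = set(order)
    emitted = set()
    result = []
    for key in order:
        # Walk the build chain from this key back towards its fulltext,
        # stopping at anything already emitted or not being fetched.
        chain = []
        current = key
        while (current is not None and current in wanted
               and current not in emitted):
            chain.append(current)
            emitted.add(current)
            current = basis_map.get(current)
        # Emit the chain basis-first, so the delta always follows.
        result.extend(reversed(chain))
    return result
```

The parent_map alone cannot produce this result, because the compression parent of a text need not be one of its graph parents, which is the added cost John describes.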
>
> > The net effect of this will be:
> > - fetching from pack repositories will be somewhat topological - we'll
> > get long runs (delta length) in topological order, but these runs
> > themselves will be grouped by fileid and triggered by traversing
> > the revision ids in reverse topological order.
> > - fetching from groupcompress repositories will be in optimal
> > groupcompress order
> >
> > We can set use_deltas to False for 1.14, but we shouldn't leave it that
> > way in bzr.dev.
> >
> > -Rob
>
> My other big concern for groupcompress ordering is that it isn't
> 'stable', as it depends on what order things are yielded from a dict(),
> which will depend on how big the dict is, what python version is used,
> etc. Which means that just because the source is in groupcompress order,
> doesn't mean that it will fetch in purely optimal order.
>
> The nice thing about using 'unordered' is that after running 'bzr pack',
> or even just an auto-pack, you are guaranteed to get the results on a
> group-by-group basis, which should be fairly optimal for both sides.
I think it would be relatively easy to make it stable: use a list
rather than a dict to hold things, and sort() each group of things we
have to add to the queue.
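A rough sketch of what I mean, assuming a parent_map of key -> tuple of parent keys. This is not the existing sort_groupcompress code, and `stable_reverse_topo_order` is a made-up name; it only shows a list used as the queue, with each group sorted before it is added, so the output no longer depends on dict iteration order.

```python
def stable_reverse_topo_order(parent_map):
    """Return keys so that each key precedes its parents, deterministically.

    parent_map: dict of key -> tuple of parent keys. Parents that are
        not themselves keys of parent_map are ignored.
    """
    # Count, for every key, how many children still have to be emitted.
    pending_children = {key: 0 for key in parent_map}
    for parents in parent_map.values():
        for parent in parents:
            if parent in pending_children:
                pending_children[parent] += 1
    # Heads have no children. Sorting them removes the dependence on
    # the order in which the dict happens to yield its keys.
    queue = sorted(key for key, count in pending_children.items()
                   if count == 0)
    position = 0
    while position < len(queue):
        key = queue[position]
        position += 1
        ready = []
        for parent in parent_map[key]:
            if parent in pending_children:
                pending_children[parent] -= 1
                if pending_children[parent] == 0:
                    ready.append(parent)
        # Sort each group before it joins the queue, as suggested above.
        ready.sort()
        queue.extend(ready)
    return queue
```

Two parent maps with the same contents but built in a different insertion order give the same result, which is the stability property being asked for.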
-Rob