[MERGE] fetch to --dev6-rr should not use deltas

Robert Collins robert.collins at canonical.com
Wed Apr 8 23:01:34 BST 2009


On Wed, 2009-04-08 at 11:53 -0500, John Arbash Meinel wrote:
> Here is a small change that makes converting to --dev6-rich-root work
> (again) for me.
> 
> === modified file 'bzrlib/repofmt/groupcompress_repo.py'
> --- bzrlib/repofmt/groupcompress_repo.py        2009-04-08 16:33:19 +0000
> +++ bzrlib/repofmt/groupcompress_repo.py        2009-04-08 16:47:40 +0000
> @@ -972,7 +972,7 @@
>      # multiple in-a-row (and sharing strings). Topological is better
>      # for remote, because we access less data.
>      _fetch_order = 'unordered'
> -    _fetch_uses_deltas = True # essentially ignored by the
> groupcompress code.
> +    _fetch_uses_deltas = False # essentially ignored by the
> groupcompress code.
>      fast_deltas = True
> 
>      def _get_matching_bzrdir(self):
> 
> (sorry about the wrapping.) The code in question actually has a comment
> about this:
> 
>     # Note: We cannot unpack a delta that references a text we haven't
>     # seen yet. There are 2 options, work in fulltexts, or require
>     # topological sorting. Using fulltexts is more optimal for local
>     # operations, because the source can be smart about extracting
>     # multiple in-a-row (and sharing strings). Topological is better
>     # for remote, because we access less data.
>     _fetch_order = 'unordered'
>     _fetch_uses_deltas = False # essentially ignored by the
> groupcompress code.
> 
> 
> Groupcompress ignores deltas, so setting this flag only causes us to
> fetch data from non-gc formats in fulltexts.
> 
> So why do we need this?
> 
> 1) The 'adapter()' code uses the *target* repository to convert a delta
> into a fulltext.
> 2) If you fetch 'unordered', then you can get a delta for a file before
> you get the content of its parent.
> 
> So for conversions, we can either choose to set _fetch_order =
> 'topological', or we can set _fetch_uses_deltas = False. Since
> Groupcompress repositories ignore _fetch_uses_deltas, it seemed the
> safer thing to set.
> 
> Another possibility would be to change the code to buffer these objects
> and/or adapt them using the source, rather than the target. However, the
> streaming APIs pretty much declare that you have to finish the whole
> stream before you can request more data, so the length of time we
> would buffer is a bit long.

Ok. The comment was pretty much undecipherable, which is why I changed
the line and noted that gc doesn't use it. There are tests that check it
is set to True. And setting it to False will cause a lot more network
traffic on large files than we really need.

We still need to do skinny deltas in groupcompress, and when we do, this
will become even more relevant. The behaviour we want then is:
 - deltas are only sent when the fulltext was sent earlier.

That really has nothing to do with using deltas per se.
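
As a rough illustration of that rule (a sketch only; choose_record_kinds
and basis_of are made-up names for this example, not bzrlib API, and it
assumes a basis counts as "sent" once it has gone out in any form earlier
in the same stream):

```python
def choose_record_kinds(keys, basis_of):
    """Yield (key, kind) pairs for a stream sent in the given key order.

    kind is 'delta' only when the key's basis was already sent earlier
    in this stream; otherwise the sender falls back to 'fulltext'.
    basis_of is a hypothetical dict mapping key -> basis key (or absent).
    """
    sent = set()
    for key in keys:
        basis = basis_of.get(key)
        if basis is not None and basis in sent:
            kind = 'delta'
        else:
            kind = 'fulltext'
        sent.add(key)
        yield key, kind
```

With an unordered stream this degrades to fulltexts for anything that
arrives ahead of its basis, rather than producing a delta the target
cannot unpack.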

We can set topological here. Or we can even set groupcompress ordering.

I suggest that what we should do is:
 - set groupcompress ordering
 - set _fetch_uses_deltas to True (which means that we will be sent
   deltas the source believes we can use).
 - change the rule for groupcompress ordering to include the
   requirement that deltas come after their basis always.

The net effect of this will be:
 - fetching from pack repositories will be somewhat topological - we'll
   get long runs (delta length) in topological order, but these runs
   themselves will be grouped by fileid and triggered by traversing
   the revision ids in reverse topological order.
 - fetching from groupcompress repositories will be in optimal
   groupcompress order.
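
A minimal sketch of the "deltas come after their basis" requirement,
assuming a hypothetical basis_of mapping from each key to its compression
parent (basis_first and basis_of are names invented for this example, not
bzrlib API). It walks the keys in the preferred order and pulls each
key's not-yet-emitted basis chain in front of it:

```python
def basis_first(keys, basis_of):
    """Reorder keys so that every key follows its basis.

    keys is the preferred order (e.g. newest first); basis_of maps a key
    to its basis key, if it has one. A basis that is not part of this
    stream ends the chain, so that key would have to go as a fulltext.
    """
    wanted = set(keys)
    emitted = set()
    result = []
    for key in keys:
        chain = []
        cur = key
        # Follow the basis chain until we hit something already emitted
        # or something outside the stream.
        while cur is not None and cur in wanted and cur not in emitted:
            chain.append(cur)
            emitted.add(cur)
            cur = basis_of.get(cur)
        # The chain was collected newest-first; emit it oldest-first.
        result.extend(reversed(chain))
    return result
```

Feeding it keys in reverse topological order gives runs that are each
topological, one run per delta chain, which is roughly the shape
described above for fetching from pack repositories.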

We can set _fetch_uses_deltas to False for 1.14, but we shouldn't leave
it that way in bzr.dev.

-Rob
