backward diffs in knits?

Aaron Bentley aaron.bentley at utoronto.ca
Tue Apr 10 15:42:09 BST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Matthieu Moy wrote:
> Aaron Bentley <aaron.bentley at utoronto.ca> writes:
> 
>> Matthieu Moy wrote:
>> [...]
>>
>> So let's say our interval for snapshots is 10, and we're looking at a
>> range of 15 revisions.
>>
>> In 1, we have ABCDE
>> In 2, we have ABCDEFGH
> 
> That seems to me to be a very special case.

Not really.  It's just looking at one set of lines, and seeing how delta
direction affects them.  My case is roughly a subset of yours.

> Let's take another:
> 
> in N, we have X^N (that is, X repeated N times).
> A snapshot of "X" at 1
> An insert of "X" at n \in 2..10
> A snapshot of XXXXXXXXXXX at 11
> An insert of "X" at n \in 12..15

For a grand total of 1 + 9 + 11 + 4 = 25 Xs

>> For a backward delta, this produces

> A snapshot of X^15 at 15
> A snapshot of X^5 at 5
> A deletion at n \in 1..4, 6..14

For a grand total of 20 Xs.

But if you look at it, I think you'll find the advantage is all due to
revisions 6+.  If we pretend the changes stop at 5:

forward:
A snapshot of X at 1
An insert of X at n in 2..5
A snapshot of "XXXXX" at 10

For a grand total of 10 Xs

backward:
A snapshot of "XXXXX" at 15
A snapshot of "XXXXX" at 5
A deletion a n in 1..5

For a grand total of 10 Xs

So the problem with my example is not that it selects only a small
change, it's that it assumes an unrealistically short history.  Most
revisions are book-ended with snapshots, and for those revisions, the
savings apparent in the 6..15 case apply.

>> You can extend this example in either direction, and backwards deltas
>> retain their three-letter advantage (but the relative advantage decreases).
> 
> The relative advantage decreases _if_ you extend the example with
> no-ops. But if you extend it with insertion, the relative example
> should be constant (proportional to the snapshot interval).

Right.

>> It seems like there's some win, but I'm not sure whether it's enough to
>> justify a "repack" command.  
> 
> Probably not in itself.
> 
> Indeed, what I'd like to see is the equivalent of git's pack files,
> which would provide a way to get a fast initial checkout over a dumb
> protocol.

Well, bundles actually provide that, or something close.  The issues are

1. Performance
2. Command tweakage.  (pull supports it, branch does not.  But "init;
pull" is roughly equivalent to "branch")

> But that doesn't
> necessarily make the idea itself bad.

Certainly not.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGG6JB0F+nu1YWqI0RAgh9AJ0UwlOR2ByWqnOIpqzwk3hd+rs7LQCeJM5z
qE/bYb0vNc1h9zPSkM9Yrtk=
=Vu86
-----END PGP SIGNATURE-----



More information about the bazaar mailing list