backward diffs in knits?

Matthieu Moy Matthieu.Moy at imag.fr
Tue Apr 10 14:50:11 BST 2007


Aaron Bentley <aaron.bentley at utoronto.ca> writes:

> Matthieu Moy wrote:
> [...]
>
> So let's say our interval for snapshots is 10, and we're looking at a
> range of 15 revisions.
>
> In 1, we have ABCDE
> In 2, we have ABCDEFGH

That seems to me to be a very special case. Let's take another:

In N, we have X^N (that is, X repeated N times).

> All the other revisions are no-ops.
>
> For a forward delta, this produces
> A snapshot of "ABCDE" at 1
> An insert of "FGH" at 2
> A snapshot of "ABCDEFGH" at 11

A snapshot of "X" at 1
An insert of "X" at n \in 2..10
A snapshot of XXXXXXXXXXX at 11
An insert of "X" at n \in 12..15

> For a backward delta, this produces
> A snapshot of "ABCDEFGH" at 15
> A snapshot of "ABCDEFGH" at 5
> A deletion at 1

A snapshot of X^15 at 15
A snapshot of X^5 at 5
A deletion at n \in 1..4, 6..14
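
To put rough numbers on the X^N case, here is a small Python sketch. It
is only a toy cost model, not how knits actually store anything: it
counts the literal characters each scheme has to store, ignores all
delta metadata, and assumes the text only ever grows by appending (so a
backward delta is a pure deletion with no literal text). The function
names and the model itself are my assumptions for illustration.

```python
def forward_cost(lengths, interval):
    """Literal characters stored with forward deltas.

    lengths[i] is the text length at revision i + 1. Snapshots sit at
    revisions 1, 1 + interval, ...; every other revision stores only the
    text appended since its predecessor.
    """
    total = 0
    for i, length in enumerate(lengths):
        if i % interval == 0:
            total += length  # full snapshot
        else:
            total += max(length - lengths[i - 1], 0)  # appended text
    return total


def backward_cost(lengths, interval):
    """Literal characters stored with backward deltas.

    Snapshots sit at the last revision, last - interval, ...; every other
    revision is rebuilt from its successor, so it only stores text that
    it has and the successor lacks (nothing, for append-only growth).
    """
    last = len(lengths) - 1
    total = 0
    for i, length in enumerate(lengths):
        if (last - i) % interval == 0:
            total += length  # full snapshot
        else:
            total += max(length - lengths[i + 1], 0)  # text to restore
    return total


# Revision n holds X^n, for n in 1..15, with a snapshot interval of 10.
x_lengths = list(range(1, 16))
print(forward_cost(x_lengths, 10))   # snapshots at 1 and 11, inserts elsewhere
print(backward_cost(x_lengths, 10))  # snapshots at 15 and 5, deletions elsewhere
```

Under these assumptions the forward layout stores 25 characters (1 + 11
for the snapshots, plus 13 one-character inserts) and the backward
layout stores 20 (15 + 5 for the snapshots, nothing for the deletions).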

> You can extend this example in either direction, and backwards deltas
> retain their three-letter advantage (but the relative advantage decreases).

The relative advantage decreases _if_ you extend the example with
no-ops. But if you extend it with insertions, the relative advantage
should be constant (proportional to the snapshot interval).

> It seems like there's some win, but I'm not sure whether it's enough to
> justify a "repack" command.  

Probably not in itself.

Indeed, what I'd like to see is the equivalent of git's pack files,
which would provide a way to get a fast initial checkout over a dumb
protocol.

>> IIRC, CVS has the last revision as a fulltext snapshot, and the others
>> as backward delta.
>
> That matches what I've heard.  The fact that CVS does something is not a
> positive recommendation to me.

For sure ;-). I'm not recommending doing _exactly_ what CVS does, just
looking at it to get an idea, nothing more.

> CVS's approach does give you fast access to recent revisions, but
> because there are no intervening snapshots, old revisions are more
> expensive to retrieve. If a recent revision is damaged, you can lose
> access to all prior history.  And of course, this approach isn't
> append-only, which makes it more prone to corrupting old revisions.

100% agree. The way CVS did backward patching was not good (even
performance-wise: the fact that it did _only_ backward patching forced
it to rewrite the complete file each time). But that doesn't
necessarily make the idea itself bad.

-- 
Matthieu


