Mutating history in Subversion and Bazaar

David Allouche david at allouche.net
Thu Aug 31 18:40:17 BST 2006


Aaron Bentley wrote:
> David Allouche wrote:
>>> Since knits are delta-compressed and usable in append-only mode, you
>>> still have to apply deltas to extract user data, don't you?
> 
> Knits are not part of the model.  The model is full-tree snapshots, and
> knits are mainly an implementation detail.
> 
> This means that if you use a dumb fetcher that works at the model level,
> what I said above holds true.

Mh. Right, something like the initial full-text store. In that case the
model violation is not such a big deal pragmatically, because it gets
buried in the past. But of course, it's much better to avoid it :)

> But if you use a clever fetcher that works by slinging knit deltas
> around, then yes, it's conceivable to corrupt the knit.  Knits store
> sha1 hashes, so the corruption would be easy to detect.
> 
> I don't know whether we check sha-1's when copying deltas from one knit
> to another.  We could do that, or we could make sure that the sha-1 of
> the parent in the target knit matches the sha-1 of the parent in the
> source knit.  So for knits, the knit itself contains enough data to
> verify that you're not creating a version that cannot be constructed.
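The parent-hash comparison Aaron describes could look something like the sketch below. The function names and index layout are hypothetical, not bzrlib's actual API; the point is only that each knit already records enough sha-1 data to refuse a delta whose parent text differs between repositories.

```python
def verify_delta_transfer(source_index, target_index, version_id, parent_id):
    """Illustrative check, not bzrlib's real interface: before copying a
    delta for `version_id` (built against `parent_id`) from a source knit
    to a target knit, require that both knits record the same sha-1 for
    the parent.  The index arguments are plain dicts mapping version ids
    to sha-1 hex digests."""
    src_sha = source_index[parent_id]  # sha-1 recorded in the source knit
    tgt_sha = target_index[parent_id]  # sha-1 recorded in the target knit
    if src_sha != tgt_sha:
        raise ValueError(
            "parent %s differs between repositories: %s != %s"
            % (parent_id, src_sha, tgt_sha))
    return True
```

On a mismatch the fetch would abort before the target knit ever stores a version that cannot be reconstructed.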

I love this feeling of meeting of minds when I have design discussions
with you :)

>>> Wouldn't it be possible to catch such violations at fetch time? Maybe
>>> using some logic like:
> 
> [snip]
> 
>>> This logic is probably flawed in many ways, because I do not understand
>>> the knits storage model well, but I hope it helps convey my point.
> 
> I think it sounds pretty good.  Unfortunately, to be really sure there
> are no discrepancies between two repositories, you have to compare every
> common revision, because the discrepancy may have been introduced long in the past.

Pardon this analogy, but I think of that feature more as a condom than
as a blood test. If bzr prevents inconsistent data from getting mixed,
you can guarantee consistency by induction, all the way back through
history.

Of course, it would probably be useful to have an exhaustive validation
command (blood test), at least to diagnose problems and help hunt
down repositories with "bad" data.

>>> Violating the integrity of a distributed database is certainly not a
>>> nice thing to do, but I hope that we can find a way to control the
>>> splash damage enough to make transparent interoperability with other
>>> systems a reliable proposition.
> 
> I think we can prevent splash damage.  I'm not sure what we do when
> we've discovered that a revision's data is inconsistent.

What I would like bzr to do:

 * Cry loudly when it finds an inconsistency.
 * Refuse to mix inconsistent data in the same repository.

That would be enough to prevent undefined behavior and model violations.
After that, it's up to user communities to pick the "good" ancestry.
Outstanding work could be ported across using diff+patch or bzr graft.
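The two behaviors above could be sketched as a single fetch-time guard. This is a minimal illustration of the intended policy, not bzr's actual fetch code; the mapping of revision ids to recorded sha-1 digests is an assumed stand-in for the repository's index.

```python
import hashlib

def check_incoming(local_shas, revision_id, incoming_text):
    """Hypothetical fetch-time guard: if a revision we already hold
    arrives from another repository with different content, cry loudly
    and refuse to store it.  `local_shas` maps revision ids to sha-1
    hex digests already recorded locally."""
    digest = hashlib.sha1(incoming_text).hexdigest()
    recorded = local_shas.get(revision_id)
    if recorded is not None and recorded != digest:
        # Cry loudly and refuse to mix inconsistent data.
        raise RuntimeError(
            "inconsistent data for %s: refusing to mix" % revision_id)
    local_shas[revision_id] = digest
```

A repository guarded this way can still contain unrelated history fragments, but no revision id can ever map to two different texts within it.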

Of course, that would not prevent mixing various incompatible history
fragments in the same repository (because of ghosts), but the condom
would prevent a repository from becoming internally inconsistent.

-- 
                                                            -- ddaa
