Mutating history in Subversion and Bazaar

David Allouche david at allouche.net
Thu Aug 31 16:06:44 BST 2006


Aaron Bentley wrote:
> In order for it to be corrupt here, it needs to have a copy of C whose
> parent is B0.

Interesting, more on that below.

>  (But it doesn't need to be stored in a branch, just
> somewhere in the repository.)

Right, which makes it even worse. I only said "branch" in an attempt to
keep the story readable, which was apparently not entirely futile since
at least you managed to read it :)

> Having the old C would prevent the new C from being installed, so you'd
> be stuck with the original history.  We don't apply deltas to produce
> trees, so the storage wouldn't be corrupted in the way you're thinking.
>  In a way, it's worse than Arch, because you may *never* find out that
> there are two different versions of C running around.

Since knits are delta-compressed and usable in append-only mode, you
still have to apply deltas to extract user data, don't you?

I guess what you mean is that, since knit deltas are (presumably) not
reversible, changes in context cannot cause a failure to build the
document text.

>>> You only need to have pulled /at some point/ into your repository a
>>> branch that contains B0 to contaminate all the branches that use that
>>> repository, in ways that may not be immediately obvious. This problem is
>>> non-existent in Subversion because checkouts do not duplicate the
>>> repository.
> 
> It's not B0 that's the problem.  The meaning of B0 was never altered.
> The problem is that C's value has been changed, but the two Cs are
> indistinguishable.

That's a very interesting point. If I understand correctly, the problem
here is the installation of C based on B0 while C is defined relative
to B1.

Wouldn't it be possible to catch such violations at fetch time? Maybe
using some logic like:

 * When fetching a group of revisions, get the revision ids of the
parent revisions that are not part of the group.
 * For each of those revision ids, assert that the testament (or its
hash) in the source repository is identical to the one in the target
repository.
 * If the revision id is absent in both repositories, we are installing
a new ghost. Do the parent revision hash matching in individual knits.
When the parent text id in a knit is not present, just copy the
full-text revision (we are not generating any invalid data by doing so).
 * If the revision id is absent in source and present in the target, do
as in the previous case.
 * If the revision id is present in the source and absent in the target,
then we have a logic error because we should be fetching that parent as
well.
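
The steps above can be sketched roughly as follows. This is only an
illustration: each repository is modelled as a plain dict mapping
revision ids to testament hashes, and every name in it (check_fetch,
FetchError, ParentStatus, ...) is made up for the example and is not
bzrlib API. The per-knit checks on text parents are left out.

```python
from enum import Enum


class ParentStatus(Enum):
    MATCH = "testaments match"
    NEW_GHOST = "absent on both sides, copy full texts"
    GHOST_IN_SOURCE = "absent in source, copy full texts"


class FetchError(Exception):
    """Raised when the fetch must be refused."""


def external_parents(group, parents_of):
    """Return the parent ids referenced by the group but not part of it."""
    found = set()
    for rev_id in group:
        for parent in parents_of[rev_id]:
            if parent not in group:
                found.add(parent)
    return found


def check_fetch(group, parents_of, source, target):
    """Classify every external parent, or refuse the fetch.

    group      -- set of revision ids about to be fetched
    parents_of -- dict: revision id -> list of parent revision ids
    source     -- dict: revision id -> testament hash (hypothetical model)
    target     -- same, for the receiving repository
    """
    report = {}
    for parent in sorted(external_parents(group, parents_of)):
        in_source = parent in source
        in_target = parent in target
        if in_source and in_target:
            # Both sides know the parent: its meaning must be identical.
            if source[parent] != target[parent]:
                raise FetchError("testament mismatch for parent %s" % parent)
            report[parent] = ParentStatus.MATCH
        elif in_source:
            # The source has it, so it should have been part of the group.
            raise FetchError("parent %s should be fetched as well" % parent)
        elif in_target:
            report[parent] = ParentStatus.GHOST_IN_SOURCE
        else:
            report[parent] = ParentStatus.NEW_GHOST
    return report
```

Under this model, a fetch whose external parent has a different
testament hash on each side raises FetchError before anything is
installed, while ghosts are merely reported so the caller can fall back
to copying full texts.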

This logic is probably flawed in many ways, because I do not understand
the knit storage model well, but I hope it helps convey my point.

In summary: the fetcher should refuse to install a text revision or
revision object when a discrepancy can be found between the contents of
available parents. When no discrepancy can be found for lack of data
(because of ghosts) we can safely install the needed full texts, as this
preserves the meaning of the fetched data.

There should also be a converse sanity check when filling ghosts. That
might force us not to fetch all the available data in edge cases.

How does that sound?

> I still like this quote, though:
> 
> < ddaa> pmw: in a distributed system, decisions like that affect the
>         rest of the world. It's not about just shooting one's foot off.
>         It's about shooting the feet of everybody on the planet wearing
>         a specific brand of shoes.

Thank you, I love it when people quote me :)

Violating the integrity of a distributed database is certainly not a
nice thing to do, but I hope that we can find a way to control the
splash damage enough to make transparent interoperability with other
systems a reliable proposition.

-- 
                                                            -- ddaa
