reintroducing root ids

Sun Feb 12 22:54:55 GMT 2006

Robert Collins wrote:
> On Wed, 2006-02-08 at 14:39 +1100, Robert Collins wrote:
>> So, two concepts now - repo scaling and nested trees have +1s on using
>> root ids.
>>
>>
>> Now we need an algorithm to sanely introduce them to existing trees, and
>> generate them in new trees.
>>
>> I kinda like Aaron's 'add them when you commit with the parent being the
>> NULL_REVISION' concept.
>>
>>
>> Related to this is the root id to use in arch conversions - if you
>> convert two related branches independently we should end up with the
>> same root id.
>>
>>
>> I don't have a proposal to make, other than that we figure this out :)
> 
> Ok, based on discussions with Martin (on the phone) and other comments
> on list, how about the following proposal:
> 
> All ideas are other people's, all mistakes are mine ;)
> 
>  * Revisions get a root_id property.
>  * There is a branch format change to introduce this, and you cannot
> pull any data from a newer format branch which has this to an older
> branch that does not. [this trapdoor is needed to avoid data-losing
> loops].
>  * For existing revisions, as they are written into the new format
> branch, or on the fly if needed, we follow the leftmost ancestor all
> the way up.
>  * For baz conversions we can use the log files in the branch to help.
>  * When we detect the results from differing conversions, we take the
> value of the conversion that had more history, and rewrite the
> repository as needed. This should happen extremely rarely, and by taking
> the long view each time this will result in convergence.
>  * We accept that we are creating inventories and revisions that have
> the same id and different value in different repositories *in this
> specific case*, but as we can detect it and correct it, we do so. That
> is, if during a pull operation we see a revision we don't have but do
> have a common ancestor for, that has a root id which is the same as a
> revision that that repository does have, then we know that it's a
> converted revision from before root ids existed, and it should have the
> same root id as we did for the common ancestor. We can then review the
> graphs and pick which common ancestor should have won and either
> translate as we read from that repository, or rewrite ours to take their
> value.
>  * We write a 'reconcile' command to trigger manual reconciliation (i.e.
> if you don't want to merge or pull from a repo, how do you get this to
> converge). This would replace reweave and fetch-missing as a single
> repair tool.
>  * We special case 'diff' and 'status' with the following heuristic for
> all directories [well, maybe just the root, but hey]. When diffing A and
> B, if there is a directory path with id X in A, and X is missing from B,
> and a directory path with id Y in B, and Y is missing from A, treat X
> and Y as aliases to each other. Note that this is a variation on
> inventory id aliasing which is a more general solution, but post 1.0
> IMO.
> 
> 
> I can't think of any robust solution that does not involve either a
> forced, one time pull of ALL the data into a single repository for
> upgrading, or accepting that variation between these old revisions can
> occur. Manually specifying an id is not robust - mistakes lead to
> variation between revision representation; forced upgrades are not
> robust and are extremely unfriendly to the user.
> 
> Rob

I think you've thought it through thoroughly. +1

John
=:->
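
For anyone following along, here is a minimal sketch of the "follow the
leftmost ancestor all the way up" step from the proposal. It is not bzr
code: the parents dict, the function names and the "root-" prefix are all
hypothetical, and deriving the root id from the oldest leftmost ancestor
is an assumption about how the walk would be used.

```python
def leftmost_origin(parents, revision_id):
    """Follow the first (leftmost) parent until a revision with no known parents.

    `parents` is a hypothetical dict mapping revision id -> list of parent
    ids, with the leftmost parent first.
    """
    seen = set()
    current = revision_id
    while parents.get(current):
        if current in seen:
            raise ValueError("cycle in revision graph at %r" % (current,))
        seen.add(current)
        current = parents[current][0]
    return current


def derive_root_id(parents, revision_id):
    """Assumed scheme: name the root id after the oldest leftmost ancestor."""
    return "root-" + leftmost_origin(parents, revision_id)


# Two branches that share the origin revision "a" get the same root id
# under this scheme, even when each is examined on its own.
example_parents = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],  # merge: leftmost parent is "b"
}
same = derive_root_id(example_parents, "d") == derive_root_id(example_parents, "c")
```

Note that a merge whose leftmost parent comes from an unrelated line of
history would yield a different origin, which is the kind of variation the
proposal expects to detect and reconcile.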

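The diff/status heuristic in the last bullet of the proposal can be
sketched roughly as below. This is only an illustration: the inventory
representation (a dict of file id -> (path, kind)) and the function name
are hypothetical, and it assumes X and Y are paired up because they sit at
the same path in A and B.

```python
def directory_aliases(inv_a, inv_b):
    """Pair directory ids that exist on only one side but share a path.

    Each inventory is a hypothetical dict mapping file id -> (path, kind).
    Returns a dict mapping an id X from A to its alias Y from B.
    """
    # Directories in A whose id is missing from B, keyed by path.
    only_a = {
        path: file_id
        for file_id, (path, kind) in inv_a.items()
        if kind == "directory" and file_id not in inv_b
    }
    # Directories in B whose id is missing from A, keyed by path.
    only_b = {
        path: file_id
        for file_id, (path, kind) in inv_b.items()
        if kind == "directory" and file_id not in inv_a
    }
    return {only_a[path]: only_b[path] for path in only_a if path in only_b}


inv_a = {
    "root-1": ("", "directory"),
    "d1": ("src", "directory"),
    "f1": ("README", "file"),
}
inv_b = {
    "root-2": ("", "directory"),
    "d1": ("src", "directory"),
    "f1": ("README", "file"),
}
aliases = directory_aliases(inv_a, inv_b)
```

With the two example inventories, only the root directory differs in id,
so it is the only pair treated as aliases; "src" keeps its shared id and is
compared normally.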