[MERGE] Solve reconciling erroring when multiple portions of a single delta chain are being reinserted.

Robert Collins robertc at robertcollins.net
Mon Dec 3 19:17:34 GMT 2007


On Mon, 2007-12-03 at 13:01 -0600, John Arbash Meinel wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Robert Collins wrote:
> > This fixes a bug in reconcile for packs.
> > 
> > This big comment explains it:
> > # We have to topologically insert all texts otherwise we can fail to
> > # reconcile when parts of a single delta chain are preserved intact,
> > # and other parts are not. E.g. Discarded->d1->d2->d3. d1 will be
> > # reinserted, and if d3 has incorrect parents it will also be
> > # reinserted. If we insert d3 first, d2 is present (as it was bulk
> > # copied), so we will try to delta, but d2 is not currently able to be
> > # extracted because it's basis d1 is not present. Topologically sorting
> > # addresses this. The following generates a sort for all the texts that
> > # are being inserted without having to reference the entire text key
> > # space (we only topo sort the revisions, which is smaller).
> > 
> > However its damn tricky to get a good test for this together, and I've
> > actually given up doing so. (I tried before I wrote the code).
> > 
> > This code is executed by the test suite, but reverting the patch won't
> > make tests fail :(.
> > 
> > -Rob
> > 
> > 
> - -        ideal_index = repo._generate_text_key_index(self._text_refs)
> +        ancestors = dict([(key[0], tuple(ref[0] for ref in refs[0])) for
> +            _, key, _2, refs in
> +            self.new_pack.revision_index.iter_all_entries()])
> +        ideal_index = repo._generate_text_key_index(self._text_refs, ancestors)
> 
> ^- We should try to avoid using '_' as the dummy variable, since it interacts
> badly with pygtk and i18n.
>
> Especially since I *think* bzr-gtk will start failing after encountering one of
> these lines.

Garh. _ is embedded in my skull as a common python convention for
throwaway variables. Is there a bug filed on pygtk about this? 
^ James ?

> Also, statements like: tuple(ref[0] for ref in refs[0]) is an indication that
> the return value from iter_all_entries() may be a bit hard to understand.
> 
> There wasn't any way that you could create a test which built up a few texts,
> and then processed them out of order? Is it just that the sort order is
> randomized (like through a dict or set)?

You have to have texts that are broken in the particular way described;
and *all* our code is engineered now to prevent that happening. In
particular commit builder is careful to get the parents right, and won't
use parents from a missing parent revision, both of which are required.
So you have to insert the entire set of things from first principle.

> ...
> - -        # 3) bulk copy the ok data
> +        # 4) bulk copy the ok data
>          list(self._copy_nodes_graph(ok_nodes, text_index_map,
>              self.new_pack._writer, self.new_pack.text_index))
> - -        # 3) adhoc copy all the other texts.
> +        # 4) adhoc copy all the other texts.
> 
> ^- you update the numbers, but you seem to have updated them to the same thing.
> Shouldn't it be either 3 & 4 or 4 & 5?

rofl. Oops :).

Thanks,
Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20071204/8c78d8eb/attachment.pgp 


More information about the bazaar mailing list