[MERGE] Reconcile can fix bad parent references

Robert Collins robertc at robertcollins.net
Tue Aug 28 01:05:38 BST 2007


On Sun, 2007-08-26 at 20:58 -0400, Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1


> > This idiom:
> > Won't work with packs, because there is no write group. If you insert
> > repo.lock_write()
> > repo.start_write_group()
> > ...
> > repo.commit_write_group()
> > repo.unlock()
> > 
> > around your manual data insertion it will work fine.
> 
> I can do that, but I thought we wouldn't need this for packs.

In 'user space' we don't have to. That is for users of the modifying
primitives:
fetch
CommitBuilder
reconcile
check
gc
pack

However, folk getting right down under the hood and manually inserting
fragments of data do need to do this.


> > A note here that the revisions are checked first so that XXXX would be
> > good. I'm guessing that the reason is so that the data in the revision
> > index is trustable.
> 
> It is so that the revision_versions dict is pre-populated.

Cool - please note that so noone reverts this in future.


> > 
> > What do the changes to plan_revisions accomplish?
> 
> They reduce silly conversions.  In the old code, we start with a list,
> convert it to a set, and then convert it back to a list.
> 
> This just leaves it as a list.

Ok.

> > This uses the nodes-of-file graphs are nodes-of-revision graph property
> > that John and I are debating whether we can use - because the edges are
> > different.
> 
> > Until we decide that that will give identical results we
> > should stay with the per file graph.
> 
> If I didn't know the knit data was untrustworthy, I could use that.  But
> I don't see how I can use corrupt parents data to repair itself.  I
> can't even know if I'm iterating in topological order...

Right. I've done a proof that I think is complete, and John seems to
agree. So we're good to go with this. (Of course, if
inventoryentry.revision is borked - if we have not added the right nodes
at the right time to tie the two graphs together properly, then the
proof becomes invalid; so in a strict sense we should check that too.)
But I don't think that that is necessary for this patch - unbreaking
pull is our immediate concern.


[checking by weaves]
> I thought for packs this would be an irrelevant issue, because we'd use
> a fetch that won't tolerate this kind of brokenness.

mmm. I have a fetch that won't tolerate this brokeness, but some caveats
apply:
 - fetching from a knit repo (e.g. upgrade) still tolerates it
 - errors of this sort are not detected during fetch.

I'm not entirely sure where to fix this best.

> > Is :
> > 
> >> +class _RevisionParentsProvider(object):
> > 
> > Really needed? If we check the revision is cached correctly at the start
> > of check
> 
> We would also need to do the same in reconcile, and make sure that they
> never get out of sync.  I thought it was just simpler to use the data
> that we actually trust, instead of a proxy.

Thats fair enough. I'm concerned about performance/memory I guess. If we
are accessing the real revisions earlier in the process it seems to me
that we should have the parent details cached - and the revision index
is such a cache.

Reconcile does not check for parents because we don't know of any way
they can be wrong in a knit format repository today.

> > I think its better to check for an attribute like I did in my recent
> > reconcile patch - _reconcile_does_inventory_gc is what I used. This is
> > better than a subclass check because it allows foreign formats to be
> > tested without requiring them to spuriously subclass Knits.
> 
> As we discussed it, only knits would support this, so I check for knits.

Well, I'm not saying 'packs will support it', I'm saying that foreign
formats may have the same error in them, and its cleaner to check for
the specific aspect that drives the problem rather than the class.

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20070828/91ebed93/attachment-0001.pgp 


More information about the bazaar mailing list