iter_changes, delta consistency and revert

Martin Pool mbp at canonical.com
Fri Aug 21 03:29:42 BST 2009


2009/8/4 Robert Collins <robertc at robertcollins.net>:
> So, I'm trying to land this iter_changes change. Theres one failure that
> has me concerned. I've attached the diff for it below.
>
> Now, its a bit hard to read like this, or even in context, so let me
> make it more obvious:
>
> bzr init
> mkdir -p a/d/e
> mkdir f
> touch a/b a/c a/e f/g f/h f/i
> bzr add
> bzr commit -m 1
> bzr mv a/b f/b
> bzr mv a/d/e f/e
> bzr mv f/g a/g
> bzr mv f/h h
> bzr mv f j
> bzr st a
>
> Now, the last command before my patch does this:
> $ bzr st a
> renamed:
>  a/b => j/b
>  a/d/e/ => j/e/
>  f/g => a/g
>
> but after applying the patch:
> $ bzr st a
> renamed:
>  a/b => j/b
>  a/d/e/ => j/e/
>  f/ => j/
>  f/g => a/g
>
> Now, you may be wondering why 'f->j' is in the status output. Its there
> because without it the path of j/b, if applied to an inventory delta,
> would be wrong. This matter for dirstate and other path based storage
> systems. It also matters for humans, if we say we commit j/b, but
> actually commit a/b, thats really confusing - at best.
>
> The problem turns up in revert though. Revert uses iter_changes to
> determine what to revert, and the inclusion of j in the output causes it
> to revert the f->j rename when 'bzr revert a' is invoked. At a human
> level, I would in fact expect 'revert a' to grab the things I'd renamed
> out of a and put them back, and not to touch other things that look like
> they are incidentally connected.
>

> There is some friction here:
>  - revert wants 'enough changes to put specified things back the way
>   they were'
>  - commit and merge want 'enough changes so that the output tree looks
>   like the user expects and isn't corrupted'
>  - status wants to 'show what commit will do'
>  - I think revert is using iter_changes in a symmetrical way, but
>   inventory deltas built on iter_changes are not symmetric, and the
>   iter_changes work I've been doing is likewise not symmetric.
>   Specifically tree.iter_changes(basis, specific=foo) !=
>   basis.iter_changes(tree, specific=foo), where expansion for
>   consistency is concerned.
>
> I have thought of of a few options:
>  - inspect the iter_changes output inside revert, and discard things
>   that are not needed.
>  - make consistent expansion an optional flag to iter_changes and have
>   revert use that.
>
> However, we've had a number of bugs (going from memory) to do with
> revert and dirstate; I'm not at all sure that this is a coincidence -
> the very bug I'm fixing, to ensure consistent deltas, may well prevent
> those bugs in revert.
>
> I'd rather not have more flags to iter_changes, I think being
> conservative and expanding as needed is the safest option. I think
> revert is well placed as a special-case (it doesn't need consistency in
> output, as its applying things in reverse) to manually filter the
> transform it uses back down - by using a simple check that one of the
> paths of each item it gets are inside the specified paths.

> For now, I've made the failing test a KnownFailure, and would like to
> land the changes. This is the bulk of the work, and can be tuned in
> various directions quite easily.

I think getting the other changes in for 2.0 is quite desirable and it
would be ok to put in a KnownFailure for some cases that aren't
handled ideally yet. Revert in this case is not very common and
although you can debate what would be best, the behaviour your patch
gives is at least reasonable - in other words the bug that the rename
is reverted would be Medium, not High or Critical.

I think that "bzr status $a" should show you what "bzr commit $a" will
do, and it's not unreasonable for "bzr revert $a" to do precisely the
opposite of that.

That said iter_changes is becoming an important API and concept so
it's worth being clear about just how it's meant to work.

I really dislike the idea of iter_changes giving data that's almost
but not always safe to reverse.  However, if it would be much more
expensive to make sure this is true, then I suppose we can handle it
by documenting it should never be done.

Is there a good reason why revert needs to reverse the delta by hand?
I might have written it that way but given the state of the code now
it'd seem better to get the delta from wt to basis and apply that.

> === modified file 'bzrlib/tests/blackbox/test_revert.py'
> --- bzrlib/tests/blackbox/test_revert.py        2009-03-23 14:59:43
> +0000
> +++ bzrlib/tests/blackbox/test_revert.py        2009-08-04 00:59:44
> +0000
> @@ -96,7 +96,9 @@
>         self.failUnlessExists('a/b')
>         self.failUnlessExists('a/d')
>         self.failIfExists('a/g')
> -        self.failUnlessExists('j')
> +        self.expectFailure(
> +            "j is in the delta revert applies because j was renamed
> too",
> +            self.failUnlessExists, 'j')
>         self.failUnlessExists('h')
>         self.run_bzr('revert f')
>         self.failIfExists('j')

This is a classic case where either the KnownFailure or a comment near
it needs to point to the bug, otherwise the comment is too cryptic for
someone wanting to fix it or know what the right behaviour is meant to
be.

-- 
Martin <http://launchpad.net/~mbp/>



More information about the bazaar mailing list