how to revert update operation

Thu Jul 20 04:12:24 BST 2006

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> On Wed, 2006-07-19 at 09:23 -0400, Aaron Bentley wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Robert Collins wrote:
>>> On Tue, 2006-07-18 at 12:17 -0400, Aaron Bentley wrote:
>>>> It should be illegal for wt.basis_tree() to be a non-ancestor of
>>>> wt.branch.basis_tree(), so that part of the comment seems spurious.
>>>> Similarly, wt.basis_tree() is an ancestor of wt.
>>>
>>> Not at all. Take a branch, make two lightweight checkouts, commit in on,
>>> update in the other, uncommit while you are there, then do a new commit.
>>> Now go to the other light checkout, and the tree's basis_tree is not an
>>> ancestory of the branch's basis_tree.
>> Yes, and you are now in an illegal state that we explicitly don't
>> handle.  Uncommit is a power tool, and it's possible to cut your fingers
>> off using it.
>>
>> If you try to preserve the ancestry for working tree, you're now
>> starting to treat the working tree like a branch.
> 
> Uh, we must be miscommunicating here. I'm not trying to preserve the
> ancestry of the working tree. But the code in update() is designed to
> accomodate this state - I dont consider it illegal - though its
> certainly a corner case. Its true that not all our code handles
> uncommits gracefully - but that does not mean we should not try.

You are trying to preserve the text of the working tree commits.  If
you're not setting wt.last_revision() as a pending merge, I don't see
why not-- it would be consistent with what you're doing here.

>>>> Furthermore, it's not at all clear to me that the user wants the
>>>> branch.last_revision() to be merged with the master.  If they have a
>>>> bound branch, and their working tree is out of date with the branch,
>>>> this suggests that someone else has committed or pushed into the branch.
>>>> They may not be aware of these new revisions, and they certainly
>>>> wouldn't have reviewed them, since they're not the current tree state.
>>>
>>> For shared branches, the 'review before I accept it' workflow is not
>>> present. If users want to do that they should not be using a shared
>>> branch.
>> There is a big difference between applying the changes that are present
>> in the master branch, and applying the changes that are present in the
>> local branch.  They are two different locations, and may have different
>> access controls.
>>
>> Users don't mean 'apply the local branch changes' when they update, they
>> mean 'apply the master branch changes'.  So it is surprising behaviour
>> to apply the local branch changes.
> 
> Uh, the local branch changes are the changes they have done. 

They may not be changes that they have done, or they may be changes that
they did not intend to include, as shown in my examples below.

> They
> certainly do -not- mean 'discard my work'. So we have to preserve the
> work.

They *may* mean 'discard my work'.  This is why I say user intent is
unclear.

>> I can't think of a good reason why both the checkout and the local
>> branch would be out of date at the same time.  But here are some bad
>> reasons:
>>
>> $ bzr unbind
>> # --local hasn't been implemented for uncommit, but AIUI, the plan is to
>> # do so.
>> $ bzr uncommit --local
>> $ bzr bind foo
>> $ bzr update
>>
>> It's pretty clear that the user didn't want to commit those changes to
>> the master branch, so in this scenario, applying the local branch
>> changes is The Wrong Thing.
> 
> UHHHH. --local has no effect after an unbind.

- --local is not implemented for uncommit.

Currently, when you have a checkout, uncommit acts on both the
wt.last_revision and the branch.revision_history.

When I the change to make uncommit affect wt.last_revision, it was
suggested that we should also provide uncommit --local, to adjust only
the wt.

I suppose update -r -2 (in an unbound branch) would have the same effect.

>> Here's another:
>>
>> $ cp -r foo bar
>> $ cd bar
>> $ bzr commit -m 'created new branch'
>> $ cd ../foo
>> $ bzr update
>>
>> In this scenario, the user intended to create a new branch, but foo is
>> actually a lightweight checkout, so bar is a checkout of the same branch.
>>
>> Again, applying the local changes is The Wrong Thing.
> 
> This is what update is *intended* to do. If you have two working
> instances of the same branch, and commit on one, update is *meant* to
> bring across the commits from the other. Its what CVS and svn
> shared-branch workflow delivers. 

In my experience, this is nothing like CVS.  CVS/SVN do a three-way
update, not a four-way one.  This may be your extrapolation of CVS/SVN
to this situation, but that's not the same thing.  CVS never provided a
way to check a branch out into another branch, only into a checkout.

> If you dont want that, dont use update.
> (seriously!). 

But then, how do I get changes from the master?  That's all update does
99.9% of the time, and it's all I'm expecting update to do.

> As for the user not getting a new branch - they had to
> take an explicit action to get a 'checkout', and for *all* our
> checkouts, 'cp -r foo bar' will preserve the fact that it is a checkout.

Yeah, but checkouts look very similar to standalone working trees.  I am
saying a mistake like this is possible.

> This is desired - its what users of CVS and svn and arch expect a copy
> of a checkout to do.

It is not what a Darcs user would expect cp to do.  Both types of users
exist.  Hence the possibility for mistakes.

> Well, this is more interesting. It suggests to me that update should be
> showing much more info about the nature of the local changes - perhaps
> an automatic status output after the update or something. But this
> vector exists *without* the push:
> A cracker does:
> 'scp hacked-file.c example.com:myproj/'

True, but that case is much easier to detect.

> 
> An honest user does
> $ ssh example.com
> $ cd myproj
> $ bzr update
> $ bzr commit -m 'add my latest harmless changes'
> 
> And has now committed that malicous code.
> 
> So I dont see update making the vector better -or- worse.

I see it making the attack harder to detect, and therefore worse.

I suppose an even nastier approach would be to put bogus revisions into
the user's local repository which appeared to be the revisions that
update would apply.

>> On the other hand, we can't just automatically do a double merge,
>> because update is a destructive operation, and so it's very bad to guess
>> wrong.
> 
> update should never be destructive. 

Update is destructive, whether or not you think it should be.  It is
intended to combine your local, uncommitted changes with the mainline
changes.

Once the changes have been combined, they cannot reliably be
disentangled.  You can't get your working tree back to the state it was
in.  Destructive change, as Alexander discovered.

> There is no need to guess user intent: using a
> shared branch is a deliberate end user choice which brings in the
> workflow.

Using a shared branch is fine, but it doesn't naturally result in
situations where the checkout is out-sync-with the branch AND the branch
is out of sync with the master branch.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEvvSX0F+nu1YWqI0RAmDNAJ9V1c0LNB0yhjzC8O01gaE+q5KIQgCdHkfw
t21nxKJ2UqYxNJWCQSqyAG8=
=95dI
-----END PGP SIGNATURE-----