how to revert update operation

Thu Jul 20 03:25:03 BST 2006

On Wed, 2006-07-19 at 09:23 -0400, Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Robert Collins wrote:
> > On Tue, 2006-07-18 at 12:17 -0400, Aaron Bentley wrote:
> >>It should be illegal for wt.basis_tree() to be a non-ancestor of
> >>wt.branch.basis_tree(), so that part of the comment seems spurious.
> >>Similarly, wt.basis_tree() is an ancestor of wt.
> > 
> > 
> > Not at all. Take a branch, make two lightweight checkouts, commit in on,
> > update in the other, uncommit while you are there, then do a new commit.
> > Now go to the other light checkout, and the tree's basis_tree is not an
> > ancestory of the branch's basis_tree.
> 
> Yes, and you are now in an illegal state that we explicitly don't
> handle.  Uncommit is a power tool, and it's possible to cut your fingers
> off using it.
>
> If you try to preserve the ancestry for working tree, you're now
> starting to treat the working tree like a branch.

Uh, we must be miscommunicating here. I'm not trying to preserve the
ancestry of the working tree. But the code in update() is designed to
accomodate this state - I dont consider it illegal - though its
certainly a corner case. Its true that not all our code handles
uncommits gracefully - but that does not mean we should not try.

> >>Furthermore, it's not at all clear to me that the user wants the
> >>branch.last_revision() to be merged with the master.  If they have a
> >>bound branch, and their working tree is out of date with the branch,
> >>this suggests that someone else has committed or pushed into the branch.
> >> They may not be aware of these new revisions, and they certainly
> >>wouldn't have reviewed them, since they're not the current tree state.
> > 
> > 
> > For shared branches, the 'review before I accept it' workflow is not
> > present. If users want to do that they should not be using a shared
> > branch.
> 
> There is a big difference between applying the changes that are present
> in the master branch, and applying the changes that are present in the
> local branch.  They are two different locations, and may have different
> access controls.
> 
> Users don't mean 'apply the local branch changes' when they update, they
> mean 'apply the master branch changes'.  So it is surprising behaviour
> to apply the local branch changes.

Uh, the local branch changes are the changes they have done. They
certainly do -not- mean 'discard my work'. So we have to preserve the
work.

> I can't think of a good reason why both the checkout and the local
> branch would be out of date at the same time.  But here are some bad
> reasons:
> 
> $ bzr unbind
> # --local hasn't been implemented for uncommit, but AIUI, the plan is to
> # do so.
> $ bzr uncommit --local
> $ bzr bind foo
> $ bzr update
>
> It's pretty clear that the user didn't want to commit those changes to
> the master branch, so in this scenario, applying the local branch
> changes is The Wrong Thing.

UHHHH. --local has no effect after an unbind. And the only command in
this sequence that would push changes to the master is the 'bzr bind
foo' command, which I consider a huge defect in 'bzr bind'. I dont see
the connection to update here.

> Here's another:
> 
> $ cp -r foo bar
> $ cd bar
> $ bzr commit -m 'created new branch'
> $ cd ../foo
> $ bzr update
> 
> In this scenario, the user intended to create a new branch, but foo is
> actually a lightweight checkout, so bar is a checkout of the same branch.
> 
> Again, applying the local changes is The Wrong Thing.

This is what update is *intended* to do. If you have two working
instances of the same branch, and commit on one, update is *meant* to
bring across the commits from the other. Its what CVS and svn
shared-branch workflow delivers. If you dont want that, dont use update.
(seriously!). As for the user not getting a new branch - they had to
take an explicit action to get a 'checkout', and for *all* our
checkouts, 'cp -r foo bar' will preserve the fact that it is a checkout.
This is desired - its what users of CVS and svn and arch expect a copy
of a checkout to do.

> Here's another:
> A cracker does:
> $ bzr commit -m "add malicious code"
> $ bzr push sftp://example.com/home/some_honest_user/myproj
> 
> An honest user does
> $ ssh example.com
> $ cd myproj
> $ bzr update
> $ bzr commit -m "Add my latest, harmless changes"
> Generating OpenSSH signature
> Please enter your passphrase:
> Committed revno 76
> 
> The user knows that all the commits on the master branch are signed, and
> he has a checking rule, so he is confident that update can never
> introduce unsigned code into his tree.  He is wrong, and the cracker has
> now caused a signed commit containing malicious code to be generated.
> 
> Again, The Wrong Thing.

Well, this is more interesting. It suggests to me that update should be
showing much more info about the nature of the local changes - perhaps
an automatic status output after the update or something. But this
vector exists *without* the push:
A cracker does:
'scp hacked-file.c example.com:myproj/'

An honest user does
$ ssh example.com
$ cd myproj
$ bzr update
$ bzr commit -m 'add my latest harmless changes'

And has now committed that malicous code.

So I dont see update making the vector better -or- worse.

> So, whether the vector is uncommit, cp -r or push, auto-applying the
> local branch changes can be the wrong thing.
>
> >>So, I think if the master branch is defined, and the wt.last_revision()
> >>!= branch.last_revision(), you're in an error condition.  We should
> >>explain the situation to the user, and get them to run either update
> >>- --checkout, (update checkout to the last revision in the branch), or
> >>update --branch...
> > 
> > 
> > Well, we could do that, but I think its just exposing machinery to the
> > user for no good reason.
> > Perhaps I'm missing something?
> 
> My point is that this is a strange situation to encounter.  I don't
> think we can know what the user intent is, but if I had to guess, I
> would uncommit the local branch changes, and apply the master changes,
> because I think that is what the user is expecting.
> 
> But since uncommit is dangerous, we can't do that.
> 
> On the other hand, we can't just automatically do a double merge,
> because update is a destructive operation, and so it's very bad to guess
> wrong.

update should never be destructive. I have use cases for all the
scenarios I've outlined. There is no need to guess user intent: using a
shared branch is a deliberate end user choice which brings in the
workflow.

> Finally, there's the issue of conflicts.
> 
> I don't think it's valid to perform two operations in a row when each
> may produce conflicts.  But I'll address that in our "Status should not
> special case conflicts" discussion.
> 
> What I will say here is that, if we assume it's valid to do a merge when
> we already have conflicts, it's still not a nice thing to do.  Conflicts
> are often confusing, and it's nicest to deal with them in bite-sized
> pieces.  Conflicts on top of conflicts is even more confusing, and, in
> some cases, impossible to represent with out current conflict types.
> 
> So, if the first update will produce conflicts, I think it's nicer to
> stop and ask the user to resolve the conflicts, before doing a second
> merge operation.

This is possibly a really good approach: a two phase update, where if we
have conflicts we stop, and say 
'conflicts detected updating to local branch tip, please resolve the
conflicts, and then run update again to get the new commits from the
shared branch.'

Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 191 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060720/91804f2d/attachment.pgp