Confusion over locations

Barry Warsaw barry at canonical.com
Thu May 15 16:14:03 BST 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On May 14, 2008, at 7:06 PM, Aaron Bentley wrote:

> Barry Warsaw wrote:
>> Aaron went on to explain that afahk,
>> submit_branch was used by merge, send, pqm-submit, '-r submit:' and an
>> internal review-submit plugin.
>
> I suppose I should also add that for at least 4 of these,
> submit-location is used due to my influence :-)

:)

>> 'help merge'
>> does say: "If neither is specified, the default is the upstream branch
>> or the branch most recently merged using --remember".  Well, I would
>> have thought the "upstream branch" would be the parent branch by
>> default, or my-shared-repo in the example above.
>
> merge previously did use the parent branch, and when I changed it, I
> should have updated that documentation.  My apologies.

Ah cool, np.

>> Further, it's confusing to me that 'bzr merge' uses submit_branch but
>> 'bzr pull' uses parent_branch.  In my own mental model of Bazaar, these
>> two commands would calculate their default locations in exactly the same
>> way.  Why is my model wrong?
>
> Merge and pull are designed for two different relationships.
>
> Pull is for mirrors-- a relationship where one branch is a copy of
> another.  A good example would be my local mirror of bzr.dev.  You
> should always pull into a mirror, never merge into it.  If you updated
> by merging, then you'd have to commit, and then you'd have a diverged
> branch, not a mirror.
>
> Merge is for diverged branches.  A good example would be my "bzr.ab"
> branch, where I put fixes too small to merit their own branch.  You
> should always merge into a branch, never pull into it.  If you pulled
> into it and it succeeded, your local branch would become a mirror.  You
> would lose your local point-of-view, so that log would no longer prefer
> the revisions you'd committed, and their revnos would become dotted
> revnos.
>
> Can a branch be in both relationships at the same time?  Absolutely.  A
> good example would be "bzr.ab", before I worked at Canonical.  I had a
> copy on my computer at home, and a copy on my computer at work.  I would
> use "pull" to mirror my changes from one computer to another.  And I
> would use "merge" to integrate the latest changes from (my local mirror
> of) bzr.dev.  I think this is not an uncommon situation.
>
> So that is why "pull" and "merge" do and should have different locations.

That all makes sense.  I have a local mirror branch of upstream and
all I ever do with it is pull from its remote parent.  The pull always
seems to DTRT.

What about a situation where you want to synchronize your divergent  
branch with your local upstream mirror?  Clearly you want to do a  
merge because your branch has diverged, but what is the most sensible  
default branch to merge from?  This is a very common use case for me,  
and overwhelmingly, I want to merge from the divergent branch's  
parent, which is my local upstream mirror.  I call this the "hack on  
my own stuff" use case.

A second common use case for me is where I branch from my local  
upstream mirror, then merge a completely different remote branch into  
it.  Although I'll rarely do subsequent merges, when I do, I almost  
always want to merge from the remote branch again, not the local  
mirror.  I call this the "look at someone else's code" use case.

The difference in these two cases is that in the former, the very  
first merge I do will have no explicit branch argument, because I'm  
assuming/hoping it will automatically come from the parent branch.   
Here, I'll merge upstream fairly often to resolve any conflicts that  
have landed in the upstream branch by other developers.  My workflow  
in this use case is something like "branch upstream; hack; commit;  
hack; commit; push to remote divergent mirror; hack; commit; push;  
merge from local upstream mirror; resolve; commit; hack; commit; push".

In the latter case though, the first merge happens almost immediately,  
to get the remote branch I want to look at.  I will always give merge  
an explicit url, because obviously there's no way for it to know  
implicitly the branch I want.  In this case, subsequent merges are  
very rare, but if they happen at all, I almost always want it to merge  
from the same url as I specified in the first merge, i.e. the other  
person's remote branch.  So in this case, the workflow is usually  
"branch; merge remote-url; maybe commit; maybe merge again but  
probably not".
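
The defaults wanted in these two use cases can be sketched as a toy
Python model.  This is only an illustration of the desired behaviour
described above, not how bzr resolves locations; the HackingBranch
class, its attribute names and the URLs are all hypothetical.

```python
# Toy model of the two use cases above. HackingBranch is a hypothetical
# stand-in, not a bzrlib class, and the URLs are made up.

class HackingBranch:
    def __init__(self, parent):
        self.parent = parent          # where the branch was created from
        self.first_explicit = None    # first location passed to merge()

    def merge(self, location=None):
        """Return the location a merge would read from in this model."""
        if location is not None:
            if self.first_explicit is None:
                self.first_explicit = location
            return location
        # Bare merge: reuse the first explicit source, else the parent.
        return self.first_explicit or self.parent


# "Hack on my own stuff": bare merges come from the local upstream mirror.
own = HackingBranch(parent="file:///home/me/upstream-mirror")
own_source = own.merge()

# "Look at someone else's code": later bare merges reuse the remote URL.
review = HackingBranch(parent="file:///home/me/upstream-mirror")
review.merge("http://example.com/their-branch")
review_source = review.merge()
```

In this model a single remembered value is enough, because a branch is
only ever used for one of the two workflows.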

Why does merge by default choose the submit_branch?  When I think of  
"submit" I'm thinking about submitting the branch for review, or for  
pqm.  It's a request for a particular action that I'm making to  
someone or something else.  For me, that request is always to a remote  
mirror of my local divergent branch.  I can't understand why I'd ever  
want to merge /from/ that remote branch /into/ my local hacking  
branch, because it's never going to have anything in it that's not in  
my local hacking branch.  I've never needed to merge from that branch  
into my local branch.  What's the use case for that?

>> ISTM that the most logical default
>> location for 'bzr merge' is in fact parent_location, and that's almost
>> always what I want.  Having to always override it is an annoying speed
>> bump.
>
> You shouldn't have to always override it.  I almost never need to.  (The
> exception is when I accidentally cause two mirrors to become diverged,
> and have to get them back in sync.  This used to happen occasionally
> with bzrtools.)
>
> Could you explain a bit more about why submit_branch is almost never
> right for you?

Actually, it's always only half right :).

To submit my branch for review or pqm, I need it to be publicly  
available.  I do not publish branches from my own machine, and as you  
know, we have a shared work machine where we publish our branches to.   
When the branch is ready for review or pqm, we push a mirror of our  
local hacking branch to that remote machine.

My review/pqm submit request must name this remote mirror so that my  
reviewer or pqm can find it and perform its action.  In accordance  
with 'bzr help pqm-submit' for example, I set my submit_branch to a  
remote branch which happens to be the parent branch of my local  
upstream mirror.  It looks to me that this submit_branch becomes the  
target of the star-merge command, with the source being the remote  
mirror of my local hacking branch[1].  Given my current mental model,  
that all makes sense.

However, now you see the conflict!  To make the target of my
pqm-submit-inspired star-merge command be the right url, I have to set
submit_branch to the remote parent of my local upstream mirror.  But  
of course, that means that a bare 'bzr merge' will also use that  
remote parent of my local upstream mirror by default, even though  
that's never what I want.  I want a bare merge by default to use my  
local mirror of upstream (not its parent, which is where submit_branch  
points), but I cannot set submit_branch properly to do so.
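
To make the conflict concrete, here is a minimal sketch in Python.  It
is a toy model, not bzrlib code: the function names, the dict standing
in for the branch configuration, and both locations are hypothetical,
and the fallback from submit_branch to the parent is an assumption.

```python
# Toy model only: a plain dict stands in for a branch's remembered
# locations. None of these functions are bzrlib APIs.

LOCAL_MIRROR = "file:///home/me/upstream-mirror"      # hypothetical
REMOTE_UPSTREAM = "sftp://example.com/code/upstream"  # hypothetical


def default_pull_location(config):
    """Pull uses the parent branch (the mirror relationship)."""
    return config.get("parent_location")


def default_merge_location(config):
    """Merge uses submit_branch; the parent fallback is an assumption."""
    return config.get("submit_branch") or config.get("parent_location")


def pqm_submit_target(config):
    """pqm-submit appears to use submit_branch as the star-merge target."""
    return config.get("submit_branch")


# Local hacking branch, configured so that pqm-submit gets the right target.
hacking_branch = {
    "parent_location": LOCAL_MIRROR,
    "submit_branch": REMOTE_UPSTREAM,
}

merge_default = default_merge_location(hacking_branch)
submit_target = pqm_submit_target(hacking_branch)
```

With one variable feeding both commands, no value of submit_branch
gives REMOTE_UPSTREAM as the submit target and LOCAL_MIRROR as the
bare-merge default at the same time.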

Sure, I could use --remember but over time that's just going to end up  
littering my locations.conf file with scores of long-defunct branches,  
and I still have to remember to use --remember ;).  That seems like
the wrong solution.

[1] There's an oddity to the star-merge target when I set  
submit_branch to my local upstream mirror.  In that case the target  
appears to use my public_branch, ignoring public_branch:policy.  That  
one I don't get and I can't see how that could possibly be correct.   
Do you know why that happens?  Is this a clue to why things seem  
broken to me?

>> You might say, well, just change submit_branch, but that's not quite
>> right either.  What if changing that breaks 'bzr send', 'bzr pqm-submit'
>> or some other vital command I don't yet realize also uses that
>> variable?  The overloaded use of submit_branch is problematic because
>> there's no way to really know what uses it, and even if I did, I suspect
>> that the different commands will conflict in what they want it to be set
>> to.
>
> When I changed merge to use a different location from pull, I
> contemplated adding a new location divorced from the submit_location.
> At the time, I wasn't convinced that it was necessary for the default
> merge location to be different from the submit_location-- they both
> describe a "diverged branch" relationship.  I was leery of adding extra
> configuration complexity, so I decided not to add an extra configurable
> location.  So far, we haven't had many complaints on that front, but the
> possibility of separating merge_location from submit_location is still open.

I would be in favor of that.  In one sense, you've traded the  
complexity of adding an extra configuration variable for the  
complexity of re-using an existing variable for multiple different  
(conflicting?) purposes.  At least for me, understanding the  
intertwined dependencies through experimentation and guessing is much  
more difficult than just having a single-purpose variable to make it  
do what (I think ;) I want.
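
A single-purpose variable could look something like the following
Python sketch.  The merge_location key is hypothetical (no such setting
exists as far as this thread establishes), as are the lookup order, the
function name and the example locations.

```python
# Sketch of a dedicated merge default. "merge_location" is a hypothetical
# setting; the lookup order below is an assumption, not bzr behaviour.

def default_merge_location(config):
    """Prefer a dedicated merge_location, then fall back to older settings."""
    for key in ("merge_location", "submit_branch", "parent_location"):
        value = config.get(key)
        if value:
            return value
    return None


hacking_branch = {
    "parent_location": "file:///home/me/upstream-mirror",   # hypothetical
    "submit_branch": "sftp://example.com/code/upstream",    # hypothetical
    "merge_location": "file:///home/me/upstream-mirror",    # hypothetical
}

merge_default = default_merge_location(hacking_branch)
submit_target = hacking_branch["submit_branch"]
```

Here the bare-merge default and the submit target can differ, and a
branch that never sets merge_location keeps the existing behaviour.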

But of course, it could be that my mental model is broken, I'm being  
dense, I'm using the tools incorrectly/inefficiently, or any  
combination of the above. :)

I appreciate your response!  I hope my explanation has made sense.  I  
can draw you some pictures if that would help.

Thanks,
- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)

iEYEARECAAYFAkgsUzsACgkQ2YZpQepbvXFoZQCghfVCANRWh2fSyNbgjK3JisDH
Aw0Anj2qw1VDwoQd6fHBjDLmTrUZWA+A
=FNNB
-----END PGP SIGNATURE-----


