Why I prefer rebase to merge. Is there a better alternative?

Thu Oct 15 19:45:29 BST 2009

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Gioele Barabucci wrote:
> John Arbash Meinel wrote:
>>> This is why I use rebase instead of merge. Rebase performs this
>>> synchronization without recording anything and without "polluting" the
>>> history.
>> It doesn't really. You still have to resolve the conflict. It just
>> happens that now you are throwing away what you used to have, and
>> replacing it with something that looks conflict free.
> Well, not exactly. Maybe, if I branched mainline just half an hour later, 
> the conflict would not be there. Suppose you branch feat1 from mainline 
> while mainline contains an off-by-one error. You miss that bug and code+test 
> your feature so that everything works fine (obviously your feature contains 
> another bug but you do not know that). A few minutes after you branched, 
> the off-by-one error gets corrected. Now, when you merge (with no 
> file conflicts: the fix is in a file you did not modify) your test 
> infrastructure will tell you that you are failing some tests. You go ahead 
> and solve your bug. Now, is the fact that you merged important? As you said, 
> it is a matter of personal taste; I do not think it is important to have it 
> recorded.

So if you had branched from mainline half an hour later, the code you
would have written would have been different.

At least from your comment, after merging you have failing tests, which
tell you that you need to go fix your code. If you had branched
differently, you would have solved that bug differently (earlier, etc.).

Does it matter? Consider if you were working on your feature and had a
commit that said:
  "Finally, all the test suite is passing again"

Now, after rebasing, that comment has been invalidated, because all of
the commits that you rebased are now built upon the new trunk code,
which exposes some failing tests in your branch.

Does it *matter*? About as much as it matters whether you have an "and I
merged X" in the history. In *my* view, it takes more work to rebase,
and if the net result is just trading one set of issues for another, it
generally isn't a net win to do more work merely to redirect the set of
problems I have. As always, YMMV.

...

> 
> In general, why should upstream be interested in *when* I started developing 
> something? I understand that there are cases when this is desirable (for 
> example DaggyFixes) but this is not always the case. And I think that in the 
> long run, bigger projects will suffer from the complex tangled web of 
> revisions that merge creates when it is not used to /merge/.

In the Bazaar ancestry, we pretty much never use rebase. (I don't know
of any developers who do use it, though I could be wrong.) If you do
"bzr qlog bzr.dev" you get a rather nice-looking ancestry for tracking
down where things happened and what went on. A lot of this is because of
how the tool presents changes: merges from trunk are displayed but not
highlighted, most history is hidden until you go look for it, etc.

We also have a project policy that causes us to land each feature onto
mainline as a separate commit (always 'bzr merge; bzr commit', never
'bzr pull' or 'bzr merge --pull'). You get a chance to look at the
changes to the project as a simple set of diffs (features) along the
mainline, which IMO is a much better view than what you would get with
rebase, unless you spend a lot of time carefully pruning your history,
collapsing and reordering patches, etc.
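
As a rough illustration of that mainline view, the sketch below models a
revision graph as a plain Python dict and walks only the left-hand
(first) parent of each revision. The revision names and the helper
functions mainline(), ancestors(), and landed_by() are assumptions made
up for this example. It is not how bzr itself is implemented.

```python
# Hypothetical revision graph: revision id -> list of parent ids,
# left-hand parent first.
parents = {
    "trunk-1": [],
    "feat1-a": ["trunk-1"],
    "feat1-b": ["feat1-a"],
    "trunk-2": ["trunk-1", "feat1-b"],  # 'bzr merge feat1; bzr commit'
    "feat2-a": ["trunk-1"],
    "feat2-m": ["feat2-a", "trunk-2"],  # feature branch merged trunk
    "feat2-b": ["feat2-m"],
    "trunk-3": ["trunk-2", "feat2-b"],  # 'bzr merge feat2; bzr commit'
}

def mainline(tip):
    """Follow left-hand parents only, newest revision first."""
    out = []
    rev = tip
    while rev is not None:
        out.append(rev)
        ps = parents[rev]
        rev = ps[0] if ps else None
    return out

def ancestors(rev):
    """Every revision reachable from rev, including rev itself."""
    seen = set()
    todo = [rev]
    while todo:
        r = todo.pop()
        if r not in seen:
            seen.add(r)
            todo.extend(parents[r])
    return seen

def landed_by(rev):
    """Revisions that first entered mainline via this mainline commit."""
    ps = parents[rev]
    before = ancestors(ps[0]) if ps else set()
    return ancestors(rev) - before - {rev}

print(mainline("trunk-3"))           # ['trunk-3', 'trunk-2', 'trunk-1']
print(sorted(landed_by("trunk-3")))  # ['feat2-a', 'feat2-b', 'feat2-m']
```

Each mainline entry corresponds to one landed feature, and the feature
branch's own commits (including its merge from trunk, feat2-m) stay
available underneath it without being rewritten.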

If you go back to my post, I did the reordering/cleanup after the fact,
and each of these changes was then landed as an individual patch into
bzr.dev, leading to a nice, clean, ordered set of changes, backed by a
whole lot of real-world flux.

I certainly could have rebased that onto the tip of trunk, and gone
through the contortions to get 'mostly the same ancestry, but not quite
the same'.

I would generally say that while there are lots of ways forward, and
rebase is *a* way, it is rarely the *best* way. That is subject to the
constraint that some upstreams feel it is the best way, and you are
subject to the whims of the people who will merge your code.

> 
> Couldn't we have a merge or a sync command that ingeniously moves things 
> around, preserving all the commit information while giving the users a 
> rebase-style history?
> 

I don't quite understand what you expect to get that isn't rebase but is
mostly rebase. You can't "preserve all the commit information" and
reorder things. You have to synthesize a new history that mostly looks
like the old one but is subtly different, potentially in key parts.
(Consider if trunk shifted an offset by one, and then all of your
*rebased* patches are now off-by-one...)
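
A small sketch of why the rebased history is a synthesized one. It
assumes a deliberately simplified model in which a revision is just a
message, a diff, and a parent. The Revision class and the rebase()
function are hypothetical and only mimic the general idea of replaying
changes onto a new base, not any real tool's implementation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Revision:
    # Simplified model: real revisions carry much more metadata.
    message: str
    diff: str
    parent: Optional["Revision"]

def rebase(revs, new_base):
    """Replay each revision's message and diff onto new_base, oldest first."""
    out = []
    parent = new_base
    for r in revs:
        parent = Revision(r.message, r.diff, parent)
        out.append(parent)
    return out

def ancestry(rev):
    """All revisions from rev back to the root, newest first."""
    out = []
    while rev is not None:
        out.append(rev)
        rev = rev.parent
    return out

t1 = Revision("initial import", "add parser.py", None)
t2 = Revision("fix off-by-one", "parser.py: offset - 1", t1)
f1 = Revision("start feat1", "add feat1.py", t1)
f2 = Revision("all tests pass again", "feat1.py: fix", f1)

new_f1, new_f2 = rebase([f1, f2], t2)

# Same message and diff ...
print(new_f2.message == f2.message and new_f2.diff == f2.diff)  # True
# ... but a different revision, because its ancestry differs.
print(new_f2 == f2)            # False
print(f2 in ancestry(new_f2))  # False
```

The rebased revisions carry the old messages and diffs, but the
revisions that were actually written and tested (f1, f2) no longer
appear anywhere in the new ancestry.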

I honestly don't really know when you would "merge without /merging/".
I'm happy that rebase exists for people who want to use it. My primary
argument is that you can generally get /equivalent/ results via an
alternative method that doesn't suffer the same negatives.

If you can describe how your 'sync' command would function differently
than merge and rebase, I certainly think people would find it worth
investigating. (Certainly you seem to feel it is different.) Can you
clarify a bit how you think it would work in practice?

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkrXbckACgkQJdeBCYSNAAOfWACgzu5WwmDHlBp18s71eiNVgjsm
/wIAoJW5MAKzDbyQVOjBSAe3KL9Eyqic
=ABIW
-----END PGP SIGNATURE-----