Workflows, rebase, patch theory

Thu May 8 08:50:07 BST 2008

Stephen J. Turnbull wrote (2008-05-08 09:31 +0900):

> Both the tree object and the commit that refers to it are in the
> repository, accessible via their SHA1s and any other tags or branches
> that refer to the commit object.  States *and history* are preserved.
> Only the reference via the rebased branch name is not.
> 
> If there are no references[1] left, then objects including tree states
> and commit history *can* be garbage-collected.  But that only happens
> if the user insists.

I'm sure you know this already, but I'll add that old and possibly
abandoned history states (commit objects) are also available via Git's
reflog, which by default records branch and HEAD updates for the last 90
days (30 days for entries that are no longer reachable from the current
branch tip). The expiry times can be configured. Once the reflog entries
have expired, Git will (by default) also garbage-collect commits and
other objects that are no longer referred to by any branch, tag or
reflog entry. This happens through "git gc --auto", which some Git
porcelain commands run from time to time. When and how this happens is
configurable, and it can also be turned off.
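To make the settings mentioned above concrete, here is a minimal sketch.
It builds a throwaway repository so that it can be run anywhere; the
expiry values and the user identity are illustrative assumptions only,
not recommendations.

```shell
set -e
# Throwaway repository so the sketch does not touch real work.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"            # placeholder identity
git config user.email "example@example.invalid"
echo one > file.txt
git add file.txt
git commit -q -m "first commit"

# Show the recorded HEAD movements.
git reflog show HEAD

# Keep reflog entries longer than the defaults (illustrative values).
git config gc.reflogExpire "180 days"
git config gc.reflogExpireUnreachable "60 days"

# Turn off automatic garbage collection for this repository.
git config gc.auto 0
```

With gc.auto set to 0, objects are pruned only when "git gc" is run by
hand.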

A side note: it was only yesterday that I learnt to appreciate the
reflog. I was preparing a small patch to send to an upstream bug
tracker. I committed my changes, turned them into a patch file and reset
the branch back to where the upstream HEAD is. While playing with the
upstream bug tracker I deleted my patch file before the bug report and
patch had actually been sent (I just thought they had). Fortunately the
"thrown away" history state was still there and accessible via
HEAD@{10.minutes.ago}. It could have been used to create a new branch,
for example. All I needed was "git format-patch -1
HEAD@{10.minutes.ago}".
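The recovery described above can be reproduced roughly as follows. This
is a sketch in a throwaway repository with made-up file names and commit
messages. It uses the index form HEAD@{1} instead of a time-based form
such as HEAD@{10.minutes.ago}, because everything in the script happens
within the same second.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"            # placeholder identity
git config user.email "example@example.invalid"

echo "helo" > greeting.txt
git add greeting.txt
git commit -q -m "Initial commit"

# The change that will be "thrown away".
echo "hello" > greeting.txt
git commit -q -a -m "Fix typo"
lost=$(git rev-parse HEAD)

# Reset the branch back; no branch or tag points at the fix any more.
git reset -q --hard HEAD~1

# The reflog still remembers where HEAD was before the reset.
git reflog show HEAD

# Recreate the patch file from the abandoned commit ...
git format-patch -1 'HEAD@{1}'

# ... or give the commit a name again.
git branch rescued 'HEAD@{1}'
```

Either command is enough on its own; the branch is just a convenient
way to keep the commit reachable after its reflog entry expires.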

> "Why" has always been clear.  However, Linus (and many others) clearly
> consider rebase to be a useful tool in *some* circumstances, and it
> has been added to bzr as plugin.  People will use it, so it's better
> to identify the problems with it correctly so they can use it
> judiciously.

Rebase is definitely an excellent tool. Here's my personal real-life
example: I have a project which started in a private repository. It was
a Subversion repo at first and was later converted to Git. When I
started the project it was the first time I had used any version control
software. Partly because I was a VCS newbie and partly because the
project was private, the commit history became mostly unreadable and
useless: stupid commit messages and totally unrelated changes in the
same commits.

Now the project has become useful for other (Finnish) people too, and
I'm doing some heavy history rewriting to make my repo useful for
others. I'm kind of throwing away the project's real VCS history, but
since that history was useless anyway I'm making it actually useful for
the first time. With Git's "rebase --interactive" and the index
functionality I have split commits, joined commits and written good
commit messages. Of course, once the repo is public I won't be rebasing
or editing anything I have published.
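As an illustration of the "joined commits" part, here is a minimal
sketch that folds a follow-up change into an earlier commit. It is not
the exact procedure I used: so that the script runs unattended it uses
"git commit --fixup" with "rebase -i --autosquash", and sets
GIT_SEQUENCE_EDITOR=true to accept the generated todo list unchanged.
File names and messages are made up.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"            # placeholder identity
git config user.email "example@example.invalid"

echo "readme" > README
git add README
git commit -q -m "Initial commit"

echo "hello" > hello.txt
git add hello.txt
git commit -q -m "Add greeting"

# A later correction that really belongs in "Add greeting".
echo "hello world" > hello.txt
git commit -q -a --fixup=HEAD

# Replay the last two commits; autosquash marks the fixup commit so
# that it is melted into its target.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~2

git log --oneline
```

Splitting works in the same session when done by hand: mark the commit
as "edit" in the todo list, run "git reset HEAD^" when the rebase stops
there, stage the pieces separately with "git add -p", commit each one,
and finish with "git rebase --continue".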

So I agree with Stephen: rebase is a very useful tool, and people
certainly use and need it sometimes. It's good to teach people how to
use it properly and with an understanding of the consequences.

It seems that the new linux-next[1] testing tree is rebased daily
against Linus's tree. Every rebased state is tagged so that it can
easily be returned to later. I believe linux-next's history would become
completely unreadable in no time if it didn't use rebase but just merged
everything thrown at it and also pulled updates from Linus's tree.
Criss-cross merges between individual developers and the different
subsystem and testing trees at the kernel development level would make
the history a complete mess. It's a rather interesting mess already. :-)
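The tag-then-rebase idea can be sketched on a small scale like this.
The branch name "integration" and the "snapshot-YYYYMMDD" tag scheme are
assumptions made for the example; this is not a description of how
linux-next itself is built.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"            # placeholder identity
git config user.email "example@example.invalid"

echo base > base.txt
git add base.txt
git commit -q -m "Base"
upstream=$(git symbolic-ref --short HEAD)      # whatever the default branch is

# Work collected on an integration branch.
git checkout -q -b integration
echo topic > topic.txt
git add topic.txt
git commit -q -m "Topic work"

# Meanwhile upstream moves on.
git checkout -q "$upstream"
echo newer > upstream.txt
git add upstream.txt
git commit -q -m "Upstream change"

# Tag the current integration state, then rebase onto the new upstream.
git checkout -q integration
tag="snapshot-$(date +%Y%m%d)"
git tag "$tag"
old=$(git rev-parse HEAD)
git rebase -q "$upstream"

# The tag still names the pre-rebase state.
git log --oneline "$tag"
```

Because the tag keeps the old commits referenced, they are not subject
to garbage collection however old they become.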

---------------
1. http://linux.f-seidel.de/linux-next/pmwiki/