Sanitizing a branch
John Arbash Meinel
john at arbash-meinel.com
Mon Mar 16 16:54:14 GMT 2009
Erik de Castro Lopo wrote:
> Hi all,
>
> I have a problem that is related to this one:
>
> https://lists.ubuntu.com/archives/bazaar/2009q1/054149.html
>
> I have a project that was imported into Bzr from GNU Arch many years
> ago and has history dating back to early 2004. During the time I've
> hacked on this I have added all sorts of cruft including test files,
> scraps of documentation, etc.
>
> I'd now like to make a public branch of this repo with the public
> branch having none of the cruft but keeping the cruft in my
> private branch and still have the ability to merge between the
> two relatively painlessly.
>
> Is this possible? Any clues on how to do it?
>
> Cheers,
> Erik
So the hardest part, IMO, is that you want to continue to merge between
the two "painlessly". If you are okay with one-way syncing being
painless, and cherrypicking the other way, it is not *too* hard, though
doing it after the fact can be tricky. I'll draw some pretty ascii art :)
A
|
B
|
C ^- A-C are original 'useful' things
|
D - cruft added here
|
E - useful patch
At this point, you want to publish A,B,C,E but not D, but still have D
around if you need it. This is the hardest part of the operation. I'm
not entirely sure if 'replay' from the rebase plugin will do what you
want here, but you want to create a history that looks like:
A
|
B
|
C
|
E'
You can then have a private branch that looks like:
A
|
B
|
C
|\
D E'
| |
E |
|/
F
Now at this point, anything based on E' will merge cleanly into F and
beyond. You have an obvious merge base (E') which will show that we only
want the new data. As a quick example:
A
|
B
|
C
|\
D E'
| |
E |
|/|
F G
It is easy for us to compare E' to G, and only merge in those changes to F.
The big problem is when you want to merge stuff from your private branch
back into the public branch, because there is no such clear merge base.
Consider:
A
|
B
|
C
|\
D E'
| |
E |
|/|
F G
|/
H
|
I
If we go to merge I into G, we look for the ancestor that is in both
branches and that supersedes all the other common revisions. If you look
closely, you can see that is G. However, G does not supersede D, E, F,
or H, so a plain merge would pull all of them in, and revision "D" at
least is one we *don't* want to be made public.
Further, if D ever did end up in your ancestry, it would be propagated
to other users. (So even merging and then reverting the content, without
reverting the revision pointer, would still send D out to third parties.)
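To make the asymmetry concrete, here is a small sketch in Python that models the graph above as plain parent pointers. It is only an illustration of the ancestry reasoning, not how bzr itself computes merges; the names PARENTS, ancestry and brought_in_by_merge are made up for this example.

```python
# Parent pointers for the revision graph drawn above (child -> parents).
# This is a toy model for illustration, not bzr's own data structure.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["C"],         # the cruft
    "E": ["D"],
    "E'": ["C"],        # E recreated on top of C, without D
    "F": ["E", "E'"],   # private branch merges E'
    "G": ["E'"],        # new public work
    "H": ["F", "G"],    # private branch merges G
    "I": ["H"],         # new private work
}


def ancestry(tip):
    """Return every revision reachable from tip, including tip itself."""
    seen = set()
    todo = [tip]
    while todo:
        rev = todo.pop()
        if rev not in seen:
            seen.add(rev)
            todo.extend(PARENTS[rev])
    return seen


def brought_in_by_merge(source_tip, target_tip):
    """Revisions a full merge of source_tip would add to target_tip's ancestry."""
    return ancestry(source_tip) - ancestry(target_tip)


to_private = brought_in_by_merge("G", "F")   # public -> private
to_public = brought_in_by_merge("I", "G")    # private -> public
print(sorted(to_private))
print(sorted(to_public))
```

Merging G into the private branch adds only G, while a full merge of I into the public branch would add D along with E, F, H and I.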
So at that point, all merges from your private branch to the public
branch need to be "cherrypicks". In this case you can do:
cd public
bzr merge -r H..I ../private
And it should apply the appropriate content, likely without conflicts.
The main problem is that you have to know or figure out what the best
merge base is yourself. One way to handle it is to keep a tag on the
"last merged" revision, and then update it. So you would do:
bzr tag -r I last-merged-to-public
And then future cherrypicks would do:
bzr merge -r tag:last-merged-to-public..-1 ../private
cd ../private
bzr tag --force last-merged-to-public
You'll have a few other issues, as tags will get propagated by default.
So a merge will try to put 'last-merged-to-public' into your public
branch. (As only a revision-id pointer, but in the future we may also
try to copy the ancestry with it.) So you'll probably have to add:
bzr tag --delete last-merged-to-public
You could probably wrap up most of this complexity into a script or
plugin. The big issue is that if you forget or do it wrong, D ends up in
the history and your cruft gets 'leaked'.
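As a rough idea of what such a wrapper could look like, here is a minimal Python sketch that just strings together the three bzr commands shown above. The function names (cherrypick_steps, run_steps) and the directory arguments are hypothetical, it defaults to a dry run that only prints, and it assumes you review and commit the merge in the public branch yourself.

```python
import os
import subprocess

TAG = "last-merged-to-public"


def cherrypick_steps(public_dir, private_dir, tag=TAG):
    """Return (working_dir, argv, must_succeed) tuples for one sync."""
    public_dir = os.path.abspath(public_dir)
    private_dir = os.path.abspath(private_dir)
    return [
        # Apply only the changes made since the tag, never a full merge.
        (public_dir,
         ["bzr", "merge", "-r", "tag:%s..-1" % tag, private_dir], True),
        # Remove the tag from the public branch if the merge copied it there.
        # Allowed to fail, since the tag may not be present (an assumption).
        (public_dir, ["bzr", "tag", "--delete", tag], False),
        # Move the bookmark to the private tip for the next cherrypick.
        (private_dir, ["bzr", "tag", "--force", tag], True),
    ]


def run_steps(steps, dry_run=True):
    """Print each step; also execute it unless dry_run is set."""
    for cwd, argv, must_succeed in steps:
        print("(cd %s && %s)" % (cwd, " ".join(argv)))
        if not dry_run:
            # check=True stops at the first failure, e.g. a conflicted merge,
            # so the tag is not moved after a merge that did not go through.
            subprocess.run(argv, cwd=cwd, check=must_succeed)


# Dry run: prints the commands without touching any branch.
run_steps(cherrypick_steps("public", "private"))
```

Keeping the merge range in one place is the point: the script never issues a plain 'bzr merge ../private', which is the command that would drag D along.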
Another way to do it is to split the 'cruft' off into its own
branch, and not have it merged 90% of the time. Only when you really
want to see it would you do:
cd private
bzr merge ../cruft
# look at stuff
bzr revert
The big problem comes if the 'cruft' needs to be actively maintained to
keep it up to date. Note also that updating the cruft should move your
'last-merged...' tag, because you don't want to try to cherrypick the
cruft changes. Though if you *did* accidentally merge a cruft
cherrypick, you should notice right away, because it should conflict,
since the cruft changes aren't there to have the updates applied to them.
Does this make sense? I can certainly try to discuss it more.
John
=:->