[RFC] Why DVCS Matters

Fri Oct 12 02:27:52 BST 2007

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ian Clatworthy wrote:
> John Arbash Meinel wrote:
>> Ian Clatworthy wrote:
>>> I'm giving a paper later this year on why DVCS technology matters.
>>> Elliot Murphy and Martin Pool have kindly reviewed earlier drafts but it
>>> probably needs another round of changes before I'm ready to call it
>>> 'final'. The latest draft is online here:
>>> http://ianclatworthy.wordpress.com/2007/10/11/why-distributed-version-control-matters/.
>>> I'm still yet to address one of Martin's bits of feedback, namely that
>>> "branch tracking scales better than patch tracking" isn't well
>>> explained. Does anyone have a good reference covering that topic to save
>>> me the effort? :-)
>> Branch tracking (as Martin is mentioning it) means that you have a single
>> object (the revision id) which defines the complete ancestry.
>>
>> "patch tracking" means keeping track of the set of patches that have been
>> applied in this branch. Which is what Darcs (and Arch) did.
>>
>> The big difference is in how things scale.
> 
> Ah. I think that just shows that I really need to explain that point
> better because that's not what I meant by "branch tracking vs patch
> tracking" at all. (Your point is a good one but it justifies how best to
> implement a DVCS rather than why DVCS is inherently better than a
> central one.)
> 
> What I meant was that keeping a list of (dumb) patches you want (e.g.
> when downstream) and reapplying them when a new upstream version comes
> out sucks. Far easier to do that dance by either:
> 
> 1. having your own branch with patches applied and merging the new work, or
> 
> 2. applying bundles with the full intelligent metadata available.
> 
> Does anyone disagree? Does anyone have any ideas on how best to explain
> that if the above isn't clear enough?
> 
> Ian C.
> 

Here are some thoughts on branch tracking versus patch tracking...

1) Branch tracking allows fine-grained annotations, because each logical change
carries its own commit message. So while a patch may change 10 lines, a branch
can have 10 different specialized commit messages, one for each logical change.
Rather than knowing only "this was changed as part of feature X", you know
"frizbans are broken, and need special care when doing X".

2) Partial application. If I have a branch with 3 commits, and somebody merges
commits 1 and 2 and later merges my branch again, they will only get commit 3,
even if the other sections have been updated since. If all you have is a patch,
portions of it will either already be applied (which patch sort of handles) or
conflict (because they have been updated after the change).

3) Renames etc. can be handled with bundles. Patches could be extended to do
this, but haven't been yet. For example, renaming a directory with patch just
gives you a whole lot of deleted + added file texts.

4) It is somewhat unnatural to update a patch when you need to make changes. It
isn't terrible, but it isn't quite the same as just committing, which you would
have done anyway.

5) 3-way merging rather than simple patch application. This has a bit to do
with (2), but just in general you get better results if you can use the BASE
revision as context when merging. It can also help avoid getting spuriously
clean patch applications. (If you have 2 for loops that look a lot alike, patch
may decide to apply your change to the wrong one. 3-way will usually conflict
if it isn't clear.)

6) Scaling to a community. There are ways to do this with patches, but you
still would like to have a way to know that you have the changes from Joe and
Mary, but you are missing the ones from Greg and Susan.
Branching/committing/VCS in general tracks that sort of thing for you. With
patches you would want to record it separately somehow.

7) Working with the same tool. Day to day I use a VCS for general development
(the same as people would with SVN). When I switch projects, it is nice not to
have to switch my workflow. (Otherwise I suddenly get only a read-only
checkout, and I have to keep 2 of them so that I have the pristine upstream to
do "diff -ur upstream mine > current.patch".) We certainly
recommend keeping a mirror of upstream even with Bazaar, but you don't need to
switch from using your VCS tool to using diff/patch when you want to contribute
to a different project.

8) Uncommit. If you want to be able to go back to a previous version of your
patch, you have to keep a copy of all of them. You can certainly do that, but
you end up either adding your patches to a VCS or keeping a directory with
patch-1, patch-2, patch-3...
I remember Mercurial mentioning that one of the benefits of how their patch
queue addon worked was that you could *version* the queue.
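
As a rough sketch of points (2) and (6): if every revision records its
parents, then "what am I still missing from Joe's branch?" is just a set
difference over ancestries. The revision ids, the toy graph, and the helper
names (ancestry, missing) below are all made up for illustration. This is not
how Bazaar or any other tool is actually implemented, only the general idea.

```python
# Hypothetical revision graph: each revision id maps to its parent ids.
# The ids and the graph layout are invented for illustration only.
PARENTS = {
    "base": [],
    "joe-1": ["base"],
    "joe-2": ["joe-1"],
    "joe-3": ["joe-2"],
    "mine-1": ["base"],
    # I already merged Joe's first two commits into my branch.
    "merge-joe-2": ["mine-1", "joe-2"],
}


def ancestry(tip, parents=PARENTS):
    """Return the set of all revisions reachable from tip, including tip."""
    seen = set()
    todo = [tip]
    while todo:
        rev = todo.pop()
        if rev in seen:
            continue
        seen.add(rev)
        todo.extend(parents[rev])
    return seen


def missing(my_tip, other_tip, parents=PARENTS):
    """Revisions in other_tip's ancestry that my_tip does not have yet."""
    return ancestry(other_tip, parents) - ancestry(my_tip, parents)


# Merging Joe's branch again only needs to bring in his third commit.
print(sorted(missing("merge-joe-2", "joe-3")))
```

With a pile of plain patches there is no such graph to consult, so the same
question has to be answered by bookkeeping you maintain separately.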

I'm sure there are more, but I have to get going.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHDs2XJdeBCYSNAAMRAmvOAJ0dcBe68jfNg7Wgl7w0b7YylA0v0ACgyLGK
j8QX3S8CAma8j3D0jPlAfXw=
=qa1N
-----END PGP SIGNATURE-----