[RFC] Changing dotted revno numbering (was Re: Echoing a post: bzr vs. git)

Thu Nov 6 21:09:44 GMT 2008

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Matthew D. Fuller wrote:
> On Thu, Nov 06, 2008 at 10:36:36AM +0100 I heard the voice of
> Vincent Ladeuil, and lo! it spake thus:
>> Glad to hear that since this is an idea I'm thinking about[1] :)
> 
> So, here are a few of the reasons I like it offhand.
> 

...

I think you have some good arguments here. I just want to point out some
reasons for the current scheme, just to make sure we evaluate the whole
proposal.

1) Similarity to other systems.

   Both CVS and BK use similar numbering schemes to what we use. CVS was
   probably closer to our original 1.1.1.1.1.1 scheme, though you don't
   branch nearly as much there, so you don't notice the numbers going
   off to infinity. BK uses something very similar to our X.Y.Z scheme,
   though I don't know all the specifics of its internals.

   svn uses a single integer, as does hg AFAIK (where the integer is the
   order in which the revision was inserted into a given repo), and I
   don't believe git ever uses anything other than parts of the hash. So
   these systems aren't really comparable.

   This isn't a huge thing, but where possible it is good to try to
   maintain similarity between tools.

2) Stable *between* branches.

   This isn't 100% true, but if we both have similar (not identical)
   mainlines, and we merge the same branch, we will get the same
   numbers.

       A
      /|\
     B C D
     | | |
     E F |
     |/ \|
     G   H

   From the view of both G & H, 'F' is number 1.1.2 under the current
   scheme. In contrast, under the proposed scheme it would be 4.1.1/2 in
   G, and 3.1.1/2 in H (depending on whether you count up or down.)
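
To make the 4.x vs 3.x point concrete, here is a rough sketch of how the
leading number would fall out under the proposed scheme. It is not bzr
code: the `parents` dict and the `ancestry` and `merge_prefixes` helpers
are made up for illustration, the first parent listed is assumed to be
the left-hand (mainline) parent, and only the first component of the
revno is computed, not the trailing .Y.Z part.

```python
def ancestry(parents, tip):
    """Return the set of all revisions reachable from tip (inclusive)."""
    seen = set()
    todo = [tip]
    while todo:
        rev = todo.pop()
        if rev not in seen:
            seen.add(rev)
            todo.extend(parents[rev])
    return seen


def merge_prefixes(parents, tip):
    """Map each revision to the mainline revno that first brought it in.

    The mainline is the chain of first (left-hand) parents from tip.
    """
    mainline = []
    rev = tip
    while rev is not None:
        mainline.append(rev)
        rev = parents[rev][0] if parents[rev] else None
    mainline.reverse()  # oldest first, so the root gets revno 1

    prefixes = {}
    seen = set()
    for revno, rev in enumerate(mainline, start=1):
        reachable = ancestry(parents, rev)
        for anc in reachable - seen:
            prefixes[anc] = revno
        seen |= reachable
    return prefixes


# The first graph above, as a hypothetical parent map.
parents = {
    "A": [], "B": ["A"], "C": ["A"], "D": ["A"],
    "E": ["B"], "F": ["C"], "G": ["E", "F"], "H": ["D", "F"],
}

from_g = merge_prefixes(parents, "G")
from_h = merge_prefixes(parents, "H")
print(from_g["C"], from_g["F"])  # 4 4
print(from_h["C"], from_h["F"])  # 3 3
```

So C and F get a different leading number depending on which branch you
look from, which is exactly the stability that is lost.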

   Now, if you do something more involved:

       A
      /|\
     B C D
     |/| |
     E F |
     |/ \|
     G   H

   The numbers for C and F do not change under the current scheme. Under
   the proposed scheme F may be stable (if we count the first parent as
   1), or will change 2 => 1 in the G branch (if we count farthest away
   as 1).
   And the numbering for C certainly changes under the new scheme, as it
   would go from 4.X.Y to 3.X.Y.

       A
      /|\
     B C D
     |\| |  <= B merged into F
     E F |
     |/ \|
     G   H

   I don't think the numbers change here in any scheme.

     A      1        1
     |\     |\       |\
     B C    2 1.1.1  2 3.1.1
     |/|    |/|      |/|
     D E    3 1.1.2  3 4.1.1
     |/|    |/|      |/|
     F G    4 1.1.3  4 5.1.1
     |/|    |/|      |/|
     H I    5 1.1.4  5 6.1.1
     |/     |/       |/
     J      6        6

   This is the long-lived-repeatedly-merged branch case. It is kind of
   nice that the revisions from the same "feature" end up with the same
   prefix. I do respect Matthew's observation that the latter form helps
   make the mainline revision a "summary" of the changes that were
   merged in that revision. But it does encourage the view that a
   branch's lifetime is only "the distance since the last time it was
   merged".
   Perhaps that is more accurate for most people.

       A      1              1
      / \    / \            / \
     B   C  2   1.1.1      2   4.1.1
     |  /|  |  /    |      |  /    |
     D E F  3 1.1.2 1.2.1  3 4.1.2 5.1.1
     |/ /|  |/     /    |  |/     /    |
     G H I  4 1.2.2 1.3.1  4 5.1.2 6.1.1
     |/ /   |/      /      |/      /
     J /    5 .----'       5 .----'
     |/     |/             |/
     K      6              6

   This is the "long lived integration branch, repeatedly merged with
   intermediate revs" case, and the case that broke our old numbering
   scheme.

3) Merging other projects has the numbers start at 0.
   If you do "bzr log --long -r 3823" you can see Aaron's shelf changes
   landing, and it is clear that they started life as a completely
   different project (and that it is the 16th time we've done so.)

   Is this a strong reason? I don't really know. But it *is* a feature
   of the current scheme that you would lose.

   Under both schemes, I think you have to walk back to the NULL
   revision on both branches to make sure that it hasn't been merged
   before. So "less history walking" isn't entirely true. Then again,
   when you have:

     A B  1 0.1.1  1 3.1.1
     | |  | |      | |
     C D  2 0.1.2  2 3.1.2
     |/|  |/|      |/|
     E F  3 0.1.3  3 5.1.1
     | |  | |      | |
     G H  4 0.1.4  4 5.1.2
     |/   |/       |/
     I    5        5

   Under the current scheme you still have to walk back to NULL, but
   under the proposed scheme you can stop once you've seen that D was
   merged earlier.

   These are the cases of "less history walking" that are being
   described, and they apply regardless of where B comes from. I just
   wanted to point out that it isn't always less. A plain branch on both
   sides still has to walk all the way back to the revision it sprouted
   from, and you still have to check the ancestry of the revisions
   in-between, to make sure one of the other intermediate revisions
   wasn't merged. (In the current scheme, you have to walk all of A-G,
   in the proposed scheme you only have to walk G until you've found all
   ancestors.)
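
As a rough illustration of the "stop once you've seen that D was merged
earlier" case, here is a small sketch that counts how many revisions get
touched while walking back from the merged tip. This is not how bzr
does it internally: `walk_new_revisions` and the `already_numbered` set
are hypothetical, and the sketch simply assumes we already know which
revisions an earlier mainline merge has numbered.

```python
def walk_new_revisions(parents, merged_tip, already_numbered):
    """Walk back from merged_tip, stopping at already-numbered revisions.

    Returns (new_revisions, visited_count), where visited_count includes
    the already-numbered revisions at which the walk stopped.
    """
    new = []
    visited = 0
    seen = set()
    todo = [merged_tip]
    while todo:
        rev = todo.pop()
        if rev in seen:
            continue
        seen.add(rev)
        visited += 1
        if rev in already_numbered:
            continue  # known from an earlier merge, do not walk past it
        new.append(rev)
        todo.extend(parents[rev])
    return new, visited


# The two-root graph above: I merges H, and E had already merged D.
parents = {
    "A": [], "B": [], "C": ["A"], "D": ["B"],
    "E": ["C", "D"], "F": ["D"], "G": ["E"], "H": ["F"], "I": ["G", "H"],
}

# Stopping at what E's merge already numbered.
new, visited = walk_new_revisions(parents, "H", {"A", "B", "C", "D", "E", "G"})
print(new, visited)  # ['H', 'F'] 3

# Walking all the way back to NULL instead.
everything, visited_all = walk_new_revisions(parents, "H", set())
print(everything, visited_all)  # ['H', 'F', 'D', 'B'] 4
```

On a graph this small the saving is one revision; how much it buys on
real history is the open question.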

******

And for the pros

1) My biggest one is that it is the same scheme as how "bzr log --long"
   *shows* the revisions (clustered by the revision which merged them).
   So all of the indented revisions will start with the same revno. Kind
   of nice.

2) As mentioned, "less history walking". I don't know how much less in
   practice, but certainly less in the cases described above.

3) Being able to see when a feature landed. This is especially helpful
   with stuff like annotate, since it will generally give you the
   revision where things were modified, and this allows you to determine
   quickly when that change actually landed.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkkTXRgACgkQJdeBCYSNAAMzxgCffPLhxeVwbzAtDx3izJ0eX9oV
8AkAoLdD5nJQSS7BXWEhBGWFumm8EgJB
=ZqbF
-----END PGP SIGNATURE-----