Warping minds with the phrase "changeset"
Aaron Bentley
aaron.bentley at utoronto.ca
Mon Jan 30 14:36:27 GMT 2006
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Martin Pool wrote:
> On 29 Jan 2006, James Blackwell <jblack at merconline.com> wrote:
> One of them is, as you say, that a changeset is a description of changes
> to a whole tree, i.e. a set of patches, and that you can identify the
> changeset as a whole. We have these, as do most other modern systems,
> and at least some use the term "changeset", at least informally.
What you are saying is true, from a certain point of view. But claiming
we have changesets is the kind of thing Obi-Wan Kenobi would say, and
most of the time, I think it's simpler to just say, "Darth Vader's your
dad, kid."
Yes, as long as we have two revisions in our repository, we can infer a
changeset between them. But it is the revision snapshots that we store,
not the revision deltas. By definition, snapshot-based storage can be
transformed into changeset-based storage and vice versa, but these
approaches have different strengths and weaknesses.
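To make the equivalence concrete, here is a minimal sketch of inferring a
changeset from two snapshots and applying it again. It assumes a snapshot is
just a dict mapping path to file content; the function names and the
changeset layout are hypothetical and are not how Bazaar-NG represents
anything internally.

```python
def infer_changeset(old, new):
    """Derive a delta between two snapshots (dicts of path -> content)."""
    return {
        "added": {p: new[p] for p in new if p not in old},
        "removed": sorted(p for p in old if p not in new),
        "modified": {p: new[p] for p in new if p in old and old[p] != new[p]},
    }


def apply_changeset(snapshot, changeset):
    """Rebuild the newer snapshot from the older one plus the delta."""
    result = dict(snapshot)
    for path in changeset["removed"]:
        del result[path]
    result.update(changeset["added"])
    result.update(changeset["modified"])
    return result


# Hypothetical revisions, for illustration only.
rev1 = {"README": "hello\n", "setup.py": "v1\n", "OLD": "gone soon\n"}
rev2 = {"README": "hello\n", "setup.py": "v2\n", "NEWS": "first\n"}

delta = infer_changeset(rev1, rev2)
assert apply_changeset(rev1, delta) == rev2
```

The round trip works in either direction, but note which side is stored and
which side is computed: here the snapshots are the data and the delta is
derived from them.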
Snapshot-based storage is less fragile than changeset-based storage,
because revisions are independent. This makes it cheap to validate any
given revision. Because changesets each depend on the previous
changeset for their meaning, a pure changeset-based system must have no
ghosts and validation must work from the beginning of history onward.
The GNU Arch development line, for example, has many inconsistencies;
file permissions are changed from A to B, then from C to D, without ever
being changed from B to C. Some of its changesets are also internally
inconsistent. It is hard for a snapshot to be internally
inconsistent.
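The permissions case can be sketched as a check over a chain of deltas. This
is an illustration only: it assumes each changeset records a file's
permission change as an (old_mode, new_mode) pair, which is a hypothetical
layout and not the actual GNU Arch format.

```python
def find_broken_links(mode_changes):
    """Return indexes of changes that do not start where the last one ended.

    mode_changes is a list of (old_mode, new_mode) pairs for one file, in
    history order. Validation has to walk the whole list from the start,
    because each entry only makes sense relative to the one before it.
    """
    broken = []
    for i in range(1, len(mode_changes)):
        previous_new = mode_changes[i - 1][1]
        current_old = mode_changes[i][0]
        if current_old != previous_new:
            broken.append(i)
    return broken


# A -> B, then C -> D, with no B -> C in between (hypothetical modes).
history = [("0644", "0755"), ("0600", "0640")]
print(find_broken_links(history))
```

A snapshot has no equivalent failure mode: it records the mode each file has
in that revision, so there is no earlier link it can disagree with.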
On the other hand, changeset-based storage does allow for more
flexibility in the type of changes stored. Darcs token-replace patches
are a good example of the kind of thing Bazaar-NG cannot do easily.
James is portraying Bazaar-NG as a system with changeset-based storage,
and I think that description harms understanding. A person who hears
that may ask me whether we have token-replace changesets. To which I'll
reply "No, Bazaar-NG doesn't store changesets". And then I'll be making
our community guy look like a liar. Another person may decide that they
don't want to use a changeset-oriented system, because it's too fragile.
And people who try to understand the code based on James' explanation
will have a very hard time.
Let's look at an example:
> The rename problem is solved by keeping each commit together in
> something called a "changeset". Since changes are now kept together in
> a changeset, other things can be kept as well. The RCS can even record
> that a file was renamed or deleted.
This does not resemble the way we handle renames. The truth of the
matter is that files have ids. A curious user can list them with 'bzr
inventory --show-ids'. In each commit, we record the id, name and
parent directory of each file. We certainly can *infer* renames, but
that's not what we store.
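As a rough sketch of what "infer" means here, assume each commit's inventory
is a dict mapping file id to (parent directory, name). That layout, the ids
and the function name are all hypothetical; this is not the bzrlib API.

```python
def infer_renames(old_inventory, new_inventory):
    """Compare two per-commit inventories and report files that moved.

    Each inventory maps file_id -> (parent_dir, name). No rename is stored
    anywhere; it is derived by comparing the two snapshots.
    """
    renames = {}
    for file_id, old_entry in old_inventory.items():
        new_entry = new_inventory.get(file_id)
        if new_entry is not None and new_entry != old_entry:
            renames[file_id] = (old_entry, new_entry)
    return renames


# Hypothetical ids and paths, for illustration only.
rev1 = {"readme-id": ("", "README"), "util-id": ("lib", "util.py")}
rev2 = {"readme-id": ("", "README.txt"), "util-id": ("lib", "util.py")}
print(infer_renames(rev1, rev2))
```

Only the file whose name or parent directory differs between the two
inventories is reported; the unchanged file does not appear at all.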
This is a good thing, because file-ids make file identity very easy to
establish. Systems like Monotone store renames rather than file-ids.
But in order to do tree-wide merges, they still need to establish file
identity. In order to do this, they must trace the rename history of
every file back to the base revision. So file-ids make merging more
efficient.
They're also more flexible. In Monotone, if the file in THIS did not
exist in the base revision, it's not considered the same as a file in
OTHER with the same name and identical contents. With file-ids, it is
possible for THIS and OTHER to introduce the "same" file.
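A small sketch of that point, assuming each tree is reduced to a dict
mapping file id to path. The ids, paths and function name are hypothetical;
real merge code does far more than this.

```python
def match_by_file_id(this_inventory, other_inventory):
    """Pair up files in THIS and OTHER that share a file id.

    Identity is a direct lookup on the id. No rename history is traced,
    and the base revision does not need to contain the file.
    """
    shared_ids = this_inventory.keys() & other_inventory.keys()
    return {
        file_id: (this_inventory[file_id], other_inventory[file_id])
        for file_id in shared_ids
    }


# Hypothetical trees: the base has no NEWS file, but both sides added one
# with the same id, at different paths.
base = {}
this = {"news-id": "NEWS"}
other = {"news-id": "docs/NEWS", "todo-id": "TODO"}

print(match_by_file_id(this, other))
```

Here the two NEWS files are recognised as the same file even though neither
exists in the base and their paths differ, which a name-based or
rename-tracing comparison would not conclude on its own.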
So let's play to our strengths. Yes, there is a perspective from which
we're storing changesets, but that perspective is a mathematical one, not
a natural one. And all the other documentation that people encounter
will describe a system that stores data as snapshots. Let's just keep
it simple.
Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFD3iRr0F+nu1YWqI0RAipdAJ4nCzYMWi15ihVVkERsdILt+dKUiQCfbu4o
i+Uu4UDsHYzpRDHwavfzkhw=
=Rwfn
-----END PGP SIGNATURE-----