[MERGE] deprecated EmptyTree

Mon Jul 24 02:48:04 BST 2006

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> I find it hard to conceive of 'tree' without a root. In my prior
> thinking, the root node is always present, because if it isn't, you don't
> have a tree.

I don't have a problem conceiving of a tree without a root.  The empty
tree is meant to represent nothing, and I find it hard to conceive of
nothing having a root.

The effort in nested trees is toward making root as un-special as
possible.  It's just a directory that doesn't have any parents.  And
until you init, that directory has no id, so you shouldn't pretend it did.

>>> I realise this is different to what you have done - so I'd like to know
>>> what things it would impact negatively.
>> Well, I think it means a bit more special casing for operations that
>> compare trees.  diff / status -r 0..1 shouldn't show deletion of the
>> tree root.  (They don't show an add of the tree root right now, but I
>> think that's at least plausible, even if we decide not to.)
> 
> So, if iter_entries yields the root, we have a choice of special casing
> the change of root id value, or special casing the addition of a root
> node. I'm not sure which is nicer..

I think that representing a change in root id value would mean another
level of indirection, and I would strongly oppose that.

Representing it as a delete+add would be possible, but if we must
special-case delete+add, it's better to just special-case add.
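
To make the difference concrete, here is a minimal sketch.  It is a toy
model, not bzrlib code: a tree is just a dict mapping file id to path,
and compare() and the 'file-id-1' entry are made up for illustration.
It compares an empty tree against a first revision, once with a rootless
empty tree and once with an empty tree that carries a placeholder root.

```python
# Toy model (an assumption, not bzrlib): a tree is {file_id: path}.
def compare(old, new):
    """Return (added, removed) file ids between two trees, each sorted."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    return added, removed


rootless_empty = {}                # empty tree with no root at all
rooted_empty = {"TREE_ROOT": ""}   # empty tree with a placeholder root
first_rev = {"UNIQUE_ROOT-asdf": "", "file-id-1": "README"}

# Rootless empty tree: the new root is an ordinary add.
print(compare(rootless_empty, first_rev))
# → (['UNIQUE_ROOT-asdf', 'file-id-1'], [])

# Rooted empty tree: the same change now also reports a root deletion.
print(compare(rooted_empty, first_rev))
# → (['UNIQUE_ROOT-asdf', 'file-id-1'], ['TREE_ROOT'])
```

In the second case, diff/status would have to recognise the TREE_ROOT
removal and suppress it, on top of deciding what to do about the add.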

>> If we do a merge of unrelated trees, and BASE has a TREE_ROOT for a
>> root, and OTHER has UNIQUE_ROOT-asdf for a root, and THIS has TREE_ROOT
>> for a root, the merge will attempt to delete TREE_ROOT.  It won't
>> succeed of course, but it will produce a conflict.  Preventing that
>> conflict would require special-casing.
> 
> Well, I think there is a general-special-case for handling directory
> nodes that is different to handling file nodes, for merge. Which we
> haven't implemented yet. The point of this general-special case is to
> make merging between trees with different directory ids nicer - merging
> the contents rather than forcibly keeping them separate.

I don't think we can conclude anything about the contents of
directories based on their filenames, any more than we could with files.
Some directory names are common.  Others are rare.  Some appear
multiple times in a source tree.

Say we are merging the shelf plugin into bzrtools.  Both contain 'test'
directories.

Say bzrtools is
.
./test

and shelf is also
.
./test

The desired output is
.
./test
./shelf
./shelf/test

By your proposal, we would wind up with the shelf tests intermixed with
the bzrtools tests.  We would also be unable to fix the situation,
because one of the 'test' file ids would be lost.
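
As a rough sketch of the two outcomes (again a toy model, with made-up
file ids and helper names, not bzrlib's merge code): trees are dicts of
file id to path, merge_by_id() keeps every id and places the other tree
under a prefix, and merge_by_path() unifies entries that share a path.

```python
# Toy model (an assumption, not bzrlib): a tree is {file_id: path}.
bzrtools = {"bzrtools-root": ".", "bzrtools-test": "./test"}
shelf = {"shelf-root": ".", "shelf-test": "./test"}


def merge_by_id(this, other, other_prefix):
    """Keep every file id; re-parent OTHER's entries under other_prefix."""
    result = dict(this)
    for file_id, path in other.items():
        if path == ".":
            result[file_id] = other_prefix
        else:
            # "./test" becomes other_prefix + "/test"
            result[file_id] = other_prefix + path[1:]
    return result


def merge_by_path(this, other):
    """Unify entries that share a path; the first id seen for a path wins."""
    by_path = {}
    for tree in (this, other):
        for file_id, path in tree.items():
            by_path.setdefault(path, file_id)
    return {file_id: path for path, file_id in by_path.items()}


# Id-based: both 'test' directories survive, so shelf can live in ./shelf.
print(sorted(merge_by_id(bzrtools, shelf, "./shelf").values()))
# → ['.', './shelf', './shelf/test', './test']

# Path-based: the shelf ids are gone, so the trees can't be separated again.
print(sorted(merge_by_path(bzrtools, shelf)))
# → ['bzrtools-root', 'bzrtools-test']
```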

> I dont see the
> root node needing root-specific-special casing if this is done.

The handling you describe only handles duplicate adds.  But the merge
scenario I describe has a spurious tree-root deletion, and this doesn't
handle that.
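
Here is a minimal sketch of that scenario, looking only at which file ids
are present in each tree.  three_way_ids() and the 'this-file' and
'other-file' ids are hypothetical; a real merge does far more than this.

```python
# Toy model (an assumption, not bzrlib): each tree is a set of file ids.
def three_way_ids(base, this, other):
    """Return (to_add, to_delete) that applying OTHER's changes to THIS implies."""
    # Ids OTHER introduced relative to BASE that THIS doesn't already have.
    to_add = sorted(other - base - this)
    # Ids OTHER dropped relative to BASE that THIS still has.
    to_delete = sorted((base - other) & this)
    return to_add, to_delete


this = {"TREE_ROOT", "this-file"}
other = {"UNIQUE_ROOT-asdf", "other-file"}

# BASE is an empty tree that still carries a TREE_ROOT root.
print(three_way_ids({"TREE_ROOT"}, this, other))
# → (['UNIQUE_ROOT-asdf', 'other-file'], ['TREE_ROOT'])

# BASE is a rootless empty tree: nothing looks deleted.
print(three_way_ids(set(), this, other))
# → (['UNIQUE_ROOT-asdf', 'other-file'], [])
```

Handling duplicate adds deals with the first element of each result.  It
does nothing about the TREE_ROOT deletion in the second element.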

>> So it seems to me that having a root id for the empty tree doesn't fit
>> our model very well, and doesn't provide much convenience (especially
>> once EmptyTree is deprecated).
> 
> Well, an 'empty tree' will still remain, but it will just be accessed
> via a repository request always.

But that's not very convenient.  It's much easier to just do Inventory().

> Anyhow, AIUI you are saying 'when you deleted the root node from the
> empty tree in your tree-roots branch, a lot of stuff had to be
> changed.'. To me that's a sign that we might be better off not deleting
> the root node at all, rather having it change value at the first commit.

No, I don't agree with that assessment.  Most of the code that
instantiated EmptyTree was test code.  Most of the code that I had to
change would also have to be changed if the empty tree had a root.

In any case, nested-trees is trying to fix some flaws in the original
design.  It's not surprising that there would be some pain in doing
that.

> 
> Well, I want it for consistency, to avoid special cases, and to allow
> the tree interface to be understandable.

I don't believe you will achieve that.  I think this will produce more
special cases, because we'll have to deal with a delete/add pair,
instead of just an add.

> For me, changing the id of the root is a lot cleaner than adding one,
> but thats because I think a tree needs a root to be a tree. (In fact,
> having Tree and the root be separate objects does not make all that much
> sense to me, but thats a different discussion).

I don't think I understand this concept of having a root be a tree.  If
the tree == the root, then two different trees must have different roots.

>> I'm just concerned that Robert's request is just pushing the bugs
>> further down. Because once we have real roots, then even an empty
>> working tree won't match EmptyTree.
> 
> I dont see why it wont match :).

Because TREE_ROOT != MY_UNIQUE_TREE_ROOT-asdflkjh

> In fact, my tree interface tests will
> assure that it is. I think that an empty working tree *may* be different
> to an empty tree, once its had a root id assigned to it, but only after
> that.

But root ids are assigned when working trees are initialized.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFExCbU0F+nu1YWqI0RAvP2AJsHcWsJ9KVgn0yrZCHWees9/7FT0QCdGKix
cUvOGDHoZxCexw8as9e+/wc=
=ji82
-----END PGP SIGNATURE-----