[PATCH] Changeset plugin passes all tests

John Arbash Meinel john at arbash-meinel.com
Fri Jul 15 21:53:37 BST 2005

Aaron Bentley wrote:
> Hi all,
> I've got the changeset plugin passing all tests.  The fixes involved

Thanks for doing this.
> 1. Accomodating explicit root IDs better in MockTree and elsewhere
> 2. Fleshing out MockTree to make it a more functional implementation of
> both the Tree and Inventory interfaces
> 2. Fixing the way the text of a rename is decoded (we were getting
> '/dir' instead of 'sub/dir'.  The leading '/' caused infinite recursion.
> 3. Fixing calls to note_rename (it doesn't take a text_id)

Yeah, I hadn't tracked that down. At one point I was going to change the
interfaces, and then I changed it back.

> 4. Causing ChangesetTree to return None when asked about the text_id of
> a directory or the root directory*


> 5. Made ChangesetTree._get_inventory simpler and more robust
> On ChangesetTree._get_inventory():
> In constructing and Inventory, it's necessary to always add parents
> before children.  The previous code tried to add parents when necessary,
> but it couldn't cover all cases that way.  Now it gets a list of all ids
> in the new tree, sorted by path length.  Since shorter paths come first,
> parents are always added before children.  Simple and hard to get wrong.

Sounds good to me.

> On ChangesetTree.inventory:
> I'm not sure why ChangesetTree derives from Tree.  Since we're using a
> dynamic language, we don't have to use inheritance to implement that
> interface.

Just for commonality. Certainly there is no explicit need for it.
> I intended to implement both Tree and Inventory in terms of
> ChangesetTree, as I did with MockTree.  That would mean that
> ChangesetTree would always supply consistent data.

I have no problems with that. I did it mostly for *my* ease in
implementation. The problem is that an Inventory is mostly a small
container for a set of InventoryEntries, which have their own
parent->child relationships.

At one point, I was actually thinking to copy the base_tree inventory,
and then keep it up-to-date as you note_* on it. To me, it would be nice
if every call to ChangesetTree.inventory did not have to completely
regenerate the tree each time.

> The current inventory attribute is essentially a cache of ChangesetTree
> data.  However, this cache is not invalidated when the tree-construction
> methods are called (e.g. ChangesetTree.note_rename()).  This means that
> it is possible for ChangesetTree.path2id() and
> ChangesetTree.inventory.path2id() to produce different answers, and I
> believe this is a bad thing, though the current client code is well-behaved.

It is *sort-of* a cache. Because it is a property, every time it is
accessed, the function is run, and it re-creates the inventory. So it
really is invalidated, as long as you don't copy the inventory, and then
hang on to it.

> While it would be slower to use ChangesetTree as its own inventory, I
> don't know how much slower it would be.  The difference might well be
> negligible.

I'm sure it out-performs completely rebuilding the inventory every time
it is requested. :)

> Aaron
> *The root directory has its own file 'kind', but it probably shouldn't,
> because that means that if you turn it into a subdirectory, its kind
> will change.

I would tend to agree. It is just another directory, which happens to be
at the root.
The only reason to special case it, is if we wanted to have something
like "bzr diff" report nothing on a new tree, whereas I believe if we go
with "EmptyTree has root = None" then "bzr init; bzr diff" would naively
show that there was a new directory added (which in one frame of mind is
perfectly valid).

By the way, I think it would be good for a branch to cache it's root id.
Because there are several places where we call "branch.get_root_id()"
which means that it has to completely parse the inventory XML, just to
get the root_id from the very beginning.

If you want, we can cache it, and then make sure it is either updated or
invalidated on any "_write_inventory" access. It doesn't prevent having
2 branches open on the same path, and having one update the working
tree, and the second isn't aware of it.
I can't see of any way to handle that other than either singletons or
constantly re-reading all branch attributes, and singletons were frowned
upon in the past.


More information about the bazaar mailing list