[MERGE] Nested trees: CompositeTree

Aaron Bentley aaron at aaronbentley.com
Thu Apr 23 20:31:49 BST 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Thanks for your review.  Sorry I haven't been able to respond sooner--
Launchpad needed me.

Ian Clatworthy wrote:
> Aaron Bentley wrote:
>> What if I just raised an AssertionError if self._lock_type was not None?
> 
> That would be good.

Done

> 
>>>> +            if path == '':
>>>> +                raise AssertionError('Subpath cannot be empty.')
>>> You want subpath in this test, not path.
>> Thanks.
> 
> Still needs tweaking btw.

Done.

>>> WorkingTree.get_file_text() now has a filtered=True optional parameter. I
>>> suspect you need it as well for content filtering to work as expected
>>> on nested trees.
>> It seems we don't have any operations that do content filtering yet.
> 
> We have eol support and keyword templating, the latter via a plugin.

What I mean is, no tests fail without this parameter, so CompositeTrees
don't seem to be used by operations that would supply filtered=True.

> The fix is simple: added filtered=True to the parameter list and
> propagate the setting to the tree.get_file_text() call. Without
> this, I promise you that eol conversion and keywords will not work
> in nested trees.

I assume that there is adequate testing of text filtering.  So since no
tests fail, even when I force maybe_composite to always emit composite
trees, I believe there are no codepaths that could be using composite
trees and filtering.

Some other issues:
 - Tree.get_file_text does not take a 'filtered' parameter, so this
   requirement is not part of the Tree interface
 - RevisionTree.get_file_text does not take a 'filtered' parameter, so
   propagating 'filtered' to the actual tree (which may be a
   RevisionTree) is problematic.
 - There is no failing test, so there is no insurance that the change is
   correct or that it won't regress.

Since I can't honour this suggestion, I'm treating this review as a
resubmit.

>> +    def get_tree_and_treepath(self, file_id, skip_references=False):
>> +        """Get a tree containing file_id and path relative to the root tree.
>> +
>> +        The path is the path to the subtree, not the file in the subtree.
>> +
>> +        :skip_references: if True, only results where the file_id is not
>> +            a tree reference will be supplied.
>> +        """
>> +        try:
>> +            tree, branch, path = self.id_tree_branch[file_id]
>> +        except KeyError:
>> +            for path, tree in self.all_trees().iteritems():
>> +                if file_id in tree.inventory:
>> +                    if skip_references:
>> +                        try:
>> +                            if tree.kind(file_id) == 'tree-reference':
>> +                                continue
>> +                        except errors.NoSuchFile:
>> +                            pass
>> +                    return tree, path
>> +            else:
>> +                raise errors.NoSuchId(self, file_id)
>> +        else:
>> +            return tree, path
> 
> I think you want to be caching the results in id_tree_branch after finding
> them, i.e. just before the return in the except clause. Otherwise, when does
> id_tree_branch ever get populated?

Since id_tree_branch is supposed to contain a branch, this is not
possible.  The way branches are handled needs re-working anyhow, so I've
taken the cache out for now.

>> +    def iter_pathinfo_by_tree(self, paths):
>> +        """Iterate through the path information for provided paths.
>> +
>> +        iterates through (tree, tree_path, relpaths) tuples, where
> 
> Very minor - s/iterates/Iterates/ at the start of sentence.

Fixed.

> 
>> +    def all_trees(self):
>> +        """Return a dict of composite_path to tree for all subtrees."""
>> +        if not self._all_trees_scanned:
>> +            self._scan_subtrees()
>> +        self._all_trees_scanned = True
> 
> While not strictly necessary, better to put that last statement inside the
> if block.

Okay.

>> +    def __init__(self, root_tree, root_branch):
>> +        self._nested_trees = NestedTrees(root_tree, root_branch)
>> +        self.inventory = _CompositeInventory(self._nested_trees)
> 
> I think it might be nice to make root_branch default to root_tree.branch
> here. We could also do the same in NestedTree.__init__().

That would break when RevisionTrees are supplied, so I don't think it
would be nice.

> 
>> +            pairs = []
>> +
>> +            changes = []
>> +            for num, (from_path, from_relpath, entry, from_tree) in\
>> +                enumerate(paths_info[:-1]):
>> +                from_inv = from_tree.inventory
> 
> I'd prefer the blank line before pairs = [], not after it.

Done.

> from_inv is never used so the last line can be deleted.

Done.

> 
>> +            if not after:
>> +                trans_id_tree_path = {}
> 
> I'd like to see everything inside this if statement moved into a separate
> method. move() is too large to be easily understandable and that looks as
> good a place as any to partition it.

Done

>> +    def get_file_text(self, file_id, path=None):
>> +        """See Tree.get_file_text."""
>> +        tree, relpath = self._find_tree_path(file_id, path)
>> +        return tree.get_file_text(file_id, relpath)
> 
> As mentioned above, still need to define and propagate a parameter called
> filtered.

Not done, as explained above.

>> +    def get_file_sha1(self, file_id, path=None, stat_value=None):
>> +        """See Tree.get_file_sha1."""
>> +        return self._nested_trees.get_tree(file_id).get_file_sha1(
>> +            file_id, None, stat_value)
>> +
>> +    def get_file_mtime(self, file_id, path=None):
>> +        """See Tree.get_file_mtime."""
>> +        return self._nested_trees.get_tree(file_id).get_file_mtime(
>> +            file_id, None)
> 
> Wouldn't it be better to use find_tree_path() in these two routines as
> you do for is_executable?

Done.

> 
>> +        for old_path, new_path, file_id, new_entry in changes:
>> +            if new_path is not None:
>> +                if new_entry.kind == 'tree-reference':
>> +                    raise ValueError('Cannot introduce a tree-reference in'
>> +                         ' CompositeTree.apply_inventory_delta.')
> 
> 
> IIUIC, this test will also stop changing a tree-reference, not just
> introducing one.

Tree references may not appear for any reason in the tree deltas
accepted by CompositeTree.

> I suspect we want to allow changes. If so, you need
> to move this test. If not, we need to reword the error to say
> "Cannot introduce or change ...".

Done.

> 
>> +                new_root_id = new_tree.get_root_id()
> 
> Never used and can be deleted.

Done.

> 
>> === added file 'bzrlib/tests/test_composite_tree.py'
> 
>> +class TestNestedTrees(tests.TestCaseWithTransport):
>> +
>> +    def make_branch_and_tree(self, path, format='dirstate-with-subtree'):
> 
> Any reason not to use a more recent format here, e.g. pack-0.92-subtree?

No.

> 
>> +            fullpath, relpath, found_entry, found_tree = \
>> +                nested.paths_info(['sub/subb'])[0]
>> +        finally:
>> +            nested.unlock()
>> +        self.assertIs(None, found_entry)
>> +        self.assertIsNot(None, found_tree)
> 
> I'd like a one-line comment above this bit explaining that you're now
> testing a missing path.

Done.  But it's actually an unversioned, non-existent path.  Missing
would be a versioned, non-existent path.

> When I first looked at it, I thought it was
> a redundant copy of the previous line a few lines above it. Also,
> I'd prefer the 2 asserts to come directly after the call to path_info,
> not after the unlock.

Done.

> 
>> +    def test_move_after(self):
>> +        tree = self.make_branch_and_tree('.', format='dirstate-with-subtree')
>> +        self.build_tree(['file'])
>> +        tree.add(['file'], ['file-id'])
>> +        os.unlink('file')
>> +        nested = composite_tree.NestedTrees(tree, tree.branch)
>> +        nested.lock_read()
>> +        try:
>> +            full_path, relpath, entry, found_tree = \
>> +                nested.paths_info('file')[0]
>> +        finally:
>> +            nested.unlock()
> 
> This test looks like junk that ought to go

Done.

> While I'm at it, I'd like a follow-on patch please covering some,
> but not all, of the public methods in NestedTree that don't have
> unit tests yet. This patch is already big enough to make them tweaks!
> To be explicit, I'd like to see these things tested:
> 
> * get_tree_and_treepath() - the tree_references param in particular
> * iter_pathinfo_by_tree() - the ordering issue in particular
> * get_tree_branch() - the locking logic in particular.

Will do.


> 
>> +    def test_iter_entries_by_dir(self):
> 
>> +    def test_iter_entries_by_dir_follow_subtree(self):
> 
>> +    def test_inventory(self):
> 
>> +    def test_comparison_data(self):
> 
> All of these tests can be simplified by using get_locked_composite().

Done.

> 
>> +    def test_apply_inventory_delta(self):
> 
> This is a large test and I suspect reviewers like Vincent would
> ask for it to be broken up into smaller tests. I'm less of a
> TDD guru and I'm happy for one test to cover multiple things and
> for it to leverage evolving state as you're doing. However, I do
> like to see whitespace and one-line comments used to break it up
> logically with a brief explanation about what each batch of
> statements is testing. Please.

Done.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAknwwiUACgkQ0F+nu1YWqI1FwQCdGAPzEgR2CpElC64YCM2gvj3k
Ho8AmgJYK0te2AttQAbK2LM+rdMagV/P
=Fdv8
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: composite4.patch
Type: text/x-patch
Size: 172717 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20090423/3a95a792/attachment-0001.bin 


More information about the bazaar mailing list