[MERGE][bug #6700] diff on branches without working trees

Aaron Bentley aaron.bentley at utoronto.ca
Tue Nov 20 02:19:34 GMT 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ian Clatworthy wrote:
> It also lets you compare two tree-less branches and apply a filter of
> specific files to that comparison if you wish.

bb:resubmit

I'm not really comfortable with the way you're mixing branches and files
in the same argument set.  To me, it seems like if we don't name the
file as part of the branch (e.g.
http://bazaar-vcs.org/bzr/bzr.dev/Makefile), we should have some way of
delimiting files from branches.  Ideally, whatever syntax we use here
should also apply to other commands like status, merge and revert, and
this one doesn't.

One of the reasons why I've never tackled this is that diff is a window
into everything that's wrong with our argument handling:
1. No generic way of specifying multiple branches, and associating
multiple files with each branch

2. Trees and branches are too disconnected; Time and time again, I find
myself needing a branch-and-tree pair.  This was especially true with
Nested Trees.

I just find all the issues a bit overwhelming, so I never start.  You'll
see what I mean as you get into the review.  Don't take it personally--
it's just really hard to support all the functionality we want.

> Some notes for reviewers:
> 
> 1. I *think* the extra_trees handling is still correct and all tests
>    pass. Semantically though, it's not obvious to me whether new_tree
>    ought to be compared to the first or second tree. Does it matter?

What is the "second tree"?

Do you mean whether tree1 should be compared to new_tree or old_tree?

In its most general form, it's:

if tree1 not in (old_tree, new_tree):
    extra_trees.append(tree1)

> 2. I haven't deprecated diff.diff_cmd_helper(). Should I?

It looks like you should.  I doubt it has any other callers.

> 3. I was a little torn between putting "DWIW UI logic" in diff.py
>    instead of builtins.py. In the end, I followed the previous
>    approach (diff_cmd_helper is in diff.py) and kept builtins.py simple.

Keeping DWIW logic separate from the command itself makes it more
testable and possibly more reusable.

+def _get_trees_to_diff(path_list, revision_specs):
+    """Get the trees and specific files to diff given a list of paths.
+
+    The diff command line interface has a degree of DWIW in that the
+    first argument or two may either be paths to branches or paths
+    (directories or files) inside a working tree. This method
+    works out the trees to be diff'ed and the files of interest within
+    those trees. Basically, if the first path is not a file within
+    the current working tree, then the first 2 arguments are assumed to
+    be branch URLs and remaining arguments are the specific files.

That means you can't specify one branch and several files:
$ bzr diff bzr.dev/ NEWS Makefile
bzr: ERROR: Not a branch: "/home/abentley/bzr/NEWS/".

That seems like a pretty serious limitation.

+    :returns:
+        a tuple of (old_tree, new_tree, specific_files, extra_trees) where
+        extra_trees is a sequence of additional trees to search in for
+        file-ids.

That's a lot of return values.  It's almost getting to the stage where I
start to think this should be its own object

+    if path_list is None or len(path_list) == 0:
+        # No arguments - get the current working tree
+        try:
+            tree1 = WorkingTree.open_containing(u'.')[0]

Do we really need a separate clause for "no path supplied"?  I would
think we could pretend the user supplied the tree root, and use the
other code path instead.

+            old_tree = _get_tree_to_diff(old_revision_spec, tree1)
+            new_tree = _get_tree_to_diff(new_revision_spec, tree1,
+                basis_is_default=False)
+            specific_files = None

I'm not sure it makes sense to demand a working tree here.  If there's a
branch, isn't that enough?

+        except errors.NotBranchError:

I think it would be clearer to catch NoWorkingTree here.


+            # If both revision specs include a branch, we can
+            # diff them without needing a local working tree

It seems weird trying to open the working tree first, and then deciding
you don't need it anyway.

+    else:
+        # First argument could be a branch, a working tree or a file in
a WT
+        location1 = path_list.pop(0)

^^^ mutating the input seems surprising to me.  How about using slicing
instead?

+        tree1, branch1, relpath1 = \
+            bzrdir.BzrDir.open_containing_tree_or_branch(location1)

I find tree1 versus old_tree very confusing.  Perhaps call it working_tree?

+        if tree1 is None:
+            old_tree, new_tree, specific_files = \
+                _details_for_diff_given_branch(branch1, path_list,
+                old_revision_spec, new_revision_spec)

I wouldn't think you need a special code path for the branch case.  Why
not just get the basis tree?  In particular, paths that refer to files
in branches (http://bazaar-vcs.org/bzr/bzr.dev/Makefile) are supposed to
work.  But they can't if you don't pass in relpath.

+        elif relpath1 == '':
+            old_tree, new_tree, specific_files = \
+                _details_for_diff_given_tree(tree1, path_list,
+                old_revision_spec, new_revision_spec)
+        else:
+            # File or files in working tree given
+            old_tree = _get_tree_to_diff(old_revision_spec, tree1)
+            new_tree = _get_tree_to_diff(new_revision_spec, tree1,
+                basis_is_default=False)
+            specific_files = _files_for_diff_given_file(tree1, relpath1,
+                path_list)
+
+    extra_trees = None
+    if new_tree != tree1 and tree1 is not None:

+def _get_tree_to_diff(spec, tree, branch=None, basis_is_default=True):
+    if branch is None and tree is not None:
+        branch = tree.branch

^^^ Why does branch default to None, but not tree?

+    if branch:

^^^ Explicit is better than implicit.  Are you checking whether it's
False, None, [], (,) or ''?

I suspect you're testing for None.  In which case, this section is
redundant:

+        revision = spec.in_store(branch)
+    else:
+        revision = spec.in_store(None)

Because if branch is None, spec.in_store(branch) is the same as
spec.in_store(None).

+def _details_for_diff_given_branch(branch1, other_paths, old_revision_spec,
+        new_revision_spec):
+    """Get the trees and files for diff given a branch with no WT first.
+
+    This method parses the remaining arguments on a diff command line given
+    that the first argument has been found to be a branch. In this case,
+    the second argument, if any, must be a branch as well and
+    remaining paths, if any, are the specific files.
+    """
+    # In the absence of revision specs, the contents to compare
+    # depends on the type and count of the things being compared.
+    # We assume content ought to be considered for working trees
+    # and the basis_tree ought to be considered for branches with no WT.
+    old_tree = _get_tree_to_diff(old_revision_spec, None, branch1)
+    if other_paths:
+        location2 = other_paths.pop(0)

I think avoiding mutating the input would be preferable.


+def _details_for_diff_given_tree(tree1, other_paths, old_revision_spec,
+        new_revision_spec):
+    """Get the trees and files for diff given a working tree first.
+
+    This method parses the remaining arguments on a diff command line given
+    that the first argument has been found to be a working tree. In
this case,
+    the second argument, if any, must be a branch or WT as well and
+    remaining paths, if any, are the specific files.
+    """
+    # In the absence of revision specs, the contents to compare
+    # depends on the type and count of the things being compared:
+    #
+    # * If just one working tree was given, we compare its content
+    #   to its basis.
+    # * Otherwise, we assume content ought to be considered for working
trees
+    #   and the basis_tree ought to be considered for branches with no WT.
+    if other_paths:
+        old_tree = _get_tree_to_diff(old_revision_spec, tree1,
+            basis_is_default=False)
+        location2 = other_paths.pop(0)

It would be nice to avoid mutating the input.

The body of this function is virtually identical to
_details_for_diff_given_branch.  Do we really need both?

+def _files_for_diff_given_file(tree, first_relpath, other_paths):
+    """Get the specific files for diff given a path in a WT first.
+
+    This method parses the remaining arguments on a diff command line given
+    that the first argument has been found to be a path in a working tree.
+    In this case, all arguments must be paths in that working tree.
+    """

As I noted above, I think this functionality's scope is too limited.  It
should also support paths within branches that have no files.

It would also be nice to support diffing arbitrary files within multiple
trees.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHQkQ20F+nu1YWqI0RArdWAJwIqIXL7tarpl6PC/3Rj8KYYeTsCACfYGLW
EwJTpr7IMBtFDtQxlpoDbWQ=
=k7jS
-----END PGP SIGNATURE-----



More information about the bazaar mailing list