[MERGE] Fix #175520 by implementing a --deep option for log

Tue Jan 6 05:54:54 GMT 2009

Vincent Ladeuil wrote:

> This patch provides a --deep option that allows to search history
> with a path instead of a file-id.

I'll make my official vote 'comment' because doing so will
hopefully get some others to provide their thoughts if they
haven't already. If no-one else comments soon, treat my vote
as (borderline) resubmit.

bb: comment

To start with, I'm not that keen on --deep as the option name,
though I can't say I have something necessarily better yet. :-(
To me, this option is all about name-matching, as opposed to
deep searching. In other words, it explicitly logs revisions
that add/delete/change a file matching that path and *not*
previous names of that logical file. So I'd prefer an option
name that emphasized that aspect. (In comparison, --deep
*could* one day be used to ask for deep searching of the
path -> file-id mapping, i.e. look in inventories until
you find the file-id and then log against it.)

You're solving two problems at once here but perhaps they
really ought to have separate solutions?

* --name (say) for name-matching
* --old to search for the most recent file-id with that path.

> This patch prepares some parts to be able to address other log
> related bugs like the ability to specify multiple files or
> requesting the log for a directory.

Cool.

> Since the revisions displayed are different when --deep is used,
> I'd like feedback on the choices made (in particular, unlike the
> file-id based implementation, the revisions toward mainline are
> *not* displayed if they don't reference the file).

I fully expect someone to raise that inconsistency as a bug and,
from a UI perspective, I don't think we can justify it. If it's
important to know the merge revisions when the file exists, it
remains important later even if the file has since been deleted.

I'm hesitant to block this patch on this because I hate it when
reviewers make perfect the enemy of good. Your patch is a step
forward but differences like this are red flags about the
fragility of the implementation.

> Also, there may be concerns about compatibility as '-v' now
> respects the FILE argument in that it doesn't display the *whole*
> inventory anymore. Since it was almost unusable with large
> working trees, I hope the NEWS entry explaining the bug fix is
> enough in that regard.

In this case, I think you've made an improvement. We ought to
list it under CHANGES inside NEWS but it's otherwise fine IMO.

Some more explicit feedback below ...

> +    * ``bzr log`` now accepts a --deep option to look at files not present
> +      anymore in the working tree. Be aware that the actual implementation is

anymore -> any more

> +    * ``bzr log -v FILE`` don't display the whole inventory anymore. Only the
> +      line about FILE is displayed.
> +      (Vincent Ladeuil)

don't -> doesn't

> +            Option('deep',
> +                   help='Search for file in all revisions (slower).'),

I'd like to see the help for the log command expanded quite a bit
as part of this patch. I think it needs to explain file-id-based vs
file-path-based logging and include examples of both.

I wonder if we ought to suggest --deep (or its better name if we
can come up with one) when a path isn't found?

> +                for item in files:
> +                    path, file_id, kind = item[:3]
> +                    if (filter is not None and not filter(path, file_id)):

The () around the condition here isn't needed.

>          for (rev_id, revno, merge_depth), rev, delta in revs:
> +            if _path_filter is not None:
> +                try:
> +                    if _filter_delta(delta, _path_filter) == 0:
> +                        continue
> +                finally:
> +                    if not generate_delta:
> +                        # We don't need the delta anymore
> +                        delta = None
>              lr = LogRevision(rev, revno, merge_depth, delta,
>                               rev_tag_dict.get(rev_id))

What does try/finally buy here? Can't you just put the two
if blocks after each other with exactly the same effect?
In fact, it really feels like the checking of path_filter
needs to go higher in the code, as a parameter to either
calculate_view_revisions() or make_log_rev_iterator().

> +def _filter_delta(delta, filter):
> +    """Returns the number of times filter returns True for the given delta."""
> +    matches = 0
> +    for l in (delta.added, delta.removed, delta.kind_changed, delta.modified):
> +        for item in l:
> +            if filter(item[0], item[1]):
> +                matches += 1
> +    for item in delta.renamed:
> +        if filter(item[0], item[2]) or filter(item[1], item[2]):
> +            matches += 1
> +    return matches

I'm guessing you're looking to reuse this function later? Even so,
you may well be better off just returning match yes/no as opposed
to counting matches, because the former is easier to optimise. For
example, you could return as soon as any match is found and search
modifications before the other lists, say.

> +    # the file. Note that this function use the fact that we create a new text
> +    # only when a file is created or modified. We don't need to access every

use -> uses

> +            # This needs to be logged, along with the extra revisions It's
> +            # important to understand that the test above identify all the

identify -> identifies

> +    def test_log_file_not_deep_short(self):
> +        log = self.run_bzr(['log', '-v', '--short',  'f2'])[0]
> +        self.assertNotContainsRe(log, 'A  f1\n')
> +        self.assertNotContainsRe(log, '  f1\n')
> +        self.assertContainsRe(log,  '  f2\n')
> +
> +    def test_log_file_not_deep_short(self):
> +        log = self.run_bzr(['log', '-v', '--deep', '--short',  'f2'])[0]
> +        # revnos 1 and 3 appear and mention adding f2
> +        self.assertContainsRe(log, '(?sm)^ +3 joe.*A  f2\n.*^ +1 joe.*A  f2\n')
> +        # revno 2 appears and mentions deleting f2
> +        self.assertNotContainsRe(log, '(?sm)^ +2 joe.*D f2\n')

Oops - two methods with the same name. Don't you just love how
that's a "feature" in Python? :-) I think the second one needs
to drop the "not_" bit. Also the last comment doesn't match the
code does it? Didn't you mean assertContainsRe there???

> +    def test_log_old_file_not_deep(self):
> +        # Without deep, we can't query an old file
> +        log = self.run_bzr(['log', '-v', 'f1'], retcode=3)[0]

As mentioned previously, perhaps we ought to suggest --deep
to the user in this case?

> +    def test_log_old_file_deep(self):
> +        log = self.run_bzr(['log', '-v', '--short', '--deep', 'f1'])[0]
> +        # revnos 1 and 2 appear and mention adding and deleting f1
> +        self.assertContainsRe(log, '(?sm)^ +2 joe.*D  f1\n.*^ +1 joe.*A  f1\n')
> +        self.assertNotContainsRe(log, '(?sm)^ +3 joe')

Hmmm, this is testing exactly the same parameters as
"test_log_file_deep_short". Maybe this test ought to
check the output when -v is not given instead?

In summary, thanks for working on this - getting a
rich-featured, bug-free, high-performing log command
is quite a challenge for us given our UI desires.
I like most of the refactoring you did but feel this
patch isn't ready for merging just yet, at least not
without a bit more discussion re the UI and implementation
alternatives.

Ian C.