[MERGE] More deprecations...

Thu Mar 27 05:45:16 GMT 2008

Robert Collins wrote:
>     * ``Repository.get_revision_graph`` is deprecated, with no replacement
>       method. The method was size(history) and not desirable. (Robert Collins)
> 
>     * ``revision.revision_graph`` is deprecated, with no replacement function.
>       The function was size(history) and not desirable. (Robert Collins)
> 
>     * ``VersionedFile.get_graph`` is deprecated, with no replacement method.
>       The method was size(history) and not desirable. (Robert Collins)

bb:resubmit

Or perhaps it should be "bb:justify"...

I'm uncomfortable with all the duplicated code you've added to replace the calls
to deprecated methods.  It feels like a step backward to replace two simple
lines with ten.  I realise the point to some extent is to make it harder to
accidentally do work proportional to size(history), but we still have several
places in the code that do, and making them uglier without otherwise improving
them seems like a backwards step to me.  One of the comments hints that the
ugliness is temporary, at least in one case... if that's so, then I think I can
live with this, but comments at each site saying so would be good.

Basically, I'm sure you have a good reason for wanting to duplicate this code
all over the place, but you're going to have to explain it to me :)

Further comments inline, mostly repeats of the above concern.

> === modified file 'bzrlib/branch.py'
[...]
> -        revision_graph = self.repository.get_revision_graph(last_revision)
> +        graph = self.repository.get_graph()
> +        search = graph._make_breadth_first_searcher([last_revision])

Hmm, so for places that really do need to replace get_revision_graph, they have
to use _private methods on the graph object?  That doesn't feel right.  Perhaps
it's time the searcher got promoted to a public interface?  It doesn't seem like
something that should be private to bzrlib, especially if it's the best
alternative to the whole history operations you've deprecated.

> +        transitive_ids = set()
> +        map(transitive_ids.update, list(search))

I'd rather this was written as:

    for search_result in search:
        transitive_ids.update(search_result)

The reason why I dislike your version is that map's purpose is to construct a
list, but here you're using it for its side-effect of calling a function for
every item in a list, and then ignoring the return value.  If there's
significant performance win by using map then leaving it as is fine, but in that
case there should be a comment justifying doing things an unusual way.

[...]
> === modified file 'bzrlib/knit.py'
[...]
> -        graph = self.source.get_graph(version_ids)
> -        order = topo_sort(graph.items())
> +        # --- the below is factorable out with VersionedFile.join, but wait for
> +        # VersionedFiles, it may all be simpler then.
> +        graph = Graph(self.source)
> +        search = graph._make_breadth_first_searcher(version_ids)

Here's another use of _make_breadth_first_searcher.  Just highlighting this
because it suggests that it should public, as I mention above.

> === modified file 'bzrlib/log.py'
[...]
> -    rev_graph = branch.repository.get_revision_graph(mainline_revisions[-1])
> -    sorted_rev_list = topo_sort(rev_graph)
> +    graph = branch.repository.get_graph()
> +    # This asks for all mainline revisions, which means we only have to spider
> +    # sideways, rather than depth history. That said, its still size-of-history
> +    # and should be addressed.
> +    search = graph._make_breadth_first_searcher(mainline_revisions)
> +    transitive_ids = set()
> +    map(transitive_ids.update, list(search))
> +    parent_map = graph.get_parent_map(transitive_ids)
> +    sorted_rev_list = topo_sort(parent_map.items())

This is the third time you've replaced a get_graph call with this idiom
(a "transitive_ids" set updated with every search result from a breadth first
search).  I'm a bit uncomfortable about that.  I guess you don't want to refactor
out a common function for this, because deprecating whole-of-history operations
is the point of this change.  But if they're just going to be done anyway in a
more complicated way, it seems like a bit of a step backwards.

> +    graph = branch.repository.get_graph()
> +    # This asks for all mainline revisions, which means we only have to spider
> +    # sideways, rather than depth history. That said, its still size-of-history
> +    # and should be addressed.
> +    search = graph._make_breadth_first_searcher(mainline_revs)
> +    transitive_ids = set()
> +    map(transitive_ids.update, list(search))
> +    parent_map = graph.get_parent_map(transitive_ids)

And same again...

> +    # filter out ghosts; merge_sort errors on ghosts.
> +    rev_graph = {}
> +    # Filter ghosts, and null:
> +    if NULL_REVISION in parent_map:
> +        del parent_map[NULL_REVISION]
> +    for key, parents in parent_map.iteritems():
> +        rev_graph[key] = tuple(parent for parent in parents if parent in
> +            parent_map)

You've added this exact code elsewhere too.  The duplication definitely smells.

> === modified file 'bzrlib/smart/repository.py'
[...]
> -            revision_graph = repository.get_revision_graph(revision_id)
> -        except errors.NoSuchRevision:
> +        graph = repository.get_graph()
> +        if revision_id:
> +            search_ids = [revision_id]
> +        else:
> +            search_ids = repository.all_revision_ids()
> +        search = graph._make_breadth_first_searcher(search_ids)
> +        transitive_ids = set()
> +        map(transitive_ids.update, list(search))
> +        parent_map = graph.get_parent_map(transitive_ids)
> +        revision_graph = {}
> +        if _mod_revision.NULL_REVISION in parent_map:
> +            del parent_map[_mod_revision.NULL_REVISION]
> +        for key, parents in parent_map.iteritems():
> +            revision_graph[key] = tuple(parent for parent in parents if
> +                parent in parent_map)
> +        if revision_id and revision_id not in revision_graph:

And another round of duplication.  I realise this is deprecated code, but that
shouldn't mean it has to be ugly.

> === modified file 'bzrlib/versionedfile.py'
[...]
> -        graph = self.source.get_graph(version_ids)
> -        order = tsort.topo_sort(graph.items())
> +        graph = Graph(self.source)
> +        search = graph._make_breadth_first_searcher(version_ids)
> +        transitive_ids = set()
> +        map(transitive_ids.update, list(search))
> +        parent_map = self.source.get_parent_map(transitive_ids)
> +        order = tsort.topo_sort(parent_map.items())

And for completeness sake, here's another instance of the same code.

-Andrew.