[MERGE, 165306] Use the per-file graph during commits of pack's
Robert Collins
robertc at robertcollins.net
Tue Nov 27 22:30:43 GMT 2007
On Tue, 2007-11-27 at 17:15 -0500, John Arbash Meinel wrote:
> John Arbash Meinel has voted tweak.
> Status is now: Conditionally approved
> Comment:
> === modified file 'bzrlib/index.py'
> ...
>
> + def get_parents(self, revision_ids):
> + """See StackedParentsProvider.get_parents.
> +
> + This implementation thunks the graph.Graph.get_parents api
> across to
> + GraphIndex.
> +
> + :param revision_ids: An iterable of graph keys for this graph.
>
> ^- This shouldn't be called revision_ids if it is actually graph keys.
>
> I found it hard to sort out how it was working when you were later on
> using (file_id, revision_id) tuples.
Theres an API problem though; the 'interface' uses revision_ids. I don't
know of any callers doing get_parents(revision_ids=xxx), but for
compatibility I wanted to have that work.
> Maybe call it:
>
> def get_parents(self, node_keys):
> ...
> :param node_keys: An iterable of graph keys for this graph.
> (eg. for a simple revision graph, this will be revision ids, for
> a
> CombinedGraph this will be (file_id, revision_id) tuples.)
>
>
> This also means that your later code:
> search_keys = set(revision_ids)
> search_keys.discard(NULL_REVISION)
> found_parents = {NULL_REVISION:[]}
>
> Is probably bogus. Because (file_id, NULL_REVISION) will not be found.
Actually, NULL_REVISION is used as a constant, - (NULL_REVISION,) in a
parents list, which leads to NULL_REVISION as a key, so no - it's not
bogus :). And we can't do ((NULL_REVISION, NULL_REVISION),) because the
graph code wants NULL_REVISION as the constant for 'end of graph.
> I also wonder about the logic of returning (NULL_REVISION,) as the
> parents
> of a (file_id, revision_id) pair.
It's part of the contract of graph.
> I also notice that "Combined_Graph.iter_keys says:"
> :return: An iterable of (index, key, reference_lists, value).
> There is no
> defined order for the result iteration - it will be in the
> most
> efficient order for the index.
> but you are looping on:
> for index, key, value, refs in self.iter_entries(search_keys):
>
> I'm guessing the comment for Combined_Graph.iter_entries is wrong.
I'll correct that. references_lists is indeed after value.
> Is there a way to reduce the duplication in __init__ for
> PackCommitBuilder
> and PackRootCommitBuilder? They both have the same __init__ and the same
> implementation of heads().
Yah, I didn't find a good one when I wrote it; and as you haven't
proposed one I think we can leave it for now.
> I don't see any direct tests for CommitBuilder.heads(). Which seems like
> they should be in
> bzrlib/tests/repository_implementations/test_commit_builder.py
> I only see heads tests for test_graph.py and test_inv.py
So, heads() is very heavily, indirectly, tested by the tests for
last-modified generation. I'm in two minds here about what to do.
One way would be to migrate those tests to tests for heads(), and a test
that heads() is called and used correctly. Another is to consider
heads() an implementation internal of the builder (and perhaps make it
_heads, though as subclasses need to implement it thats in our grey area
for how things are defined as public/private).
-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20071128/bc431ee7/attachment.pgp
More information about the bazaar
mailing list