[MERGE] Make push/pull/merge faster.
Robert Collins
robertc at robertcollins.net
Tue Oct 30 18:37:09 GMT 2007
On Tue, 2007-10-30 at 13:15 -0500, John Arbash Meinel wrote:
> You seem to set "supports_ghosts = True" for KnitRepo, but not for PackRepo. It
> may be inherited, but setting it explicitly would be good.
Hmm, I was deliberately not setting it for PackRepo.
> + if not find_ghosts and revision_id is not None:
> + graph = self.source.get_graph()
> + missing_revs = set()
> + searcher = graph._make_breadth_first_searcher([revision_id])
> + target_index = \
> + self.target._pack_collection.revision_index.combined_index
> + null_set = frozenset([_mod_revision.NULL_REVISION])
> + while True:
> + try:
> + next_revs = set(searcher.next())
> + except StopIteration:
> + break
> + next_revs.difference_update(null_set)
>
> ^- Why do you need to remove the null revisions? Just so you don't query the
> target index for them?
Yes. Searching for 'null:' when it will never be there is wasteful.
> + target_keys =list ((key,) for key in next_revs)
>
> ^- This has a PEP8 typo, but even more, it seems odd to use a list on a
> generator, than just doing a list comprehension
Bah. Debug code wins again (printing generators is not useful).
> target_keys = [(key,) for key in next_revs]
>
>
> + have_revs = frozenset(node[1][0] for node in
> target_index.iter_entries(target_keys))
>
> ^- This is an overly long line.
>
> have_revs = frozenset(node[1][0] for node
> in target_index.iter_entries(target_keys))
> or something along those lines. (we each seem to have a slightly different
> preference when wrapping)
Yah, missed that one - thanks.
>
> + missing_revs.update(next_revs - have_revs)
> + searcher.stop_searching_any(have_revs)
> + return missing_revs
>
>
> The doc string for "iter_entries" has:
> :return: An iterable as per iter_all_entries, but restricted to the
> keys supplied. No additional keys will be returned, and every
> key supplied that is in the index will be returned.
>
> It isn't perfectly clear that this will skip requested entries that are not
> present. Or maybe it does, but it takes a long time to get there.
>
> :return: Iterate over tuples (like iter_all_entries) for all keys which are
> present. Keys not present will be skipped, keys present but not requested will
> not be returned.
Hmmm, I'm not sure this is more clear to be honest. And saying that the
return value is 'Iterate' is IMO wrong - it is an 'Iterable'.
>
> It also seems weird to have the returned tuples be 3 or 4 based on what is present:
> :return: An iterable of (index, key, value) or (index, key, value,
> reference_lists).
> The former tuple is used when there are no reference lists in the
> index, making the API compatible with simple key:value index types.
> There is no defined order for the result iteration - it will be in
> the most efficient order for the index.
>
> Since you still have the "index" object being returned, it isn't like you can
> do: dict(foo.iter_all_entries). So having it have a fixed return value seems
> better. (Yes this is late to realize, but still seems worthwhile.)
I don't understand what you mean. Do you mean 'having a reference-list
free return value when reference lists are not being used is weird'?
I discussed this way back, basically this is GraphIndex offering a plain
Index interface without requiring a wrapping layer to strip off the 4th
field if it is not being used. I agree that its a touch weird, but it
didn't seem worthwhile having the class and disk signature change just
for the signatures .six file.
> Otherwise having a "fetch_ghosts=True/False" is fine with me.
>
> BB:tweak
-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20071031/ce9b1bc4/attachment.pgp
More information about the bazaar
mailing list