[MERGE] Make push/pull/merge faster.

Tue Oct 30 18:37:09 GMT 2007

On Tue, 2007-10-30 at 13:15 -0500, John Arbash Meinel wrote:

> You seem to set "supports_ghosts = True" for KnitRepo, but not for PackRepo. It
> may be inherited, but setting it explicitly would be good.

Hmm, I was deliberately not setting it for PackRepo.

> +        if not find_ghosts and revision_id is not None:
> +            graph = self.source.get_graph()
> +            missing_revs = set()
> +            searcher = graph._make_breadth_first_searcher([revision_id])
> +            target_index = \
> +                self.target._pack_collection.revision_index.combined_index
> +            null_set = frozenset([_mod_revision.NULL_REVISION])
> +            while True:
> +                try:
> +                    next_revs = set(searcher.next())
> +                except StopIteration:
> +                    break
> +                next_revs.difference_update(null_set)
> 
> ^- Why do you need to remove the null revisions? Just so you don't query the
> target index for them?

Yes. Searching for 'null:' when it will never be there is wasteful.

> +                target_keys =list ((key,) for key in next_revs)
> 
> ^- This has a PEP8 typo, but even more, it seems odd to use a list on a
> generator, than just doing a list comprehension

Bah. Debug code wins again (printing generators is not useful).

> target_keys = [(key,) for key in next_revs]
> 
> 
> +                have_revs = frozenset(node[1][0] for node in
> target_index.iter_entries(target_keys))
> 
> ^- This is an overly long line.
> 
> have_revs = frozenset(node[1][0] for node
>                       in target_index.iter_entries(target_keys))
> or something along those lines. (we each seem to have a slightly different
> preference when wrapping)

Yah, missed that one - thanks.

> 
> +                missing_revs.update(next_revs - have_revs)
> +                searcher.stop_searching_any(have_revs)
> +            return missing_revs
> 
> 
> The doc string for "iter_entries" has:
>         :return: An iterable as per iter_all_entries, but restricted to the
>             keys supplied. No additional keys will be returned, and every
>             key supplied that is in the index will be returned.
> 
> It isn't perfectly clear that this will skip requested entries that are not
> present. Or maybe it does, but it takes a long time to get there.
> 
> :return: Iterate over tuples (like iter_all_entries) for all keys which are
> present. Keys not present will be skipped, keys present but not requested will
> not be returned.

Hmmm, I'm not sure this is more clear to be honest. And saying that the
return value is 'Iterate' is IMO wrong - it is an 'Iterable'.

> 
> It also seems weird to have the returned tuples be 3 or 4 based on what is present:
>         :return: An iterable of (index, key, value) or (index, key, value,
> reference_lists).
>             The former tuple is used when there are no reference lists in the
>             index, making the API compatible with simple key:value index types.
>             There is no defined order for the result iteration - it will be in
>             the most efficient order for the index.
> 
> Since you still have the "index" object being returned, it isn't like you can
> do: dict(foo.iter_all_entries). So having it have a fixed return value seems
> better. (Yes this is late to realize, but still seems worthwhile.)

I don't understand what you mean. Do you mean 'having a reference-list
free return value when reference lists are not being used is weird'?

I discussed this way back, basically this is GraphIndex offering a plain
Index interface without requiring a wrapping layer to strip off the 4th
field if it is not being used. I agree that its a touch weird, but it
didn't seem worthwhile having the class and disk signature change just
for the signatures .six file.

> Otherwise having a "fetch_ghosts=True/False" is fine with me.
> 
> BB:tweak

-Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20071031/ce9b1bc4/attachment.pgp