[MERGE] Batch get_parent_map HPSS calls made from InterRepo._walk_to_common_revisions

John Arbash Meinel john at arbash-meinel.com
Wed Oct 1 15:11:49 BST 2008




...

> So maybe we should be returning some data for HPSS's get_parent_map if
> the answer would be empty.  As a hypothetical idea, if the remote repo
> could efficiently generate a list of heads, then returning those (or
> subset of them up to some fixed limit?) would be useful.
> _walk_to_common_revisions could in principle be implemented as:

I'll note that Robert did work a while ago on searching local and remote
graphs to find areas of overlap. I don't know where that ended, but it
was certainly something we discussed when I was in Sydney last (2 yrs ago?)


> But even returning some random revisions (say the ones that the remote
> repo happened to open anyway in the course of doing the server-side
> bisection) would help.  We should do something; get_parent_map via HPSS
> really shouldn't be at a disadvantage to a dumber method.
> 
> In the meantime, this patch helps.
> 
> -Andrew.
> 
> 

I think we could simply improve the _walk logic, possibly only under
HPSS conditions, but perhaps for all cases.

I believe the logic he outlined back then was about logarithmic scaling.
So you would query for tip, tip-1, tip-2, tip-4, tip-8, tip-16, tip-32,
etc. You could certainly cap that at 50 revision ids, which would reach
back on the order of 2^50 revisions, and you would always either know
that you have 0 overlap or the server could find something to give back.

And if it hit between tip-16 and tip-32, it could fan forward in between
there.
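
A rough sketch of that probing scheme, to make the idea concrete. It
assumes a purely linear mainline (a real revision graph is a DAG), and
the names probe_offsets, find_overlap_window, and target_has are
hypothetical, not anything in bzrlib:

```python
def probe_offsets(max_ids=50):
    """Return distances back from tip: 0, 1, 2, 4, 8, ... (max_ids entries)."""
    offsets = [0]
    step = 1
    while len(offsets) < max_ids:
        offsets.append(step)
        step *= 2
    return offsets


def find_overlap_window(mainline, target_has, max_ids=50):
    """Probe a linear history at logarithmically spaced distances from tip.

    mainline:   list of revision ids, oldest first (assumed linear history).
    target_has: callable taking a revision id and returning True if the
                other side already has it (hypothetical stand-in for a
                remote query).

    Returns (last_missing_offset, first_present_offset), or None if no
    probed revision is present. The caller could then "fan forward"
    between the two offsets to find the exact boundary.
    """
    tip_index = len(mainline) - 1
    last_missing = None
    for offset in probe_offsets(max_ids):
        if offset > tip_index:
            break  # ran off the start of history
        if target_has(mainline[tip_index - offset]):
            return (last_missing, offset)
        last_missing = offset
    return None
```

With a 100-revision mainline where the other side has the oldest 76
revisions, the probes at tip through tip-16 miss and tip-32 hits, so the
window to fan forward in is (16, 32).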


+            next_revs = set()
+            while len(next_revs) < self._walk_to_common_revisions_batch_size:
+                try:
+                    next_revs_part, ghosts = searcher.next_with_ghosts()
+                    next_revs.update(next_revs_part)
+                except StopIteration:
+                    run_out_of_next_revs = True
+                    break
+                if revision_ids.intersection(ghosts):
+                    absent_ids = set(revision_ids.intersection(ghosts))
+                    # If all absent_ids are present in target, no error is
+                    # needed.
+                    absent_ids.difference_update(
+                        set(target_graph.get_parent_map(absent_ids)))
+                    if absent_ids:
+                        raise errors.NoSuchRevision(
+                            self.source, absent_ids.pop())


^- The "if revision_ids.intersection(ghosts)" looks more like something
that should be outside the while loop, rather than inside. But maybe I'm
wrong.
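
For illustration, one way the check could be hoisted out of the loop.
This is only a sketch, not the patch itself: FakeSearcher, collect_batch,
and check_absent are hypothetical names, LookupError stands in for
errors.NoSuchRevision, and target_parent_map stands in for
target_graph.get_parent_map. Note that moving the check outside means
the ghosts from each step have to be accumulated, since the loop
otherwise only keeps the last step's ghosts:

```python
class FakeSearcher:
    """Stand-in for the real searcher; yields (revisions, ghosts) pairs."""

    def __init__(self, steps):
        self._steps = iter(steps)

    def next_with_ghosts(self):
        # Raises StopIteration once the steps are exhausted.
        return next(self._steps)


def collect_batch(searcher, batch_size):
    """Gather search steps until at least batch_size revisions are seen."""
    next_revs = set()
    ghosts = set()
    run_out_of_next_revs = False
    while len(next_revs) < batch_size:
        try:
            next_revs_part, ghosts_part = searcher.next_with_ghosts()
        except StopIteration:
            run_out_of_next_revs = True
            break
        next_revs.update(next_revs_part)
        ghosts.update(ghosts_part)  # accumulate across all steps
    return next_revs, ghosts, run_out_of_next_revs


def check_absent(revision_ids, ghosts, target_parent_map):
    """Raise if a requested revision is a ghost and the target lacks it."""
    absent_ids = set(revision_ids).intersection(ghosts)
    # If all absent_ids are present in target, no error is needed.
    absent_ids.difference_update(target_parent_map(absent_ids))
    if absent_ids:
        raise LookupError(sorted(absent_ids)[0])
```

The trade-off is that an error for a missing revision is reported once
per batch rather than at the step where the ghost first shows up.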

John
=:->

BB:comment


