[MERGE] Fix RemoteRepository.get_parent_map stacking

John Arbash Meinel john at arbash-meinel.com
Tue Nov 18 16:46:40 GMT 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Aaron Bentley wrote:
> Hi all,
> 
> This patch fixes a significant bug in RemoteRepository stacking:
> get_parent_map did not make use of fallback repositories.
> 
> This meant that a push could easily wind up pushing almost the entire
> history into a stacked branch, even though the stacked_on branch already
> had that history.  This was probably the root cause of bug #293679.
> 
> Aaron
> 

- -
+
     def get_graph(self, other_repository=None):
         """Return the graph for this repository format"""
- -        parents_provider = self
+        parents_provider = self._make_parents_provider()
         if (other_repository is not None and
             other_repository.bzrdir.transport.base !=
             self.bzrdir.transport.base):
@@ -882,7 +882,10 @@
         self._ensure_real()
         return self._real_repository._fetch_reconcile

- -    def get_parent_map(self, keys):
+    def get_parent_map(self, revision_ids):
+        return self._make_parents_provider.get_parent_map(revision_ids)
+
+    def _get_parent_map(self, keys):

^- I'm pretty positive that you need an extra set of parentheses here:
self._make_parents_provider().get_parent_map(revision_ids)

Which also indicates that this code path isn't actually being tested.

It also seems a bit circuitous to have get_parent_map() return a locally
defined class whose only purpose is to call _get_parent_map().

I would say that:

1) We really don't want a locally defined class, as that causes the
class to have to be recompiled every time we call
_make_parents_provider, and as you are doing that anytime we call
Repository.get_parent_map() it seems a bit much.

Probably fine to just define it as an extra module-level class (prefixed
with _), and then pass in the repository as part of the constructor.

2) How much overhead does this introduce versus the fairly common case
of no fallback repositories? Is it worth just doing:

if not self._fallback_repositories:
  return self

Or possibly
if not self._fallback_repositories:
  return ParentsProvider(self)

If you feel we really do need the extra indirection.


3) Updating the docstring for "_get_parent_map()" to indicate that this
is "get parents from only this repository and no stacked on locations"
would probably be a reasonable way to clarify how it is different from
the existing functions.

4) It seems like something that should be codified into a test,
something like stacked_get_parent_map that just asserts that
_get_parent_map() doesn't return the values from stacked-on locations.
(More importantly, doesn't *look* at stacked on locations, repeating the
work that is done elsewhere.)

5) I'm not particularly fond of RemoteRepository.get_parent_map() taking
an api that is meant to be fairly lightweight and deal with simple dicts
and tuples, into creating temporary objects to wrap Repository requests
and then get thrown away. It isn't terrible, but I do wonder if it
wouldn't be better overall to just teach RemoteBranch.get_parent_map()
to use (pseudocode):

remaining_keys = requested_keys - found_keys
 for repo in self._fallback_repositories:
   if not remaining_keys:
     break
   found = repo.get_parent_map(remaining_keys)
   result.update(found)
   remaining_keys.difference_update(found)

It isn't a lot of code, and I think it is a bit simpler than having an
indirection through ParentsProvider and _StackedParentsProvider. I may
be wrong, though.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkki8W8ACgkQJdeBCYSNAANJ+ACeIK7GR1CivJ/tXfUxmzFDIdlg
lLAAn0MLanwnwYqYilsbHjTZq3MnZknu
=ySGR
-----END PGP SIGNATURE-----



More information about the bazaar mailing list