[MERGE][bug #153786] Partial fix for retrying if a pack file disappears

John Arbash Meinel john at arbash-meinel.com
Tue Oct 28 17:39:54 GMT 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin Pool wrote:
> Martin Pool has voted tweak.
> Status is now: Conditionally approved
> Comment:
> Thanks for fixing this.
> 
> I wonder if we should do something to prevent it looping indefinitely while
> retrying, but that's probably unlikely and yagni; we can deal with it if it
> arises.

So the 'reload()' function returns False if reloading didn't actually
find anything new, and I also explicitly tested the case of a file
actually being gone and nothing we can do about it. So the code will try
a reload, see it didn't get anything to work with, and raise the
original exception.

So *if* we had an extremely long running operation (say downloading all
of mysql over a 9600 baud modem), and the repository was very active (it
was a shared repo between all of mysql's developers). You certainly
could end up hitting cases where you keep reloading the indexes.
However, there are a few other things to consider:

1) If they aren't running 'bzr pack' then only the small pack files are
being compacted via autopack.

2) In general, we will end up looking for a specific revision id, and
the history beyond that point. We won't continually grab newer revisions.

3) The auto-packing process should eventually stablize such that all the
old stuff past a certain point is contained in a single pack file. So
even though behind the scenes new packs are being added (at an alarming
rate), the current action will just start to ignore the churn.

We could add some sort of "don't retry more than X times". Though then
you might run into it with a long running process like
tortoisebzr/olive/eclipse. Unless you scope the retry appropriately, but
then you have to pass around the retry value.

If you really want, the simple fix is to change the "while True" portion
of the iterator to somthing like:
for retry_count in xrange(10):
  ...
else:
  raise RetryiedUntilICouldRetryNoMore()


> 
> The implementation looks correct and nicely tested.
> 
> === modified file 'NEWS'
> --- NEWS        2008-10-21 20:49:54 +0000
> +++ NEWS        2008-10-23 21:06:43 +0000
> @@ -44,6 +44,11 @@
>        could only happen if ``bzr reconcile`` decided that the parent
>        ordering was incorrect in the file graph.  (John Arbash Meinel)
> 
> +    * The index logic is now able to reload the list of pack files if and
> +      index ends up disappearing. We still don't reload if the pack data
> +      itself goes missing after checking the index.
> +      (John Arbash Meinel, #153786)
> +
>    DOCUMENTATION:
> 
> 
> To make it more clear which bug this is maybe add
> 
>   This fixes where operations on a repository transiently fail with
> 'file not found'
>   when another process is writing to that repository.
> 
> or something similar.

I changed it to:
* The index logic is now able to reload the list of pack files if and
  index ends up disappearing. We still don't reload if the pack data
  itself goes missing after checking the index. This bug appears as a
  transient failure (file not found) when another process is writing
  to the repository.  (John Arbash Meinel, #153786)

Sound reasonable?

> 
>    API CHANGES:
> 
> === modified file 'bzrlib/index.py'
> --- bzrlib/index.py     2008-09-21 14:48:37 +0000
> +++ bzrlib/index.py     2008-10-23 20:53:50 +0000
> @@ -27,6 +27,7 @@
>  from bisect import bisect_right
>  from cStringIO import StringIO
>  import re
> +import sys
> 
>  from bzrlib.lazy_import import lazy_import
>  lazy_import(globals(), """
> @@ -1106,12 +1107,16 @@
>      in the index list.
>      """
> 
> -    def __init__(self, indices):
> +    def __init__(self, indices, reload_func=None):
>          """Create a CombinedGraphIndex backed by indices.
> 
>          :param indices: An ordered list of indices to query for data.
> +        :param reload_func: A function to call if we find we are
> missing an
> +            index. Should have the form reload_func() => True/False to
> indicate
> +            if reloading actually changed anything.
> 
> It wasn't clear to me from this description that 'changed anything'
> means 'we
> have a chance to now find the indexes'.  I thought maybe you meant if it
> changed some cached state.

:param reload_func: A function to call if we find we are missing an
    index. Should have the form reload_func() => True if the list of
    active pack files has changed.

ATM I also set True if the only change is a removal of a pack file.
Which may actually be reasonable for "CombinedGraphIndex" since at that
point the other indexes could be redundant and just not removed yet. (So
we just want to skip what they might have contained.)

> 
> +    def _reload_or_raise(self):
> +        """We just got a NoSuchFile exception.
> +
> +        Try to reload the indices, if it fails, just raise the current
> +        exception.
> +        """
> +        if self._reload_func is None:
> +            raise
> +        exc_type, exc_value, exc_traceback = sys.exc_info()
> +        if not self._reload_func():
> +            # We tried to reload, but nothing changed, so we fail anyway
> +            raise exc_type, exc_value, exc_traceback
> 
> I would have put a guarded mutter statement here.
> 
> -- 
> Martin

I'm not sure what you mean by "guarded" but I'm happy to log something.

=== modified file 'bzrlib/index.py'
- --- bzrlib/index.py     2008-10-23 20:53:50 +0000
+++ bzrlib/index.py     2008-10-28 17:38:59 +0000
@@ -1268,8 +1268,12 @@
         if self._reload_func is None:
             raise
         exc_type, exc_value, exc_traceback = sys.exc_info()
+        trace.mutter('Trying to reload after getting exception: %s',
+                     exc_value)
         if not self._reload_func():
             # We tried to reload, but nothing changed, so we fail anyway
+            trace.mutter('_reload_func indicated nothing has changed.'
+                         ' Raising original exception.')
             raise exc_type, exc_value, exc_traceback

     def validate(self):

I think it is reasonable to mutter when a reload occurs, and then
muttering if it fails as well is a reasonable thing.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkkHTmoACgkQJdeBCYSNAAOuMwCfaNc/2pD3qxkvv5CtN2X1S9BS
lSgAnRRUd6AADVxmP0JFamunKZed5jGg
=4/eZ
-----END PGP SIGNATURE-----



More information about the bazaar mailing list