juju/retry take 2 - looping

roger peppe roger.peppe at canonical.com
Thu Oct 20 17:33:07 UTC 2016


On 20 October 2016 at 16:30, Katherine Cox-Buday
<katherine.cox-buday at canonical.com> wrote:
> John Meinel <john at arbash-meinel.com> writes:
>
>> As commented on the pull request, passing 'err' into Next() initially feels weird, but
>> it allows a big improvement to whether the loop exited because we ran out of
>> retries, or it exited because the request succeeded. (loop.Error()
>> tells us either way.)
>
> I don't understand why the retry logic has the concern of why the loop exits (i.e. Next(error)).
>
> The motivation of moving towards loops was so that the concern of what's being retried is brought back to be inline with the surrounding code. Having the retry mechanism inspecting errors doesn't fall in line with that goal.
>
> In my mind, any retry solutions only needs to cover two things:
> 1. Provide a primitive that will delay a for iteration in a controlled way.
> 2. Be preemptable.
>
> Everything else is the logic of the thing you're retrying, including why it eventually succeeded/failed.

That's well put, and fits well with my thoughts too, thanks.

See also Pawel Stolowski's comment on
https://github.com/juju/retry/pull/5. He's from the Snappy
team who have also been trying to move towards using a standard retry mechanism.
There are many possible reasons for wanting to retry - error values
are just one possibility.

Another thing I'd like to mention is that my retry package suggestion
takes care to keep
loop strategy (the parameters that govern how we intend to retry)
separate from the loop
runtime requirements (the clock and stop channel). This means that
it's perfectly
reasonable to store a loop strategy as a "constant" global variable,
as we do with
testing.LongAttempt for example, and validate the parameters just once at init
time rather than every time an attempt is started.

On a slightly more technical note, the backoff approach used in
github.com/juju/retry
means that it's not possible to have an attempt that adjusts the delays based on
the actual length of time that an attempt took (for example, if an
attempt took 2s
but the backoff function specifies that the delay should be 1s, it may
be appropriate
to wait for 0s or 1s before starting the next attempt). That wouldn't
be hard to fix though.

The lack of a More (HasNext as was) method in the proposed package
also makes the
loop idiom less convenient. If I'm looping trying to acquire a value,
it's easier to
declare the value and the error inside the loop and return the value on success
rather than assigning it outside.

For example, I think it's arguably nicer to do:

    for a := strategy.Start(); a.Next(); {
          someVal, err := something()
          if err == nil || !shouldRetry(err) {
               return someVal, err
          }
          if !a.More() {
               return nil, errors.Annotatef(err, "failed after too
many attempts")
          }
      }

than:

    var err error
    loop := retry.Loop(spec)
    for loop.Next(err) {
        var someVal somepkg.Something
        someVal, err = something()
        if err == nil || !shouldRetry(err) {
             return someVal, err
        }
     }
     return nil, err

In the second example, we already need to check the error value, so
why should we need to pass it to loop.Next?

  cheers,
    rog.



More information about the Juju-dev mailing list