ErrDying and errors.Trace()
John Meinel
john at arbash-meinel.com
Thu Apr 2 11:44:39 UTC 2015
As requested, we wanted to enable leader-elected always, and I ended up
running into an unexpected failure. I tracked it down, and figured I'd
solicit feedback from the group.
The specific code path is:
1. leadership/tracker.go line 67 calls:
t.tomb.Kill(t.loop())
Nothing special here, its how all our workers work.
2. loop() has a common case of line 137:
case ticketCH := <-t.waitMinionTickets:
...
if err := t.resolveWaitMinion(...); err != nil {
return errors.Trace(err)
}
3. resolveWaitMinion: 290 has:
if leader, err := t.isLeader(); err != nil {
return errors.Trace(err)
}
4. And isLeader() 224 has:
case <-t.tomb.Dying():
return false, tomb.ErrDying
All of which seems pretty standard. If we find out we are dying while we
are trying to do something else, return ErrDying. When that bubbles up to
the top, we just call tomb.Kill() with any error that we got so that if we
got a genuine error, a future worker.Wait()/Stop() call can return the
actual error.
However the key point is that calling:
tomb.Kill(tomb.ErrDying)
has special code that looks at:
if reason == ErrDying {
...
return nil
}
t.reason = reason
And what is happening is that because of the errors.Trace() we aren't
handing it ErrDying the singleton error instance, we are handing it an
errors.Err{Cause: tomb.ErrDying}
Which then when we check "worker.Stop() error" we get a real error instead
of nil.
So for now, I decided to trap in NewTrackerWorker and if we have an
errors.Cause() of ErrDying we unwrap it before passing it to tomb.Kill().
If we want to be using errors.Trace more (and it certainly was nice to have
at least some sort of traceback to figure out where these steps were
occurring), shouldn't we be updating our general pattern?
Do we have to wrap every call to tomb.Kill(err) with
tomb.Kill(OnlyRealErrors(err)) ?
Should we (can we?) teach tomb about the errors package so that it can do
its own unwrapping?
uniter.go loop() has its own unwrapping check in the middle of the loop,
but it isn't doing the unwrapping in NewUniter. Should I be doing the
unwrapping inside loop instead? Doesn't that mean I have a lot of code
paths (all of the cases in loop() conceivably could be returning an
ErrDying).
Thoughts?
John
=:->
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.ubuntu.com/archives/juju-dev/attachments/20150402/a9d182df/attachment.html>
More information about the Juju-dev
mailing list