Proposed new dependency: github.com/juju/errors (and github.com/juju/errgo)

roger peppe rogpeppe at gmail.com
Fri May 30 12:40:43 UTC 2014


On 29 May 2014 21:48, William Reade <william.reade at canonical.com> wrote:
> Masking is often necessary, and information hiding is valuable, and
> depending on spooky action at a distance is generally foolish. But the
> granularity with which we want to track the propagation of a given error is
> *much* finer than the granularity of judicious information hiding: the error
> is likely to pass through a lot more stack frames than it is package
> boundaries, and the former transitions are the ones we need to track to
> properly diagnose errors [0], while the latter are the ones we need to mask
> to properly maintain global sanity [1].

Isn't it better to maintain sanity in this respect at
a function/method level? It is very common to move functions between
packages, and having structures that maintain correctness when
doing this is very useful.

I would argue that by maintaining and documenting error types
at a function level, we make intra-package refactoring easier too.
It's a simple rule: document and notate the kinds of errors that may
be returned by a given function. If it's not documented, don't rely on it.

Package boundaries are fluid over time - functions get exported and unexported
and moved between packages. If we have a firm convention that *all*
functions document and notate their possible error types (some may
of course document that the possible error types are open-ended
and depend on some component that's a parameter), then
we'll have a better and more maintainable system. See below
for an example.

> To mask by default, at every annotation point, leads quickly to code whose
> action is characterised by bizarre and mystifying errors [2].

I don't get this at all. Bizarre and mystifying errors are, in my experience,
caused by errors that are not sufficiently annotated, not errors without
a diagnosable cause. Masking by default has no impact on the former;
it changes only the latter.

> To *fail* to
> mask at sensible boundaries also leads to bugs, and we've seen one or two of
> those as well; the trouble is that masking is *always easy*, while pure
> annotation is (relatively) hard.

On the contrary, if everyone now uses errors.Annotate, masking is harder
because that preserves the cause by default. To mask the cause, you'd have to
explicitly do:

    return errors.Wrap(errors.Annotate(err, "annotation"), nil)

This is not only considerably more verbose - it adds another
unnecessary level to the error stack.

-----------------------------------------------------------------------

I'll try to illustrate the points I am trying to make more concretely
by referring to the juju-core source.

For example, see this line of code at
http://bazaar.launchpad.net/~go-bot/juju-core/trunk/view/head:/environs/imagemetadata/generate.go#L43
:

    existingMetadata, _, err := Fetch(
        []simplestreams.DataSource{dataSource},
simplestreams.DefaultIndexPath, imageConstraint, false)
     if err != nil && !errors.IsNotFound(err) {
           return nil, err
     }

If the condition is triggered, the code falls back to existing metadata.
The question is: under what circumstances will the condition
(errors.IsNotFound(err))
be triggered - is this code actually correct?

The only way to find out is to exhaustively inspect the code paths reachable
from the Fetch call, truncating the search at any return that obviously
cannot return a NotFound error.

I did that. Here's a summary of what I found: http://paste.ubuntu.com/7551096/

As far as I can see, the circumstances in which this condition can be triggered
are:
a) if a DataSource.Fetch call fails for any reason.
b) if simplestreams.IndexReference.GetProductsPath returns NotFound.

These both seem somewhat reasonable (although I'm reasonably sure
that a network failure should not trigger the "not found" behaviour").

But from readMetadata to GetProductsPath are 5 levels of call stack,
4 of which are exported functions, *none of which document or test
any of their error return possibilities*.

    "environs/imagemetadata".readMetadata
    "environs/imagemetadata".Fetch
    "environs/simplestreams".GetMetadata
    "environs/simplestreams".getMaybeSignedMetadata
    "environs/simplestreams".IndexReference.GetCloudMetadataWithFormat
    "environs/simplestreams".IndexReference.GetProductsPath

If any of those intermediate functions are unwittingly changed to mask
the NotFoundError or to call a function that may return a NotFoundError,
our code will be broken.

Note that the only reason it was actually feasible to do this analysis
*at all* was because there are many places where the underlying error was
masked, enabling me to truncate the search. With pervasive non-masking
of errors, as suggested, this will no longer be true. The juju/errors changes
will actually make things worse (from a logic-verifiability point of view)
than the current situation.

If someone changes simplestreams.SeriesVersion to return a NotFoundError
when the series is not found (a superficially reasonable change) it will
break code. No-one could be expected to know that or to call it out in
a review, because it's entirely opaque.

That's an example of how our current practices, and the
practices that github.com/juju/errors seeks to make idiomatic,
make things less maintainable. I believe that systematic use
of errgo could improve this kind of code greatly.

  cheers,
    rog.



More information about the Juju-dev mailing list