story of a failure

Mon Apr 29 10:08:50 UTC 2013

On 29 April 2013 10:57, roger peppe <roger.peppe at canonical.com> wrote:
> It would be possible to add enough logging statements that
> we could work out exactly what was going on at the time
> of a failure, but we would end up producing enormous log files.
> They are big enough as it is - when Dave was doing some scale
> testing recently, 8 hours of running juju (and not doing much
> except starting lots of machines) produced a ~50MB log
> file. If we produce too much logging data, people are going
> to turn off logging and then we won't have anything to go on
> when things do go wrong.

Juju could have an in-memory ring buffer that captures log messages of
all severities. When there's an error, a snapshot of the buffer would
be taken and dumped to a file, somewhat like an OOPS report. That way
developers get the context leading up to the failure, and error
messages can remain pithy, more suitable for end users.
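
To make the idea concrete, here is a rough sketch of such a buffer in
Go. It is only an illustration under my own assumptions: the names
(RingLog, Entry, Add, Snapshot, Dump) are hypothetical and are not
part of Juju or of any existing logging package, and the capacity and
output format are arbitrary.

```go
package main

import (
	"fmt"
	"io"
	"sync"
	"time"
)

// Entry is one buffered log message (hypothetical type, for illustration).
type Entry struct {
	Time     time.Time
	Severity string
	Message  string
}

// RingLog keeps only the most recent log entries in memory.
// Once it is full, each new entry overwrites the oldest one.
type RingLog struct {
	mu      sync.Mutex
	entries []Entry
	next    int  // slot that the next Add will write to
	full    bool // true once the buffer has wrapped around
}

// NewRingLog returns a buffer holding at most capacity entries.
func NewRingLog(capacity int) *RingLog {
	if capacity < 1 {
		capacity = 1
	}
	return &RingLog{entries: make([]Entry, capacity)}
}

// Add records a message at any severity.
func (r *RingLog) Add(severity, message string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.entries[r.next] = Entry{Time: time.Now().UTC(), Severity: severity, Message: message}
	r.next = (r.next + 1) % len(r.entries)
	if r.next == 0 {
		r.full = true
	}
}

// Snapshot returns a copy of the buffered entries, oldest first.
func (r *RingLog) Snapshot() []Entry {
	r.mu.Lock()
	defer r.mu.Unlock()
	if !r.full {
		return append([]Entry(nil), r.entries[:r.next]...)
	}
	out := make([]Entry, 0, len(r.entries))
	out = append(out, r.entries[r.next:]...)
	out = append(out, r.entries[:r.next]...)
	return out
}

// Dump writes a snapshot to w, one entry per line.
func (r *RingLog) Dump(w io.Writer) error {
	for _, e := range r.Snapshot() {
		_, err := fmt.Fprintf(w, "%s %s %s\n", e.Time.Format(time.RFC3339), e.Severity, e.Message)
		if err != nil {
			return err
		}
	}
	return nil
}
```

Every log call would go through Add regardless of the configured log
level, so memory use is bounded by the capacity. On an error, the
handler would open a file and pass it to Dump, while the user only
sees the short error message.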