errors.ubuntu.com: benefits on Server?

Evan Dandrea evan.dandrea at canonical.com
Thu Jun 6 11:22:05 UTC 2013


Hi Robie,

On 6 June 2013 11:38, Robie Basak <robie.basak at canonical.com> wrote:
>
> I thought I'd split this thread into two. One question is about what
> benefits error reporting on Server might bring, and that's what I'd like
> to discuss here.
>
> > [Daviey] Just need to work out, *if* it is worth doing...
>
> Is it worth doing? What sorts of errors will we pick up on right now? Do
> these errors happen in the real world? With us diverting effort to this,
> are we going to neglect some category of errors that we currently won't
> pick up this way?
>
> Some categories of errors come to mind:

I would be careful about extrapolating from what you get right now
from Launchpad bugs. This won't just produce the same set of issues in
greater numbers. When you introduce post-release reporting and remove
the barriers to sending reports (no SSO, no forms), you get a much
broader view of what's out there.

I honestly don't know if it's worth doing, but fortunately that's not
entirely my call to make. I also don't think we'll know until we try.

> 1) Segfaults leading to core dumps. I don't see many bug reports of
> these at all. We do get the occasional excellent bug report though. My
> feeling is that segfaults on server are actually quite rare.

You probably don't see reports of these because any process that drops
privileges does not produce a core dump. I really want to turn this
on, making the reports owned by root. Kees sounded okay with it about
a year ago and was going to look into it (setting
/proc/sys/fs/suid_dumpable to 2), but I don't think he ever found the
time.

I can bring this back up with the security team.
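
For anyone who wants to check what their own machines do today, here
is a minimal sketch that reads the fs.suid_dumpable setting and
explains it. The function names (describe_mode, current_mode) are made
up for illustration, and the descriptions are my paraphrase of the
kernel's sysctl documentation.

```python
from pathlib import Path

# Kernel knob controlling core dumps for processes that changed privileges.
SUID_DUMPABLE = Path("/proc/sys/fs/suid_dumpable")

# Paraphrased meanings of the three documented values.
MODES = {
    0: "default: processes that changed privileges do not dump core",
    1: "debug: all processes dump core where possible",
    2: "suidsafe: such processes dump core, readable by root only",
}


def describe_mode(value):
    """Return a human-readable description of a suid_dumpable value."""
    return MODES.get(value, "unknown mode")


def current_mode(path=SUID_DUMPABLE):
    """Return the current setting as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    mode = current_mode()
    if mode is None:
        print("could not read", SUID_DUMPABLE)
    else:
        print(mode, "-", describe_mode(mode))
```

With the value at 0, a daemon that drops privileges leaves nothing
behind for apport to pick up, which would explain the lack of reports.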

> 2) Maintainer script failures. We get lots of these reports. Most of
> them are due to local misconfiguration or sysadmin error in a way that I
> don't think it's possible or reasonable for us to fix. This is because
> most use cases of server packages involve sysadmin configuration file
> editing. Perhaps this can be fixed at a higher layer (eg. charms being
> careful to not introduce these kinds of errors). I think these reports
> aren't useful individually for this reason, but may be useful in
> aggregate to identify real bugs. So it seems to me that error reporting for
> these would be really useful.

We've had some difficulty finding a good way of bucketing these
together, so that you see them in aggregate. Martin, Brian, and I sat
down at the client sprint and came up with what I hope is a better
solution:

https://lists.ubuntu.com/archives/raring-changes/2013-June/010150.html

> 3) Perhaps daemon start failures that aren't from maintainer script
> failures? This is subject to the same sysadmin misconfiguration problem
> above though; I'm not sure how useful this would be.
>
> What other kinds of errors will we initially pick up on? What categories
> have I missed, or does anyone disagree with my analysis above?

Recoverable problems. We created /usr/share/apport/recoverable_problem
for applications that can handle an exception, but would still like to
know that it occurred. Often the exception was handled but the
experience is degraded as a result, so on the desktop it produces a
popup as well.

All developers have to do is feed it NUL-separated key-value pairs
over stdin, ensuring they provide a DuplicateSignature field. Right
now it works off the ppid, but I believe Ted Gould was patching it to
support passing in the pid to trace.

You could patch applications to use this, as Ted is doing with glib.
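
As a rough sketch of what calling it from Python could look like: the
helper names (encode_fields, report_recoverable_problem) are mine and
not part of apport, and I'm assuming that keys and values simply
alternate with a single NUL between items and no trailing NUL. Check
the script itself before relying on the exact framing.

```python
import os
import subprocess

RECOVERABLE_PROBLEM = "/usr/share/apport/recoverable_problem"


def encode_fields(fields):
    """Flatten a dict into key NUL value NUL key NUL value ... bytes.

    Assumed framing: items alternate key, value, with no trailing NUL.
    """
    if "DuplicateSignature" not in fields:
        raise ValueError("a DuplicateSignature field is required")
    items = []
    for key, value in fields.items():
        if "\0" in key or "\0" in value:
            raise ValueError("keys and values must not contain NUL")
        items.extend([key, value])
    return "\0".join(items).encode("utf-8")


def report_recoverable_problem(fields):
    """Pipe the fields to the helper; return True if it exited cleanly."""
    if not os.path.exists(RECOVERABLE_PROBLEM):
        return False  # apport not installed; nothing to report to
    result = subprocess.run(
        [RECOVERABLE_PROBLEM],
        input=encode_fields(fields),
        check=False,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical field values, for illustration only.
    report_recoverable_problem({
        "DuplicateSignature": "mydaemon:config-reload-failed",
        "Detail": "fell back to the previous configuration",
    })
```

Since it currently works off the ppid, the report is attributed to
whichever process spawned the helper, which is the application itself
in this sketch.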



