Automatic crash reports in the final release

Fri Mar 30 09:06:08 BST 2007

Hello fellows,

in yesterday's distro team meeting we did not reach a conclusion on
how to handle automatic crash reporting in Feisty stable.

Apport currently has two different classes of crashes, which we can
enable/disable/restrict independently of each other:

Unhandled Python Exceptions
===========================

They have no considerable impact on network bandwidth, or CPU/memory
resources when processing them, always have perfect stack traces, and
it should not be hard to develop a tool to automatically mark them as
duplicates.  Personally I find them very helpful, too.

This is also relevant for crashes in Ubiquity:

Mar 29 22:08:26 <cjwatson>      I'll certainly get flooded, *but* I'm going to get flooded *anyway* if the installer is crashing
Mar 29 22:08:39 <cjwatson>      it's either get flooded with decent-quality bugs, or with poor-quality bugs

If we want, we have the option to disable them by default, and
re-enable them in casper, so that we get reports from the live system.
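
To illustrate why marking Python exception reports as duplicates
should be easy, here is a minimal sketch. It is not Apport code:
the function name traceback_signature and the choice of hashing file
paths, function names, and the exception type (while ignoring line
numbers and the exception message) are assumptions made for this
example only.

```python
import hashlib
import re

# Matches the 'File "...", line N, in func' frame lines of a standard
# Python traceback. Line numbers are deliberately not captured, so small
# source shifts between versions do not change the signature.
FRAME_RE = re.compile(
    r'^\s*File "(?P<path>[^"]+)", line \d+, in (?P<func>\S+)',
    re.MULTILINE,
)


def traceback_signature(traceback_text):
    """Return a hex digest identifying a crash by frames and exception type.

    Hypothetical helper for illustration; not part of Apport.
    """
    frames = [
        "%s:%s" % (m.group("path"), m.group("func"))
        for m in FRAME_RE.finditer(traceback_text)
    ]
    # The last non-empty line has the form "ExceptionType: message";
    # keep only the type, since messages often contain variable data.
    lines = [l for l in traceback_text.strip().splitlines() if l.strip()]
    exc_type = lines[-1].split(":", 1)[0].strip() if lines else ""
    key = "\n".join(frames + [exc_type])
    return hashlib.sha1(key.encode("utf-8")).hexdigest()
```

Two reports whose signatures are equal would be candidates for
duplicate marking; a real tool would probably also want to compare
package versions before deciding.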

Signal crashes (mostly SIGSEGV)
===============================

We got a lot of them during the Feisty cycle, and we have only just
developed some infrastructure to retrace them semi-automatically. This
is still a bit brittle: we often get poor results, bugs have to be
tagged manually, and the current implementation of the retracers takes
a lot of I/O and CPU power in the data center, and thus does not scale
well.

Writing an automatic dup finder is much harder because most of the
initial stack traces are largely useless (which is another bug we
need to track down at some point).

Submitting those crashes is very expensive in terms of memory/CPU
usage for post-processing in the GUI, network bandwidth for
up/download, and Malone storage size. Although we warn about the
'private data' in the GUI, this is not really a decision that a
novice user can make appropriately, so we have a privacy problem as
well. In summary, we need a proper crash database for this.

There were differing opinions about the usefulness of crash reports
for stable releases (e.g. Seb feared the flood, Alexander would
rather get reports). Can we please collect and weigh them here?

My personal one: It is not our policy to fix a significant number of
crashes in stable releases, nor do we have the resources to do so.
Since we kept apport turned on in the development release, we should
already have received reports about the more important ones, and we
have a lot of fodder to grind through now in Malone. So I would prefer
to disable it by default.

These are the options we could implement unintrusively:

(1) Flip /etc/default/apport to 'off'. Bug triagers can ask submitters
    to switch it on again if they can reproduce a crash and have
    proven to be communicative.

(2) Keep apport itself enabled and have it stuff the dumps into
    /var/crash, but disable the automatic frontend invocation in
    update-notifier. This means wasted processing overhead for
    the vast majority of the crashes that will happen out there, but
    the crash reports are retained, so that manually calling
    apport-{gtk,qt,cli} will continue to work as usual. We could even
    add a gconf key and a UI somewhere to re-enable it.

Slightly more work, but still doable:

(3) Create a blacklist or whitelist of executable names (no package
    names, please, expensive to find out!) which should continue to get
    automatic crash reports.
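
For option (3), the check itself could be as simple as the following
sketch. The function should_report and its parameters are
hypothetical names for this example, and matching on the basename of
the executable path (rather than the full path) is an assumption.
No package lookup is involved, which keeps the check cheap.

```python
import os


def should_report(executable_path, whitelist=None, blacklist=None):
    """Decide whether a crash of executable_path gets an automatic report.

    Hypothetical helper for illustration; not part of Apport.
    - A name on the blacklist is never reported.
    - If a whitelist is given, only names on it are reported.
    - With neither list, every crash is reported.
    """
    name = os.path.basename(executable_path)
    if blacklist and name in blacklist:
        return False
    if whitelist is not None:
        return name in whitelist
    return True
```

With a whitelist containing only "ubiquity", a crash in
/usr/bin/ubiquity would still be reported while crashes in other
executables would be dropped; the blacklist takes precedence if a
name appears on both lists.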

Thanks in advance for your feedback,

Martin

-- 
Martin Pitt        http://www.piware.de
Ubuntu Developer   http://www.ubuntu.com
Debian Developer   http://www.debian.org