extending apport/problem_report format?

Thu Sep 23 21:38:58 UTC 2010

On Wed, Sep 22, 2010 at 6:24 AM, Martin Pitt <martin.pitt at ubuntu.com> wrote:
> Hello Edwin,
>
> Edwin Grubbs [2010-09-16 17:47 -0500]:
>> Launchpad.net is looking into whether to use the problem_report python
>> module to store website errors or even to use the apport python module
>> to help collect system data for the problem report. Currently, each
>> exception is stored in a separate "oops" file with a bunch of extra
>> data, such as the cgi request variables, and it is formatted like an
>> rfc822 email message to take advantage of modules for formatting and
>> parsing.
>
> That indeed is what Apport .crash reports have as well.
>
>> The oops-tools project, which analyzes and displays the oops files in
>> a web page, is planned to be open sourced soon. Therefore, I have two
>> main questions.
>> 1. Is there interest in having the problem_report format be extended
>> to handle more complex data structures that will be parsed and
>> analyzed by a tool such as oops-tools?
>
> Not from my side. So far we got along well with just having a
> single-layer dictionary. The convention for lists as values is to have
> one element per line, e. g.:
>
> Dependencies:
>  libfoo1
>  libbar2
>
> Can you point out an example what else you need?

Currently, the oops contains a list of all the sql queries that
occurred during a page request. Each item in the list will also
include the start and stop time relative to the beginning of the
request and the db user. Query parameters aren't included, but we're
planning on adding that. We might switch it to a list of dictionaries
formatted as JSON, which will make it easier to extend in future. The
oops-tool displays a page with a separate section for the five slowest
queries and for the queries that repeat.
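
To make the list-of-dictionaries idea concrete, here is a rough sketch
using only the Python standard library. The key names (statement,
start_ms, stop_ms, db_user) and the sample queries are invented for
illustration; they are not the current oops format.

```python
import json
from collections import Counter

# Hypothetical query log; times are milliseconds relative to the start
# of the request.
queries = [
    {"statement": "SELECT * FROM person WHERE id = %s",
     "start_ms": 2, "stop_ms": 5, "db_user": "launchpad"},
    {"statement": "SELECT * FROM bug WHERE owner = %s",
     "start_ms": 6, "stop_ms": 40, "db_user": "launchpad"},
    {"statement": "SELECT * FROM person WHERE id = %s",
     "start_ms": 41, "stop_ms": 43, "db_user": "launchpad"},
]

# The whole list becomes one string, so it fits into a single field.
field_value = json.dumps(queries)


def slowest(queries, n=5):
    """Return the n queries with the longest duration, slowest first."""
    return sorted(queries,
                  key=lambda q: q["stop_ms"] - q["start_ms"],
                  reverse=True)[:n]


def repeated(queries):
    """Return {statement: count} for statements issued more than once."""
    counts = Counter(q["statement"] for q in queries)
    return {stmt: n for stmt, n in counts.items() if n > 1}


# A reader of the report only needs json.loads() to get the list back.
restored = json.loads(field_value)
```

Adding query parameters later would then only mean adding one more key
to each dictionary, without changing how the field is parsed.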

>> 2. Would apport be interested in receiving other features of
>> oops-tools, such as the django based web interface for viewing oopses?
>
> Is this read-only, or can you also update the data there? We have used
> Launchpad Bugs as a "crash database" backend so far, because a bug
> tracker provides us all the functionality that we need, except that it's
> sometimes hard to tell apart crashes and regular bugs, for getting a
> clean view for triagers.
>
> It sounds like an interesting option, though, if it can represent the
> structure of Ubuntu, like distros/packages/package versions, etc.

The web UI is read-only. Besides displaying data related to an error,
it groups oopses by their page/error, and it sends out email
reports about which pages are the worst offenders. I'm actually not
trying to convince you to use oops-tools. I'm just trying to figure
out what the relationship should be between these two projects that
have a lot of overlap in their scopes. I don't want to add features
just to oops-tools, if that work would be beneficial to apport.

>> The second question is probably hard to answer right now, so I'll
>> focus on the limitations of the problem_report format that we would
>> either extend in a wrapper class or in problem_report itself.
>>
>> * problem_report doesn't provide a standard format for complex data.
>
> Right, it currently uses standard RFC822, which doesn't define any
> more complex data types.
>
>> Even adding another level of name/value pairs inside a field is not
>> well supported, since you have to use a StringIO object to get the
>> data from ProblemReport object to put it in a field of another
>> ProblemReport. Lists of dictionaries would also require their own
>> format. Here is an example of recursive ProblemReports.
>
> This works fine if you hardcode assumptions about the syntax of
> particular field names, which we generally have to for such
> post-processing scripts anyway.

I would like to avoid hardcoding assumptions, because they make it
harder to change the format of a field in the future: the reader would
then have to introspect the format to tell the old and new forms apart.

> But if we need complex data structures, then I'd rather use a standard
> format like JSON for this, as you suggested.
>
> The problem_report module is not conceptually limited to RFC822 only.
> For example, it also has the ability to output its data in Multipart/MIME
> format (for uploading data to Launchpad). So it wouldn't be a problem
> at all to add reading/writing JSON.
>
> However, the module currently _is_ conceptually limited to a single
> level dictionary structure, since API users can (and do) pretty much
> treat it as a dictionary with extra features, and can currently rely
> on the data types of the values (strings). We could allow more, and
> then just fix the existing write() and write_mime() to throw an
> exception if they encounter an unrepresentable data type; this would
> mean you could never upload such a report to Launchpad bugs.
>
>> * problem_report only allows field names to contain letters, numbers,
>> ".", "_", and "-". That could cause problems when dumping a bunch of
>> name/value pairs from an application in order to analyze it later.
>
> That's not a problem in Apport and package hooks, since (as pointed
> out before) the set of key names is pretty much static. In the cases
> where it isn't, hookutils provides a helper for cleaning up key names.
> I'd like to avoid arbitrary strings here, since it can easily lead to
> problems, break the RFC822 format, or cause unexpected errors in
> scripts which process those reports.
>
>> * problem_report really supports text or compressed text files. There
>> is no ability to specify a content-type even when using
>> problem_report's write_mime() method.
>
> In general we know what content type a field has. If not, then you
> could always specify it in another field, like:
>
> Data: blob0xDEADBEEF
> DataType: image/jpeg
>
> ?

Since MIME already expects Content-Type metadata, it seems strange to
store it in a separate field.
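
As an illustration of what I mean, this sketch uses Python's standard
email package rather than problem_report's write_mime(), so it says
nothing about how write_mime() works internally. The field name "Data"
and the blob contents are placeholders.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["ProblemType"] = "Crash"  # a short value stays an ordinary header

# Placeholder bytes standing in for image data (not a real JPEG).
blob = b"\xff\xd8\xff\xe0placeholder"

# The type is recorded on the MIME part itself, so no second
# "DataType" field is needed to describe it.
msg.add_attachment(blob, maintype="image", subtype="jpeg",
                   filename="Data")

part = next(msg.iter_attachments())
content_type = part.get_content_type()
payload = part.get_content()
```

A reader can then dispatch on get_content_type() for each part without
knowing the field names in advance.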

>> * The write_mime() method even encodes the single-line name/value
>> pairs as base64, so it is not at all human readable.
>
> Only if it's longer than 5 lines or has non-ASCII characters,
> otherwise it lands in the "short values" text section (where it is
> readable).

I don't know if I'm just not aware of a parameter, but all the short
values are base64 encoded when I run write_mime(), even if the only
fields are ProblemType and Date, which were autogenerated.

> But why do you care? This format is supposed to be nothing more than a
> transport vehicle from client computers to Launchpad. It's not really
> supposed to be looked at by humans.

Well, the reason I was looking at mime as something besides just a
transport vehicle is that it already has a defined way to specify
the Content-Type. Although we usually view oopses through the oops
tool, we sometimes view them on disk. In that case, there is some
benefit to having all the short values at the top of the file, where
they can be perused easily, like normal RFC822 headers. Then, the
really large fields can be placed at the bottom like mime attachments,
which makes them less likely to obscure the view of the small fields.
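
The layout I have in mind could be sketched roughly as follows. This is
a hypothetical writer, not ProblemReport.write(); the function name
write_report and the sample field names are made up, and "large" is
simplified here to mean "contains a newline".

```python
def write_report(fields):
    """Render fields RFC822-style: single-line values first, then
    multi-line values, regardless of the order they were added in."""
    short = {k: v for k, v in fields.items() if "\n" not in v}
    large = {k: v for k, v in fields.items() if "\n" in v}

    lines = [f"{key}: {value}" for key, value in short.items()]
    for key, value in large.items():
        lines.append(f"{key}:")
        # Continuation lines are indented by one space, as in the
        # Dependencies example quoted earlier in this thread.
        lines.extend(" " + line for line in value.split("\n"))
    return "\n".join(lines) + "\n"


report = write_report({
    "Traceback": 'File "app.py", line 1\nValueError: bad input',
    "ProblemType": "Oops",
    "Pageid": "BugTask:+index",
})
print(report)
```

Even though Traceback was added first, it ends up below the two short
fields, so the header-like values stay visible at the top of the file.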

We really don't get any benefit from problem_report providing a method
to safely transport the info via email. We just want a human-readable
format that we can easily parse so it can be analyzed. In which case,
we should probably focus our discussion on ProblemReport.write()
instead of write_mime().

In some ways, I am trying to future-proof our choice of format,
instead of focusing on just the immediate needs, which is why mime
seemed so attractive. apport is still attractive to us since it
provides a lot of methods for collecting system data.

Do any of the features we've talked about sound like something you
would like added to apport/problem_report? If not, I'll just take the
discussion back to our team about using a separate field to store the
Content-Type and using something like JSON for complex data or for
name/value pairs whose names contain characters that apport doesn't
allow, and whose names we don't want to have cleaned.
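
For reference, that fallback could look roughly like this. It is only a
sketch: it uses a plain dict instead of a ProblemReport object, and the
field names ExtraData and ExtraDataType are invented, following the
Data/DataType pattern suggested above. The allowed-character rule is
the one described earlier in this thread.

```python
import json
import re

# Field names may contain letters, numbers, ".", "_", and "-".
ALLOWED_KEY = re.compile(r"[A-Za-z0-9._-]+")


def pack(pairs, json_field="ExtraData"):
    """Keep plain string fields as they are and move everything else
    (complex values, or names with other characters) into one JSON
    field, with its content type stored in a companion field."""
    report, extra = {}, {}
    for name, value in pairs.items():
        if ALLOWED_KEY.fullmatch(name) and isinstance(value, str):
            report[name] = value
        else:
            extra[name] = value
    if extra:
        report[json_field] = json.dumps(extra, sort_keys=True)
        report[json_field + "Type"] = "application/json"
    return report


report = pack({
    "ProblemType": "Oops",
    "HTTP_USER_AGENT (raw)": "Mozilla/5.0",
    "form[search text]": ["bugs", "apport"],
})
```

The original names survive unmodified inside the JSON text, so nothing
has to be cleaned, and the outer report stays a single-level dictionary
of strings.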

-Edwin