Analysis of 10 years of

Bryce Harrington bryce at
Tue Jan 26 22:57:55 GMT 2010

On Tue, Jan 26, 2010 at 04:33:53PM -0500, Karl Fogel wrote:
> Christopher Armstrong <radix at> writes:
> >Question every graph. Correlation is not causation. Maybe bugs getting
> >75 comments is caused by something *else*, and that *other* thing is
> >also causing the bug to be hard to verify. For example, a race
> >condition bug would be hard to verify, *and* random changes to
> >environments tend to "fix" it, so you'd get lots of confused
> >commenters claiming they've found fixes and also claiming that
> >previous fixes don't work. So in this case, I would not say the
> >commenters made the bug hard to verify, but the race-condition nature
> >of the bug did.

Philosophically I see what you're getting at, but in practice my
experience has been what I said.  Given a race condition bug (or an X
freeze, or other hard-to-debug problem), it's hard enough to figure it
out with just one or two bug reporters; when dozens of random people
start tossing in their findings/ideas/complaints you end up wasting time
due to reading endless "me-too" comments, chasing false leads, being
exasperated at rude commenters...  But I've already blogged out all my
thoughts here (

> Yeah, what Christopher said. People tend to start chiming in precisely
> when the cause or fix for the bug is not immediately clear, or (say)

Heh, as a counterexample I would point to any "shed painting" exercise.
You have a clear bug with a simple solution, like say, "Make
ctrl-alt-bkspace not terminate X".  Literally a 10 minute bug to turn it
off.  Now add 75 commenters, each with their own differing idea on how
it should be solved, whether it should be solved, how to make it
configurable, etc. etc.  Your CEO even chimes in, and that only serves
to set off *another* 75 commenters...  You end up spending hours writing
specifications, implementing configuration tools, answering questions,
discussing it in meetings...  And then you spend 10 minutes and turn it
off.

On the other hand, I would agree that there can be a correlation between
"negative commentation" and "bug report quality".  Low quality bug
reports often are rather ambiguous about the symptoms, steps to
reproduce, and so on, which can mislead other people into thinking they
have the same issue and adding irrelevant comments.

> For Ubuntu, some questions we might ask are:
>   - What percentage of bugs are dups?  Does dup filing correlate with karma?
>   - Any correlation between karma and tendency-of-reported-bugs-to-close?
>   - Any correlation between tendency-of-past-reported-bugs-to-close and
>     tendency-of-bugs-reported-after-that-to-close?  (I.e., This is a
>     proxy for the "Are some people better reporters than others, and if
>     so can we identify them?" question.)

Those are good questions.  Some other related questions which might be
worth asking:

   - Correlation between description size and tendency-of-bugs-to-close?

   - Of bugs closed, is there a correlation between number of comments
     and whether it is more likely to be closed as invalid, fixed, or
     wontfix?

> But it's important to have a use case first -- knowing what we're going
> to *do* with information, before we gather it.  So we should start by
> asking bug triagers and developers what information they think would
> help them, and work backwards from there.  While I theorize that the
> above three stats would be interesting, I could easily be wrong.

Definitely true.  I already have a pretty firm idea that high-karma will
have a good correlation with high-quality-bug-reports, and that
high-quality-bug-reports correlate with issues that can be solved fairly
readily (or at least will be worth the time to analyze since the user is
probably responsive and at least moderately technical).

Identifying "better than average bug reporters" would be an interesting
metric, but probably not a useful one.  If we used it to prioritize what
bugs to work on, it would unbalance things - people who tend to get bugs
fixed for them would get even more fixed, and those that tend to have
their bugs ignored in the past would be even more ignored in the future.

Regarding dupe bugs, I don't care much - actually I encourage people to
file dupe bugs because Launchpad makes it cheap to handle dupes.

For description size, I have a feeling that you can get a really good
correlation between bug report quality and bugs that have at least a few
sentences in their description.
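
That hunch is cheap to check - something like the following, where the
sample bugs and the eight-word threshold are both invented for
illustration:

```python
# Quick check of the description-size hunch: compare close rates for
# bugs with terse vs. substantial descriptions.  Data is hypothetical.
bugs = [
    # (description, was_closed)
    ("crash", False),
    ("Sound stops after suspend. Restarting alsa fixes it. Started with 2.6.32.", True),
    ("doesnt work", False),
    ("Xorg freezes when enabling compiz. Backtrace attached. Happens on -intel only.", True),
    ("broken", False),
    ("Wifi drops every few minutes. dmesg shows deauth events. Reproducible on two laptops.", True),
]

def is_substantial(desc, min_words=8):
    # Crude proxy for "at least a few sentences".
    return len(desc.split()) >= min_words

for group in (True, False):
    subset = [closed for desc, closed in bugs if is_substantial(desc) == group]
    rate = sum(subset) / len(subset)
    label = "substantial" if group else "terse"
    print(f"{label}: {rate:.0%} closed")
```

On real data the split would of course be nowhere near this clean, but
even a modest gap between the two rates would back up the feeling.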

Basically, anything that'd help us estimate some sort of "quality score"
for a bug would be helpful, as it'd allow developers to focus their time
on the highest quality bug reports, and allow triagers to focus on ways
of improving the quality of low/medium quality bug reports.  It also
would give a nice feedback loop to bug reporters to encourage making
good bug reports.
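
A naive version of such a quality score might look like the sketch
below.  The signals, weights, and thresholds are all invented for
illustration; a real scorer would be tuned against historical bug data:

```python
# Sketch of a naive bug "quality score" combining the signals discussed
# above.  Everything here is a made-up heuristic, not a tuned model.

def quality_score(description, has_steps_to_reproduce, reporter_karma):
    score = 0
    # A few sentences of description is a strong positive signal.
    # (Counting terminators is crude: "2.6.32" inflates the count a bit.)
    sentences = sum(description.count(c) for c in ".!?")
    if sentences >= 3:
        score += 3
    elif sentences >= 1:
        score += 1
    # Explicit steps to reproduce are worth a lot to a triager.
    if has_steps_to_reproduce:
        score += 2
    # Reporter track record, with karma as a crude proxy.
    if reporter_karma > 1000:
        score += 1
    return score

good = quality_score("X freezes on resume. Happens every time. "
                     "Reverting to -intel 2.9 avoids it.", True, 5000)
bad = quality_score("it doesnt work", False, 10)
print(good, bad)  # the detailed report should clearly outrank the terse one
```

Developers could then sort their queue by score descending, while
triagers work the low/medium band trying to raise it.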


More information about the ubuntu-devel mailing list