Analysis of 10 years of bugzilla.mozilla.org

Karl Fogel karl.fogel at canonical.com
Tue Jan 26 21:33:53 GMT 2010


Christopher Armstrong <radix at twistedmatrix.com> writes:
>Question every graph. Correlation is not causation. Maybe bugs getting
>75 comments is caused by something *else*, and that *other* thing is
>also causing the bug to be hard to verify. For example, a race
>condition bug would be hard to verify, *and* random changes to
>environments tend to "fix" it, so you'd get lots of confused
>commenters claiming they've found fixes and also claiming that
>previous fixes don't work. So in this case, I would not say the
>commenters made the bug hard to verify, but the race-condition nature
>of the bug did.

Yeah, what Christopher said. People tend to start chiming in precisely
when the cause or fix for the bug is not immediately clear, or (say)
when there is disagreement about whether the bug is really a bug or not
-- another circumstance that may cause the bug not to be marked as
"verified" for a while.

For Ubuntu, some questions we might ask are:

  - What percentage of bugs are dups?  Does dup filing correlate with
    karma?  (See the rough sketch after this list.)

  - Any correlation between karma and tendency-of-reported-bugs-to-close?

  - Any correlation between tendency-of-past-reported-bugs-to-close and
    tendency-of-bugs-reported-after-that-to-close?  (I.e., this is a
    proxy for the "Are some people better reporters than others, and if
    so can we identify them?" question.)
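
For illustration only, here's a rough sketch of how the first question
might be computed.  It assumes a hypothetical CSV export of Ubuntu bug
data with columns named bug_id, reporter, reporter_karma, and
is_duplicate -- I haven't checked any of that against what Launchpad
actually exposes, so treat the column names and the export file as
placeholders:

    import pandas as pd

    # Hypothetical export of bug reports; file name and columns are assumptions.
    bugs = pd.read_csv("ubuntu-bugs-export.csv")

    # What percentage of reported bugs are duplicates?
    dup_pct = 100.0 * bugs["is_duplicate"].mean()
    print("Duplicate bugs: %.1f%%" % dup_pct)

    # Does dup filing correlate with reporter karma?  Aggregate per
    # reporter, drop reporters with only a handful of bugs so a single
    # dup doesn't dominate, then look at the rank correlation between
    # karma and dup rate.
    per_reporter = bugs.groupby("reporter").agg(
        karma=("reporter_karma", "max"),
        dup_rate=("is_duplicate", "mean"),
        bugs_filed=("bug_id", "count"),
    )
    per_reporter = per_reporter[per_reporter["bugs_filed"] >= 5]
    print(per_reporter["karma"].corr(per_reporter["dup_rate"],
                                     method="spearman"))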

But it's important to have a use case first -- knowing what we're going
to *do* with the information before we gather it.  So we should start by
asking bug triagers and developers what information they think would
help them, and work backwards from there.  While I theorize that the
above three stats would be interesting, I could easily be wrong.

-K, taking a breather from a bizarre behavior in his bug #506018 branch


