[ubuntu-x] State of X for quantal

Bryce Harrington bryce at canonical.com
Fri Oct 5 18:23:44 UTC 2012


We're nearing the release date for 12.10.  The bug situation looks
pretty good quantitatively compared with previous cycles: we have just
over 100 bug reports needing some action or other by us, versus the
several hundred we normally have at this point in the release cycle.

I'm going to be shifting focus away from bugs for a while to work on
some other projects, but wanted to highlight some particular problem
areas that exist, and what I think needs to be done.  I'd encourage
anyone with some time and interest to try tackling some of these
problems.  This stack is going into the 12.04.2 update, so the more
stable it is, the better received the update will be by our LTS users.

Obligatory graph:

  http://www.bryceharrington.org/Arsenal/ubuntu-x-swat/Reports/totals-quantal-workqueue.svg

Starting from the bottom of the graph and working up:


1.  -nouveau stabilization

Nouveau is in a rather funky state.  We're seeing an unusually high
number of reports about graphics corruption issues, and a fairly high
proportion of server crashes and gpu lockups.

I'm not really sure what the best strategy here is.  I've heard upstream
is in the midst of some significant rewriting of their code, and thus is
more unstable than usual.  Perhaps if someone could contact them they
could give us some guidance on how to achieve better stability.


2.  -intel GPU issues

The quantity of bug reports filed against -intel is a bit deceptive,
because we have integrated bug reporting tools that collect GPU lockups
automatically.  We just have way better data on Intel than the other
drivers.

So actually, despite -intel having the most bug reports of any of our
packages, the total is waaay lower than it usually is.

i.  A number of the bug reports are False GPU Lockups.  In general these
    are just minor nuisances; however, bugs #1023691 and #1057188 show
    more serious symptoms, so they would be worth investigating and
    perhaps forwarding to Intel.

ii. Aside from those, we do have a number of legitimate GPU lockup bugs.
    What needs to be done with these is: a. make sure there are
    reliable steps to reproduce the problem, b. have the user test
    Intel's mainline kernel
    (http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-experimental/),
    and c. forward the bug upstream to bugs.freedesktop.org, with the
    dmesg and i915 gpu error file.

iii. There are also a few output or lid bugs; these are likely to be
    niche, hardware-specific bugs, and may need a quirk.


3.  xorg-server

It seems like we have a lot more Xserver crashes than usual.  However,
I think this is because of the improvements RAOF made to the apport
crash catching tool to catch crashes more reliably.

A number of the crashes appear to occur while trying to write error
messages to the screen or to log files.  Last cycle cnd determined that
these issues were due to signal-unsafe logging.  A few patches were
added, and one of our own suspect patches was removed; now either we
need more work here, or our assessment needs to be reevaluated.  This is
probably a hard problem, but if it can be figured out it would solve a
big pain point - we're getting dupe bugs of these crashes sent in daily.
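
For anyone who hasn't run into the signal-safety problem before, here is
a minimal sketch of the general idea in C.  It is not the X server's
logging code and not one of the patches mentioned above; the function
names (safe_utoa, format_signal_message, safe_write, crash_handler,
install_crash_logger) and the message text are made up for illustration.
The point is that a handler for something like SIGSEGV should avoid
printf-style formatting, malloc and stdio, and restrict itself to calls
POSIX lists as async-signal-safe, such as write(), signal() and raise().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for sigaction() under strict -std= modes */
#endif

#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Format value as decimal into buf without snprintf(), which is not
 * async-signal-safe.  Returns the number of characters written (no NUL
 * terminator), or 0 if buf is too small. */
static size_t safe_utoa(unsigned long value, char *buf, size_t size)
{
    char tmp[3 * sizeof(unsigned long)];   /* enough digits for any value */
    size_t n = 0, i;

    do {
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    if (n > size)
        return 0;
    for (i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];           /* digits were produced in reverse */
    return n;
}

/* Build "caught signal <N>\n" in a caller-supplied buffer.
 * Returns the message length; output is truncated if buf is too small. */
static size_t format_signal_message(int signo, char *buf, size_t size)
{
    static const char prefix[] = "caught signal ";
    size_t len = 0, i;

    for (i = 0; prefix[i] != '\0' && len < size; i++)
        buf[len++] = prefix[i];
    len += safe_utoa((unsigned long)signo, buf + len, size - len);
    if (len < size)
        buf[len++] = '\n';
    return len;
}

/* Write the whole buffer with write(2), retrying on EINTR and on
 * short writes.  Gives up silently on any other error. */
static void safe_write(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return;
        }
        buf += n;
        len -= (size_t)n;
    }
}

/* Hypothetical crash handler: log using only async-signal-safe calls,
 * then restore the default action and re-raise so the process still
 * terminates (and dumps core) the way it normally would. */
static void crash_handler(int signo)
{
    char buf[64];
    int saved_errno = errno;
    size_t len = format_signal_message(signo, buf, sizeof(buf));

    safe_write(STDERR_FILENO, buf, len);
    errno = saved_errno;
    signal(signo, SIG_DFL);
    raise(signo);
}

/* Install the handler for SIGSEGV.  Returns 0 on success, -1 on error. */
int install_crash_logger(void)
{
    struct sigaction sa;

    sa.sa_handler = crash_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGSEGV, &sa, NULL) == 0 ? 0 : -1;
}
```

Calling install_crash_logger() early in main() is enough to try this
out.  The 64-byte buffer and the choice of SIGSEGV are arbitrary; a real
server would have more signals and more context to report, which is
where the difficulty comes in.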

The next actions here are:  A) Evaluate all the stack traces for obvious
causes - null pointer derefs, corrupt memory, etc.  B) For any bug
reports with known steps to reproduce, either reproduce them ourselves
locally or forward them upstream, or both.  C) For bugs where we lack
steps to reproduce, push back to reporters to try and figure that out.
D) Review discussions with cnd from last cycle (check the ubuntu-x@
mailing list archives), and identify further next steps.


4.  nvidia

We actually appear to be in good shape here, as there are fewer than a
dozen bug reports.  However, there's one bug particular to 304.51 that
needs attention ASAP.  We may want to revert 304.51 until that gets
sorted.

Other than that, it would be good for someone to go through the
remaining bugs and see if they're real nvidia issues (some smell
mis-reported, others may need to be retested against 304.48), and any
that look reproducible should be flagged to tseliot to forward to NVIDIA.


5.  mesa

Mesa also has only about a dozen bug reports, but many of these
represent more serious problems.  Some are llvm-related; dropping
unity2d exposed some driver problems, not unexpectedly.  For these,
mainly we need a plan of attack identified.  Are they simple glitches
that can be fixed with a patch, or more extensive problems that will
require upstream development work to resolve?  Some of these problems
will be lower priority than others (like breakage on obscure hardware)
so kick stuff up to High priority if it looks important.

There's also a handful of compiz issues that should be reviewed.  Some
of these may not be mesa, and just need to be isolated and re-filed.
For the remainder, if they're reproducible in mesa 9, they should be
forwarded upstream and SRU'd if/when a patch becomes available.


6.  fglrx

Nearly all of these bug reports are invalid or at least need to be
re-tested with the recently uploaded 9.000.  Anything that's a
reproducible fglrx issue (or a crash with a good stack trace) that's
been confirmed on 9.000 should be flagged to tseliot to forward to AMD.



If I had to pick just one of the above, #3 is most likely to give the
most bang for the buck.  We have good data on most of those crashes, and
many have been confirmed by two or more users.

The (very) good news is that nearly all the problems listed above are
either localized bugs or hardware-specific issues.  And just numerically
there are far fewer reports than usual (although I suspect many users
have been sticking with 12.04, so perhaps we have fewer eyeballs than
normal).  Even if we did no further bug work from now on, this release
is already a solid improvement over 12.04 for almost all users.

Bryce


