Debugging tools/approach for GPU hangs?
mdz at canonical.com
Fri Sep 4 01:02:45 BST 2009
With more of the graphics stack moving into the kernel, we are starting to
see more bugs of this type:
Where the GPU is hung, but the system is otherwise still responsive. This
is annoyingly difficult to debug, with the primary technique being to ssh
into the system from a nearby one (because the console is useless).
I think it would be a worthwhile investment to work on improved tools and
methods for debugging this scenario, including:
* Detecting (programmatically) when this situation occurs and capturing
an apport problem report, as described in
Bryce (and Jesse Barnes at Intel) mentioned that the kernel is now
supposed to log an error message when this happens, but I've never seen
evidence of that happening.
* Providing some means for the user to get the system into a debuggable
state, i.e. where they can see something on the screen. Maybe it's
possible to re-POST the video device to see if it gets back to a sane
* Documenting all of the above so that it can be easily executed by
reasonably technical users
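
As a rough illustration of the detection item above, here is a minimal sketch that scans the kernel log for a hang message. The patterns in HANG_PATTERNS are assumptions, since the exact text the kernel logs is not given in this message; the function names are hypothetical as well. Handing the matches to apport is left out, because the hook for that is not specified here.

```python
import re
import subprocess

# Assumed patterns: the real kernel message text is not stated in this thread,
# so adjust these to whatever the driver actually logs.
HANG_PATTERNS = [
    re.compile(r"gpu (hang|hung)", re.IGNORECASE),
    re.compile(r"hangcheck timer elapsed", re.IGNORECASE),
]


def find_gpu_hang_lines(log_lines, patterns=HANG_PATTERNS):
    """Return the log lines that match any of the hang patterns."""
    return [
        line for line in log_lines
        if any(p.search(line) for p in patterns)
    ]


def read_kernel_log():
    """Read the kernel ring buffer via dmesg (may need privileges)."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=False
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # Print every line that looks like a GPU hang report.
    for line in find_gpu_hang_lines(read_kernel_log()):
        print(line)
```

Something like this could be run over ssh from the nearby machine, or periodically in the background, so that a report is captured even when the console is unusable.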