Debugging tools/approach for GPU hangs?
Bryce Harrington
bryce at canonical.com
Fri Sep 4 10:25:45 BST 2009
On Thu, Sep 03, 2009 at 05:02:45PM -0700, Matt Zimmerman wrote:
> With more of the graphics stack moving into the kernel, we are starting to
> see more bugs of this type:
>
> http://launchpad.net/bugs/359392
> http://launchpad.net/bugs/388357
> http://launchpad.net/bugs/424055
>
> Where the GPU is hung, but the system is otherwise still responsive. This
> is annoyingly difficult to debug, with the primary technique being to ssh
> into the system from a nearby one (because the console is useless).
Actually there have been GPU hang bugs for a long time. It's just that
there wasn't a way to debug them until recently.
> I think it would be a worthwhile investment to work on improved tools and
> methods for debugging this scenario, including:
>
> * Detecting (programmatically) when this situation occurs and capturing
> an apport problem report, as described in
> http://mdzlog.alcor.net/2009/06/17/collecting-debug-information-when-your-gpu-hangs/
>
> Bryce (and Jesse Barnes at Intel) mentioned that the kernel is now
> supposed to log an error message when this happens, but I've never seen
> evidence of that happening.
I'm cc'ing jbarnes here. Last I heard this was implemented upstream but
hadn't yet filtered down.
> * Providing some means for the user to get the system into a debuggable
> state, i.e. where they can see something on the screen. Maybe it's
> possible to re-POST the video device to see if it gets back to a sane
> state?
>
> * Documenting all of the above so that it can be easily executed by
> reasonably technical users
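On the detection point above: once the kernel does log a message on a hang, a small script could watch for it and then trigger a report. What follows is only a rough sketch under stated assumptions, not anything apport or the kernel ships. The function name find_gpu_hang_lines and the patterns in HANG_PATTERNS are hypothetical; the exact wording of the kernel message depends on the driver and kernel version, so the patterns would need adjusting to match what actually gets logged.

```python
import re
from typing import Iterable, List

# Assumed patterns: the real wording of the hang message varies by driver
# and kernel version, so treat these as placeholders to be adjusted.
HANG_PATTERNS = [
    re.compile(r"GPU hung", re.IGNORECASE),
    re.compile(r"GPU hang", re.IGNORECASE),
]


def find_gpu_hang_lines(log_lines: Iterable[str]) -> List[str]:
    """Return the log lines that match any of the assumed hang patterns."""
    matches = []
    for line in log_lines:
        if any(pattern.search(line) for pattern in HANG_PATTERNS):
            # Strip the trailing newline so results are easy to compare or print.
            matches.append(line.rstrip("\n"))
    return matches
```

The input could be the lines of dmesg output captured over the ssh session, or a saved kernel log file opened for reading. A non-empty result would be the cue to collect the rest of the debug information.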
Bryce