Debugging tools/approach for GPU hangs?

Jesse Barnes jesse.barnes at intel.com
Wed Sep 23 21:27:09 BST 2009


On Tue, 22 Sep 2009 09:41:45 -0700
Matt Zimmerman <mdz at canonical.com> wrote:

> On Tue, Sep 15, 2009 at 04:10:45PM -0700, Jesse Barnes wrote:
> > On Sun, 6 Sep 2009 00:23:01 -0700
> > Steve Langasek <steve.langasek at ubuntu.com> wrote:
> > > Why would we not want to pull these for karmic?  Is there a
> > > significant risk of regressions with these patches?
> > > 
> > > If the only problem is that they don't always work, that's still
> > > better than where we are now, surely.
> > 
> > FYI, Ben Gamari posted an updated version of the reset patchset.  It
> > seems to work reliably now, so you might want to pick it up at some
> > point.  It generates uevents when a reset happens, so you can
> > further track GPU hangs and bugs.
> 
> I had a look at this recently and couldn't quite figure out how to
> match the uevent.  The relevant code seems to be in
> drivers/gpu/drm/i915/i915_irq.c:i915_capture_error_state et al, but
> it's not obvious how to match that kobject in a udev rule.  Can you
> give me a hint?

You should get a uevent from the i915 drm device.  For an example of
what to match, run udevadm monitor and plug/unplug VGA; the hotplug
events it shows come from the same device.

You'll get three events, one when the error is detected, one before the
reset and one after.  Each has a different environment variable set;
the initial error has ERROR=1, the pre-reset event has RESET=1 and the
post-reset event has ERROR=0.
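
As a rough sketch of how a listener could tell those three events apart,
the following Python classifies a uevent environment by the variables
described above.  The function names are hypothetical, and two things
are assumptions rather than anything confirmed here: that the event
carries SUBSYSTEM=drm, and that the input is blank-line-separated
KEY=VALUE text such as udevadm monitor prints when asked to show
properties.

```python
import re

# Matches environment lines such as "ERROR=1"; header lines are skipped.
_KEY_VALUE = re.compile(r"^([A-Z0-9_]+)=(.*)$")


def parse_event_blocks(text):
    """Split blank-line-separated blocks of KEY=VALUE lines into dicts."""
    events, current = [], {}
    for line in text.splitlines():
        stripped = line.strip()
        match = _KEY_VALUE.match(stripped)
        if match:
            current[match.group(1)] = match.group(2)
        elif not stripped and current:
            # A blank line closes the current event block.
            events.append(current)
            current = {}
    if current:
        events.append(current)
    return events


def classify_gpu_event(env):
    """Return the hang/reset stage for one uevent environment, or None."""
    # Assumption: the i915 events arrive with SUBSYSTEM=drm.
    if env.get("SUBSYSTEM") != "drm":
        return None
    if env.get("RESET") == "1":
        return "pre-reset"
    if env.get("ERROR") == "1":
        return "error-detected"
    if env.get("ERROR") == "0":
        return "post-reset"
    # Anything else from the drm device (e.g. hotplug) is not a hang event.
    return None
```

Feeding it the captured text and mapping classify_gpu_event over the
result of parse_event_blocks gives one stage label per event, with None
for unrelated events such as hotplug.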

Does that help?

Jesse


