Debugging tools/approach for GPU hangs?
Jesse Barnes
jesse.barnes at intel.com
Wed Sep 23 21:27:09 BST 2009
On Tue, 22 Sep 2009 09:41:45 -0700
Matt Zimmerman <mdz at canonical.com> wrote:
> On Tue, Sep 15, 2009 at 04:10:45PM -0700, Jesse Barnes wrote:
> > On Sun, 6 Sep 2009 00:23:01 -0700
> > Steve Langasek <steve.langasek at ubuntu.com> wrote:
> > > Why would we not want to pull these for karmic? Is there a
> > > significant risk of regressions with these patches?
> > >
> > > If the only problem is that they don't always work, that's still
> > > better than where we are now, surely.
> >
> > FYI, Ben Gamari posted an updated version of the reset patchset. It
> > seems to work reliably now, so you might want to pick it up at some
> > point. It generates uevents when a reset happens, so you can
> > further track GPU hangs and bugs.
>
> I had a look at this recently and couldn't quite figure out how to
> match the uevent. The relevant code seems to be in
> drivers/gpu/drm/i915/i915_irq.c:i915_capture_error_state et al, but
> it's not obvious how to match that kobject in a udev rule. Can you
> give me a hint?
You should get a uevent from the i915 drm device ("udevadm monitor" will
show hotplug events when you plug/unplug VGA; you can use them as an
example).

You'll get three events: one when the error is detected, one before the
reset, and one after. Each has a different environment variable set:
the initial error has ERROR=1, the pre-reset event has RESET=1, and the
post-reset event has ERROR=0.
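To make the three-event sequence concrete, here is a minimal Python sketch that sorts a uevent's environment into one of the stages described above. The function names, the stage labels, and the SUBSYSTEM=drm check are assumptions made for illustration; only the ERROR and RESET variables come from the description above. The parser assumes KEY=VALUE property lines such as those printed by "udevadm monitor --property".

```python
def parse_uevent_properties(text):
    """Parse KEY=VALUE lines into a dict, skipping lines without '='."""
    env = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key:
            env[key] = value
    return env


def classify_gpu_event(env):
    """Return a stage label for a GPU error/reset uevent, or None.

    Assumption: the event comes from the drm subsystem (SUBSYSTEM=drm).
    The stage labels are hypothetical names chosen for this sketch.
    """
    if env.get("SUBSYSTEM") != "drm":
        return None
    if env.get("RESET") == "1":
        return "pre-reset"        # sent just before the GPU is reset
    if env.get("ERROR") == "1":
        return "error-detected"   # sent when the hang is first detected
    if env.get("ERROR") == "0":
        return "post-reset"       # sent once the reset has completed
    return None                   # e.g. an ordinary hotplug event


if __name__ == "__main__":
    sample = "ACTION=change\nSUBSYSTEM=drm\nERROR=1\n"
    print(classify_gpu_event(parse_uevent_properties(sample)))
```

A udev rule would match on the same variables; the sketch only shows which value marks which stage.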
Does that help?
Jesse