Debugging KMS (and thoughts thereof)

Bryce Harrington bryce at
Wed Mar 17 01:44:35 UTC 2010

As we move into the new world of KMS, I'm noticing that a lot of our
established procedures and tools for debugging modesetting problems no
longer work.

I'd devoted a good bit of study to this class of problems under UMS.
It's a bit of a shame to lose this with KMS, and unfortunately I'll be
tied up on other efforts for the remainder of the year, so I can't help
in transitioning the techniques to work with the kernel.

But, I thought I'd do a brain dump here in email, and hope others can
carry forward in developing the necessary kernel code and documenting
the procedures.

EDID Quirks
Monitor manufacturers sometimes mess up the EDID that's encoded in the
monitors, which screws up the mode algorithms in some weird but
recognizable ways.

Symptoms include incorrect resolution selection, clipped displays, poor
sync lock, chopped up displays and so on.

The analysis techniques for this class of bug involve looking at the
EDID.  I learned just today that sysfs exposes the EDID binary blob,
which is great - make sure to teach the X and kernel apport scripts and
the arsenal triaging scripts to collect this file.

Since it's a binary file, you can't read it directly, which sucks.
parse-edid will show its contents but doesn't give what you need to
examine these issues.  I think we need a new tool here, which does a
hexdump on the binary (or which extracts it from the Xorg.0.log) and
displays the fields according to the spec, so it's clearer what is
present there.  Wikipedia has a copy of the EDID spec:

You can see a list of the known quirks in drm_edid.c in the kernel.
I've always thought it should be feasible to create an algorithm which
examines EDIDs for situations where a quirk might apply, but I've never
looked deeply into this.
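
As a starting point for the kind of tool described above, here is a
minimal sketch in Python that hexdumps an EDID blob and decodes a few of
the fixed fields in the 128-byte base block.  The function names
(decode_edid_base, hexdump) are made up for illustration, and only a
handful of fields are covered; a real tool would walk the whole block,
including the detailed timing descriptors that the quirks usually
concern.

```python
import struct

# Fixed 8-byte signature that opens every EDID base block.
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])


def decode_edid_base(blob):
    """Decode a few fixed fields from the 128-byte EDID base block."""
    if len(blob) < 128:
        raise ValueError("EDID base block must be at least 128 bytes")
    block = blob[:128]
    # Manufacturer ID: big-endian 16 bits holding three 5-bit letters (1 = 'A').
    mfg = struct.unpack(">H", block[8:10])[0]
    letters = "".join(
        chr(((mfg >> shift) & 0x1F) + ord("A") - 1) for shift in (10, 5, 0)
    )
    # Product code is stored little-endian.
    product = struct.unpack("<H", block[10:12])[0]
    return {
        "header_ok": block[:8] == EDID_HEADER,
        # All 128 bytes of a valid block sum to 0 modulo 256.
        "checksum_ok": sum(block) % 256 == 0,
        "manufacturer": letters,
        "product_code": product,
        "version": "%d.%d" % (block[18], block[19]),
        "year": 1990 + block[17],
        "size_cm": (block[21], block[22]),
        "extensions": block[126],
    }


def hexdump(blob, width=16):
    """Return a plain offset + hex-bytes dump of the blob."""
    lines = []
    for off in range(0, len(blob), width):
        chunk = blob[off:off + width]
        lines.append("%04x  %s" % (off, " ".join("%02x" % b for b in chunk)))
    return "\n".join(lines)
```

To try it, read the blob in binary mode from the sysfs edid file for the
connector in question (something like /sys/class/drm/card0-VGA-1/edid;
the connector name varies by machine) and pass the bytes to both
functions.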

Anyway, beyond being able to look at the EDID, you also need to have
easy ways for users to test out possible quirks.  In UMS, there are
snippets people can insert into xorg.conf that do the equivalent of
what the quirks do, so we can have them test ideas and definitively
prove what quirk is needed.  Patching the quirk in then becomes really
straightforward.  See for examples of
how we walk the user through doing this.

AFAIK there isn't a way to tell the kernel to turn particular quirks on
or off, or to otherwise configure around the defaults.  So this is
needed.  I'm not sure how it should work for the kernel, but presumably
some kernel command line flag.

Phantom Outputs
Another common problem which (fortunately) I've not seen with KMS is
resolution problems due to having some outputs on that shouldn't be.
Maybe this is a non-issue with KMS, but I'm not sure.

The analysis technique here is to shut off the output with xrandr, and
if that works it can be quirked off.  Maybe this procedure will work
identically with KMS, but it should be tested and documented.
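
To make the xrandr procedure concrete, here is a rough Python sketch
that picks the outputs out of "xrandr --query" text and builds the
command that shuts one off.  The sample text is an illustrative made-up
listing, not a capture from a real bug report, and the helper names
(parse_outputs, off_command) are hypothetical.

```python
import re

# Matches the per-output header lines of `xrandr --query`, for example
# "LVDS1 connected 1280x800+0+0 (normal left inverted ...) 261mm x 163mm".
OUTPUT_LINE = re.compile(
    r"^(?P<name>\S+) (?P<state>connected|disconnected|unknown connection)"
    r"(?: primary)?(?: (?P<geom>\d+x\d+\+\d+\+\d+))?"
)

# Illustrative sample only; real output differs per machine and driver.
SAMPLE_QUERY = """\
Screen 0: minimum 320 x 200, current 1280 x 800, maximum 8192 x 8192
VGA1 disconnected (normal left inverted right x axis y axis)
LVDS1 connected 1280x800+0+0 (normal left inverted right x axis y axis) 261mm x 163mm
   1280x800       60.0*+
TV1 connected 1024x768+0+0 (normal left inverted right x axis y axis) 0mm x 0mm
   1024x768       30.0*
"""


def parse_outputs(query_text):
    """Return one dict per output: its name, connection state, and whether
    it currently has a mode set (a WxH+X+Y geometry on its header line)."""
    outputs = []
    for line in query_text.splitlines():
        m = OUTPUT_LINE.match(line)
        if m:
            outputs.append({
                "name": m.group("name"),
                "state": m.group("state"),
                "active": m.group("geom") is not None,
            })
    return outputs


def off_command(output_name):
    """Build the xrandr invocation that turns one output off."""
    return ["xrandr", "--output", output_name, "--off"]
```

In the sample, TV1 claims to be connected and active even though nothing
is plugged in, so off_command("TV1") gives the command to have the user
try.  Passing that list to subprocess.call() would run it.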

Resolution Selection Fallback Issues
Sometimes the issue is that some problem is causing the modesetting
logic to fail to find a suitable resolution and it goes into various
fallbacks.  These sometimes make unsatisfactory choices in the hope of
picking a safe default.  Usually we can spot this going on by looking at
the Xorg.0.log, where it prints out information about each resolution
it's considering and what timings it sees, and why it didn't pick it.

This same thing must be going on in the kernel, but I don't see anything
in dmesg or syslog about it, and I don't see much in the way of printk's
in the edid drm code.  You may find it useful to add some verbosity in
here so that there are more details in log files about why it is not
picking resolutions.

Manual Configuration of Resolutions
99.9% of the time having X autodetect the resolution (and quirking when
it doesn't) is the best way to go.  But there are still use cases that
simply cannot be solved with autodetection, and it seems like a large
oversight to have not provided some override or manual configuration
mechanism with the kernel.

As I mentioned in an earlier email, one place such a mechanism could be
hooked in is drm_helper_probe_single_connector_modes().

There are three different approaches for manual resolution
configuration that have been used historically with xorg.conf:

The first, and easiest, is to simply list a resolution (e.g. 1280x800 or
1280x800@55) in the config file.  This clues the algorithms into
selecting that resolution if at all possible.

The second involves specifying H/V sync values.  This works around
problems where the EDID is wrong, or where the modesetting algorithms
are making improper guesses of Horiz and Vert sync rates.  Usually in
this case you'll also need to manually specify the resolution(s) too.
But this is all info that comes in your monitor's manual so is not too
hard for end users to gather.  I think this takes care of cases such as
a KVM switch filtering out the EDID, or the EDID being completely
corrupt for some reason.
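
The relationship between a mode's timings and the sync rates is simple
arithmetic, so here is a small Python sketch of the check involved:
horizontal sync is the pixel clock divided by the total pixels per line,
and vertical refresh is that divided again by the total lines per frame.
The function names and the example ranges (30-60 kHz, 50-75 Hz) are
assumptions standing in for whatever the monitor's manual says.

```python
def sync_rates(clock_mhz, htotal, vtotal):
    """Return (hsync in kHz, vertical refresh in Hz) for a mode."""
    hsync_khz = clock_mhz * 1000.0 / htotal
    vrefresh_hz = clock_mhz * 1e6 / (htotal * vtotal)
    return hsync_khz, vrefresh_hz


def rejection_reasons(clock_mhz, htotal, vtotal, hsync_range, vrefresh_range):
    """Return a list of human-readable reasons the mode falls outside the
    given (min, max) ranges; an empty list means the mode fits."""
    hsync, vrefresh = sync_rates(clock_mhz, htotal, vtotal)
    reasons = []
    if not hsync_range[0] <= hsync <= hsync_range[1]:
        reasons.append("hsync %.2f kHz outside %s-%s kHz"
                       % (hsync, hsync_range[0], hsync_range[1]))
    if not vrefresh_range[0] <= vrefresh <= vrefresh_range[1]:
        reasons.append("vrefresh %.2f Hz outside %s-%s Hz"
                       % (vrefresh, vrefresh_range[0], vrefresh_range[1]))
    return reasons


# A 1280x800 timing: 83.50 MHz pixel clock, 1680 total pixels per line,
# 831 total lines per frame.
hsync, vrefresh = sync_rates(83.50, 1680, 831)
print("hsync %.2f kHz, vrefresh %.2f Hz" % (hsync, vrefresh))
print(rejection_reasons(83.50, 1680, 831, (30, 45), (50, 75)))
```

Printing the reasons for each rejected mode is also the sort of
per-mode detail that makes a log useful when the fallback logic picks
something odd.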

The third involves manually specified modelines.  This is hard core, but
necessary in some situations for working around problems.  I think this
can still be done with KMS using xrandr, but that may be of little help
if you can't get X started to begin with, and of course this isn't a
persistent configuration solution; you'd have to re-run xrandr during
startup each time and that seems suboptimal.
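
For reference, the xrandr route takes three commands: define the mode,
attach it to an output, then switch to it.  Here is a Python sketch that
turns an xorg.conf-style Modeline into those three invocations.  The
helper name xrandr_commands and the output name LVDS1 are placeholders,
and the timing numbers are just an example 1280x800 mode.

```python
import shlex

# Example timing in xorg.conf Modeline syntax; substitute your own.
MODELINE = ('Modeline "1280x800_60.00" 83.50 '
            '1280 1352 1480 1680 800 803 809 831 -hsync +vsync')


def xrandr_commands(modeline, output):
    """Turn a Modeline string into the xrandr --newmode / --addmode /
    --output ... --mode invocations, each as an argument list."""
    fields = shlex.split(modeline)  # shlex strips the quotes around the name
    if fields and fields[0].lower() == "modeline":
        fields = fields[1:]
    if len(fields) < 10:
        raise ValueError("expected a name, a clock and eight timing values")
    name = fields[0]
    timings = fields[1:]  # clock, 4 horizontal, 4 vertical, optional flags
    return [
        ["xrandr", "--newmode", name] + timings,
        ["xrandr", "--addmode", output, name],
        ["xrandr", "--output", output, "--mode", name],
    ]


for cmd in xrandr_commands(MODELINE, "LVDS1"):
    print(" ".join(cmd))
```

Since none of this persists, the printed commands would have to go into
whatever session startup script runs after X comes up.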

Another idea, which got talked about but AFAIK never implemented, is to
allow the user to put in an EDID override blob for the modesetting
system to use.  It's questionable if this would really be better than
patching around things in the kernel, however I could see this being a
popular option amongst OEMs who might prefer not having to do kernel
rebuilds, or for KVM users who could capture the monitor's EDID in
non-KVM situations and then put that in place so things work properly
when the KVM is being used.
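
The capture half of that idea is easy to sketch, since sysfs already
exposes the blob.  The following Python is a rough illustration only:
the function names and the ".edid.bin" file naming are made up, and how
the kernel would then load such a file is exactly the part that does not
exist yet.

```python
import glob
import os

# Fixed 8-byte signature that opens every EDID base block.
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])


def edid_is_valid(blob):
    """Sanity-check a blob: whole 128-byte blocks, the fixed header, and
    every block summing to 0 modulo 256."""
    if len(blob) < 128 or len(blob) % 128 != 0:
        return False
    if blob[:8] != EDID_HEADER:
        return False
    return all(sum(blob[i:i + 128]) % 256 == 0
               for i in range(0, len(blob), 128))


def capture_edids(sysfs_glob="/sys/class/drm/*/edid", dest_dir="."):
    """Copy every valid EDID found under sysfs into dest_dir, one file
    per connector, and return the list of files written."""
    saved = []
    for path in sorted(glob.glob(sysfs_glob)):
        with open(path, "rb") as f:
            blob = f.read()
        # Connectors with nothing attached read back empty; skip those
        # and anything that fails the sanity check.
        if not blob or not edid_is_valid(blob):
            continue
        connector = os.path.basename(os.path.dirname(path))
        out_path = os.path.join(dest_dir, connector + ".edid.bin")
        with open(out_path, "wb") as f:
            f.write(blob)
        saved.append(out_path)
    return saved
```

Run capture_edids() with the monitor plugged in directly, keep the
resulting file, and it would be ready to hand to an override mechanism
once one exists.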

Anyway, I hope all this info may be of some use.  I wish we had time to
implement all this for Lucid, but I guess this is one of the prices for
switching to KMS now instead of 6 months from now.  I haven't checked if
any of the above is already being implemented upstream; I hope it is,
but if not, I do think it would be well worth the effort for us to do

