[ubuntu-x] (Update) Re: -nvidia upgrade issues
alberto.milone at canonical.com
Sat Nov 7 12:18:10 GMT 2009
On Saturday 07 Nov 2009 01:21:11 Bryce Harrington wrote:
> The two worst bugs are fixed, and the other two are at least understood
> now, but I could use a bit more advice. It seems there is a weird race
> condition with DKMS/upstart/nvidia which has cropped up due to faster
> boot, and that looks tricky to get sorted, so feedback from people with
> experience in DKMS/upstart matters would be helpful.
> From what I understand, when doing an upgrade it installs both nvidia
> and a new kernel (2.6.31). At that point nvidia.ko is built against the
> *old* kernel (2.6.28). Fine, an nvidia.ko was successfully built, so
> installation completes without error. xorg.conf is updated and the
> system is ready to run nvidia. Or so it thinks.
> Now the user reboots.
> During boot, dpkg notes that it needs to build a new nvidia.ko for
> 2.6.31 and dutifully gets to work. Meanwhile, since X is being started
> early on in the boot cycle, it in fact starts up before dkms has
> finished building the new nvidia.ko. X starts loading nvidia, but since
> there is not yet an nvidia.ko for the current kernel it exits with an
> error.
> I'm going to see if I can reproduce this synthetically, but meanwhile
> does this theory make sense? If so, is there a dkms/upstart trick we
> could do to work around the issue in Karmic? And for Lucid what would
> the "right" solution be?
As far as I know, if the new kernel is installed after the nvidia package,
then /etc/init.d/dkms_autoinstaller should kick in (and build the module for
the new kernel) and things should go well.
If, however, nvidia is installed (or updated) after installing the new kernel,
the kernel module will be built only for the kernel in use.
For this reason I think it makes sense to make sure that, when we install
nvidia (or any other driver which relies on DKMS), the module is built for the
current kernel (e.g. 2.6.28) and for the most recent one (e.g. 2.6.31). If the
current kernel and the most recent one are the same, then nothing changes and
we build only 1 module.
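The kernel-selection logic could look something like the following sketch. This is only an illustration: the helper names are mine, and a real postinst would delegate the actual build to DKMS rather than print the list.

```shell
#!/bin/sh
# Sketch only: compute the set of kernels a DKMS module should be built
# for, i.e. the running kernel plus the newest installed kernel (if they
# differ).

# newest_kernel: print the highest version-sorted kernel release from stdin.
newest_kernel() {
    sort -V | tail -n 1
}

# target_kernels CURRENT NEWEST: print one kernel release per line,
# de-duplicated, for which the module should be built.
target_kernels() {
    printf '%s\n%s\n' "$1" "$2" | sort -Vu
}

# In a real postinst one would then build the module for each kernel k,
# e.g. (hypothetical module name/version):
#   dkms install -m nvidia -v 185.18.36 -k "$k"
target_kernels "$(uname -r)" "$(ls /lib/modules | newest_kernel)"
```

When the current and the newest kernel are the same, `sort -u` collapses them into a single entry, so only one module gets built, as described above.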
In my opinion, a clean implementation of this solution would involve doing
this in one place, i.e. in the really handy template which DKMS provides in
/usr/lib/dkms/common.postinst, and sourcing it from the postinst of all the
dkms-based packages (as nvidia already does). This way - as opposed to
rewriting code in the postinst script of each dkms package - we can make sure
that all dkms packages are compliant with this new behaviour and reduce
maintenance effort.
You can see how nvidia uses the dkms script here:
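Roughly, a package's postinst would delegate like this. This is a sketch: the package name/version and the argument list passed to the template are assumptions, so check /usr/lib/dkms/common.postinst itself for the exact interface.

```shell
#!/bin/sh
# Sketch of a DKMS package postinst delegating to the shared template.
# The template path can be overridden (e.g. for testing).
DKMS_COMMON_POSTINST=${DKMS_COMMON_POSTINST:-/usr/lib/dkms/common.postinst}

dkms_configure() {
    name=$1
    version=$2
    if [ -f "$DKMS_COMMON_POSTINST" ]; then
        # Delegate all build/install logic to the shared template.
        sh "$DKMS_COMMON_POSTINST" "$name" "$version"
        return $?
    fi
    echo "WARNING: $DKMS_COMMON_POSTINST does not exist." >&2
    return 1
}
```

With the new kernel-selection behaviour implemented once in common.postinst, every package calling it this way would pick it up for free.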
Would you accept a patch for the DKMS script to comply with this new
behaviour?
Better ideas are always welcome.
> > 438398 - If DKMS fails to build the kernel module, the package upgrade
> > does not fail; it reports the upgrade as successful. So this
> > leads directly to...
> > 451305 - Jockey misses that the driver failed to build, and so is not
> > letting users know about the potential problem. It goes ahead and
> > updates xorg.conf as if the driver was there. X tries to obey the
> > configuration settings, but of course they won't work, so it exits on
> > startup with an error message. *Normally* bulletproof-X would kick in
> > at this point, display the error to the user, and give them some tools
> > to diagnose and/or debug the situation. Unfortunately...
I think I can fix this by making Jockey test for the existence of the target
kernel module before it touches xorg.conf.
This can be done either in each handler (e.g. nvidia, fglrx, etc.) or in a
more generic way (maybe in the KernelModuleHandler class?) so that each
handler can benefit from this check.
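The check itself is just a file-existence test. Jockey is Python, but the idea can be illustrated in shell; the helper name, the optional modules-root parameter, and the example paths below are all assumptions for the sketch.

```shell
#!/bin/sh
# Sketch: is module $1 built for kernel release $2?
# $3 optionally overrides the modules root (default /lib/modules);
# this override exists only to make the sketch testable.
module_built_for() {
    root=${3:-/lib/modules}
    find "$root/$2" -name "$1.ko" 2>/dev/null | grep -q .
}

# Example: only enable the driver when the module is really there.
if module_built_for nvidia "$(uname -r)"; then
    echo "nvidia.ko present for $(uname -r)"
else
    echo "nvidia.ko missing for $(uname -r); do not enable the driver"
fi
```

A generic version of this test in the KernelModuleHandler class would let every DKMS-based handler benefit from it without duplicating the check.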
Any ideas on this?
Sustaining Engineer (system)
Canonical OEM Services