NVidia RTX 3060 and 21.04

EdLesMann edlesmann at gmail.com
Mon Apr 12 02:20:16 UTC 2021


On 4/10/21 10:38 PM, chris hermansen wrote:
 > Hello again everyone,
 >
 > Hoping to beg a bit of advice.  I have a brand new computer with an NVidia
 > RTX 3060 and I'm running the daily 21.04 on it.  Install goes fine but
 > "after awhile" I seem to end up with a manually installed driver for the
 > RTX and "after a bit more while" all of a sudden I'm looking at a 1320 X
 > 768 display.
 >
 > I've made this problem go away temporarily with
 >
 > sudo apt purge nvidia-\*
 >
 > and poof I'm back to a reasonable 1920 x 1080 display.
 >
 > As far as I know I did not "manually install" anything.  I'm a bit lost as
 > to how to report this as a bug, given that it's an NVidia thing and also
 > this weird "manually installed" package.
 >
 > If anyone has any suggestions I'd be most grateful.
 >



Greetings,

It's been a long time since I've used Nvidia on Ubuntu. So I'm going to 
start with a package I know. Hopefully I don't simplify my explanation 
to the point of being incorrect. :-D

linux-generic is a "package" that really isn't a package. It's a 
metapackage, more like a pointer to the latest version of the kernel. 
If you never flush out 
old kernels when you update, eventually (I think it's three old 
versions??) you will find that the repo no longer lists the kernels as 
available and apt will then basically say "I can't find this package 
anymore in the source list so therefore this package must have been 
installed manually." And that's how you can end up with "manually" 
installed packages that were actually installed by Ubuntu - it happens 
all the time.
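
One way to see how apt has actually classified packages is sketched below. It assumes the standard `apt-mark showmanual` and `apt-mark showauto` commands; the helper name `filter_pkgs` is made up for illustration and only filters a list of package names by prefix.

```shell
# filter_pkgs: hypothetical helper (not part of apt). Reads package names
# on stdin, one per line, and prints those starting with the given prefix.
filter_pkgs() {
    prefix="${1:-nvidia}"
    # "|| true" keeps the function from failing when nothing matches.
    grep -E "^${prefix}" || true
}

# Example usage on a real system (not run here):
#   apt-mark showmanual | filter_pkgs nvidia   # marked as manually installed
#   apt-mark showauto   | filter_pkgs nvidia   # pulled in as dependencies
```

If the nvidia packages show up in the first list even though nobody asked for them by name, that is worth mentioning in a bug report.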

My *guess* is that you have an nvidia pointer package that is installing 
drivers for you. Eventually the version you are running is falling out 
of sync with the upstream source and thus your specific version is 
falling into "I can't find it anymore so it must be a manual install".

Also, *IF* I remember correctly NVidia drivers are built to a specific 
version of the kernel but can be "loosely" linked. Thus, an NVidia 
driver for kernel 5.4.0.66.68 could work on 5.4.0.66.69 without a 
recompile, provided that nothing major changed in the kernel specific to 
its hooks. But eventually you will update to a version of the kernel 
that isn't loosely linked, and if the nvidia driver isn't automatically 
rebuilt with the update, then the driver is going to work very poorly.

Here's my suspicion. If removing the NVidia drivers solves your problem, 
then you want to stay on the nouveau drivers. Some update somewhere 
is telling your system to update to the NVidia drivers. They get 
out of sync and those drivers are then listed as "manual". Eventually 
those drivers clash/conflict with the latest kernel update and you end 
up with a bad display. I most often see this kind of thing happen with 
auto-updates in the background that pick the "best" option for you.

[quick side rant]
Telling someone to disable auto-updates is **TERRIBLE** advice. But this 
nonsense about drivers getting out of line is _precisely_ why I disable 
auto-update. I _loathe_ it when I leave a perfectly functional system one 
night and log in the next morning to a busted upgrade... But I'm also 
paranoid and responsible enough that I subscribe to all the RSS feeds 
for all the security notices on my systems and I have planned times when 
I apply updates... But because the vast majority of users can't be 
bothered to actually do regular updates, then updates have to be forced 
on them automatically to prevent massive security issues and evil 
botnets which unfortunately means user systems "break unexpectedly".
*sigh*
[end rant]

So what should you do? If you really don't want the NVidia drivers, then 
open up "software sources" and look for the "additional drivers" tab. 
Make sure that NVidia is disabled here. If you *DO* want the NVidia 
drivers, then make sure the appropriate driver is selected (if I recall 
correctly, there's like a stable, beta, and maybe something else?). And 
if you can't find "software sources", drop to the command line and try 
typing `software-properties` then hit the tab key to see if there are 
auto-completes for qt or gtk or whatever. It *should* be installed 
already if you did a default Ubuntu install though.

From that same interface, on a different tab, you can also check to see 
what updates happen, and at what time interval updates are checked for 
and applied. I recommend at least knowing how frequently this is 
happening on your system.
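
The same settings can be read from the command line. On a default Ubuntu install they normally live as `APT::Periodic` options in files under /etc/apt/apt.conf.d/ (commonly 20auto-upgrades), and the background upgrades themselves are normally done by the unattended-upgrades package. The sketch below assumes that layout; `periodic_settings` is a made-up helper name.

```shell
# periodic_settings: hypothetical helper. Prints "option value" for every
# APT::Periodic line found in the apt configuration files given as arguments.
periodic_settings() {
    sed -n 's/^[[:space:]]*APT::Periodic::\([A-Za-z-]*\)[[:space:]]*"\([^"]*\)";.*/\1 \2/p' "$@"
}

# Example usage on a real system (not run here):
#   periodic_settings /etc/apt/apt.conf.d/*
#   apt-config dump | grep Periodic     # apt's own merged view
```

A value of "1" for Unattended-Upgrade usually means upgrades are applied daily; "0" means they are off.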

Also, Ubuntu can churn through kernels. I recommend staying on top of 
any automated process that is updating behind the scenes. Knowing when 
and how often your computer is updating the kernel and then verifying if 
it is installing the nvidia drivers alongside it might help you narrow 
down the who. Once you find the who - then you can file a bug report.

It's been too long since I've really poked at the auto-update process 
for Ubuntu (besides just flat out turning it off!). I honestly don't 
remember the package/process name that does the auto update. Maybe 
someone else can tell you exactly the package/process causing a problem. 
But I can tell you how I'd start looking for it.

Start with /var/log/dpkg.log and look for clues as to when and what 
packages are being installed. For example, let's say you notice a line 
that looks something like this (this is fabricated so it may not match 
exactly - in fact I'm making up version numbers! :-D ):

2021-04-11 20:40:57 status installed nvidia-driver:all 5.1.2.23
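
To pull lines like that out of the log, something along these lines should work. It is only a sketch: it assumes the usual dpkg.log layout of date, time, "status", state, package, version, and `nvidia_installs` is a made-up helper name.

```shell
# nvidia_installs: hypothetical helper. Prints date, time and package name
# for every "status installed" line in a dpkg log that mentions nvidia.
# Usage: nvidia_installs [logfile]   (defaults to /var/log/dpkg.log)
nvidia_installs() {
    log="${1:-/var/log/dpkg.log}"
    awk '$3 == "status" && $4 == "installed" && $5 ~ /nvidia/ { print $1, $2, $5 }' "$log"
}

# Example usage on a real system (not run here):
#   nvidia_installs
#   nvidia_installs /var/log/dpkg.log.1    # previous, rotated log
```

Older entries may have been rotated into dpkg.log.1 or compressed .gz files, so an empty result from the current log does not prove nothing was installed.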

Great! You know that the package was installed at that time. Next take a 
look at the file /var/log/syslog to find out what was going on at that 
time. Perhaps you might see something that looks similar to this (again, 
making up a few numbers but it should be close):

Apr 11 20:34:56 cohen systemd[1]: Started Run anacron jobs.
Apr 11 20:34:56 cohen anacron[1854232]: Anacron 2.3 started on 2021-04-11
Apr 11 20:47:12 cohen systemd[1]: anacron.service: Succeeded.
Apr 11 20:47:12 cohen anacron[1854232]: Normal exit (1 jobs run)

Great! That tells you that anacron had some job that ran for ~12 minutes, 
which is probably a good indicator that it was busy for a while, and that 
aligns well with a package upgrade. Maybe it isn't anacron. Maybe it is 
/etc/cron.hourly or /etc/cron.daily. But it is probably some cron 
service running the script to auto update. At that point, you just track 
down what application is installing the nvidia drivers. Take a look at 
the cron jobs to see which one looks closest to updating packages.
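
Narrowing syslog down to the minutes around the install can be sketched like this. It assumes the traditional "Mon DD HH:MM:SS host process: message" syslog layout; `syslog_window` is a made-up helper name.

```shell
# syslog_window: hypothetical helper. Prints syslog lines for one day whose
# time falls between START and END (inclusive, HH:MM:SS, compared as text).
# Usage: syslog_window MONTH DAY START END [logfile]
syslog_window() {
    awk -v mon="$1" -v day="$2" -v start="$3" -v end="$4" \
        '$1 == mon && $2 == day && $3 >= start && $3 <= end' \
        "${5:-/var/log/syslog}"
}

# Example usage on a real system (not run here):
#   syslog_window Apr 11 20:30:00 20:50:00
```

On systems that log to the systemd journal, `journalctl --since "2021-04-11 20:30" --until "2021-04-11 20:50"` gives a similar view.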

Bonus! If you can actually catch all the logs of this happening then it 
helps the bug report. It is much more helpful if you can say "This 
system was working perfectly, then this update with these syslog files 
installed these nvidia packages from dpkg.log which then broke my 
system. I then removed nvidia to make my system work again."

Hope that helps some. And I hope you find the package that is causing 
problems so that someone who knows what they are talking about can fix 
it! (because if it is causing you problems it's probably causing someone 
else problems too!)
:-D

Ed
