NVidia RTX 3060 and 21.04
EdLesMann
edlesmann at gmail.com
Mon Apr 12 02:20:16 UTC 2021
On 4/10/21 10:38 PM, chris hermansen wrote:
> Hello again everyone,
>
> Hoping to beg a bit of advice. I have a brand new computer with an NVidia
> RTX 3060 and I'm running the daily 21.04 on it. Install goes fine but
> "after awhile" I seem to end up with a manually installed driver for the
> RTX and "after a bit more while" all of a sudden I'm looking at a 1320 X
> 768 display.
>
> I've made this problem go away temporarily with
>
> sudo apt purge nvidia-\*
>
> and poof I'm back to a reasonable 1920 x 1080 display.
>
> As far as I know I did not "manually install" anything. I'm a bit lost as
> to how to report this as a bug, given that it's an NVidia thing and also
> this weird "manually installed" package.
>
> If anyone has any suggestions I'd be most grateful.
>
Greetings,
It's been a long time since I've used Nvidia on Ubuntu. So I'm going to
start with a package I know. Hopefully I don't simplify my explanation
to the point of being incorrect. :-D
linux-generic is a "package" that really isn't a package. It's more like
a pointer (a metapackage) to the latest version of the kernel. If you never flush out
old kernels when you update, eventually (I think it's three old
versions??) you will find that the repo no longer lists the kernels as
available and apt will then basically say "I can't find this package
anymore in the source list so therefore this package must have been
installed manually." And that's how you can end up with "manually"
installed packages that were actually installed by Ubuntu - it happens
all the time.
My *guess* is that you have an nvidia pointer package that is installing
drivers for you. Eventually the version you are running is falling out
of sync with the upstream source and thus your specific version is
falling into "I can't find it anymore so it must be a manual install".
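If you want to see what apt currently considers manually installed, `apt-mark showmanual` prints one package name per line. Here is a minimal Python sketch that filters that output for nvidia-related names. The helper names `nvidia_manual_packages` and `list_manual_nvidia` are made up for illustration, and matching on the substring "nvidia" is just an assumption about how the driver packages are named.

```python
import subprocess


def nvidia_manual_packages(showmanual_output: str) -> list[str]:
    """Return the nvidia-related names found in `apt-mark showmanual` output."""
    names = [line.strip() for line in showmanual_output.splitlines()]
    # Assumption: every driver-related package has "nvidia" somewhere in its name.
    return sorted(name for name in names if "nvidia" in name)


def list_manual_nvidia() -> list[str]:
    """Run apt-mark on this machine and filter its output."""
    result = subprocess.run(
        ["apt-mark", "showmanual"], capture_output=True, text=True, check=True
    )
    return nvidia_manual_packages(result.stdout)
```

Calling `list_manual_nvidia()` before and after the display breaks would show whether the set of "manual" nvidia packages changed in between.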
Also, *IF* I remember correctly NVidia drivers are built to a specific
version of the kernel but can be "loosely" linked. Thus, an NVidia
driver for kernel 5.4.0.66.68 could work on 5.4.0.66.69 without
recompile provided that nothing major changed in the kernel specific to
its hooks. But eventually, you will update to a version of the kernel
that isn't loosely linked and if the nvidia driver isn't automatically
rebuilt with the update then things are going to go very poorly in
driver quality.
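A rough way to check whether a driver module exists for the kernel you are actually running is to look for nvidia module files under that kernel's module tree. The sketch below assumes the usual `/lib/modules/<release>` layout and that the module files are named `nvidia*.ko` (possibly with a compression suffix). The function names are hypothetical. Finding a file does not prove the module loads, and finding none does not prove anything beyond "not built for this kernel".

```python
import os
from pathlib import Path


def nvidia_modules_for(kernel_release: str, modules_root: str = "/lib/modules") -> list[Path]:
    """List nvidia*.ko* files under the module tree of one kernel release."""
    tree = Path(modules_root) / kernel_release
    if not tree.is_dir():
        return []
    # Search recursively, since the subdirectory depends on how the module was built.
    return sorted(p for p in tree.rglob("nvidia*.ko*") if p.is_file())


def report() -> None:
    release = os.uname().release  # same string that `uname -r` prints
    found = nvidia_modules_for(release)
    if found:
        for path in found:
            print(path)
    else:
        print(f"no nvidia kernel module found for {release}")
```

Running `report()` right after a kernel update would tell you whether the driver was rebuilt alongside the new kernel.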
Here's my suspicion. If removing the NVidia drivers solves your problem,
then you are wanting to stay on the nouveau drivers. Some update
somewhere is telling your system to update to the NVidia drivers. They get
out of sync and those drivers are then listed as "manual". Eventually
those drivers clash/conflict with the latest kernel update and you end
up with a bad display. I most often see this kind of thing happen with
auto-updates in the background that pick the "best" option for you.
[quick side rant]
Telling someone to disable auto-updates is **TERRIBLE** advice. But this
nonsense about drivers getting out of line is _precisely_ why I disable
auto-update. I _loathe_ it when I leave a perfectly functional system one
night and log in the next morning to a busted upgrade... But I'm also
paranoid and responsible enough that I subscribe to all the RSS feeds
for all the security notices on my systems and I have planned times when
I apply updates... But because the vast majority of users can't be
bothered to actually do regular updates, then updates have to be forced
on them automatically to prevent massive security issues and evil
botnets which unfortunately means user systems "break unexpectedly".
*sigh*
[end rant]
So what should you do? If you really don't want the NVidia drivers, then
open up "software sources" and look for the "additional drivers" tab.
Make sure that NVidia is disabled here. If you *DO* want the NVidia
drivers, then make sure the appropriate driver is selected (if I recall
correctly, there's like a stable, beta, and maybe something else?). And
if you can't find "software sources", drop to the command line and try
typing `software-properties` then hit the tab key to see if there are
auto-completes for qt or gtk or whatever. It *should* be installed
already if you did a default Ubuntu install though.
From that same interface, on a different tab, you can also check to see
what updates happen, and at what time interval updates are checked for
and applied. I recommend at least knowing how often this happens on
your system.
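The same update intervals can be read from the command line. `apt-config dump` prints apt's merged configuration, and the periodic update settings appear there as `APT::Periodic::<Name> "<value>";` lines, where the value is an interval in days and 0 means disabled. A minimal sketch that pulls those out follows. The function names are hypothetical, and which settings are present depends on the install.

```python
import re
import subprocess

# Matches lines such as: APT::Periodic::Unattended-Upgrade "1";
PERIODIC = re.compile(r'^\s*APT::Periodic::([\w-]+)\s+"([^"]*)"\s*;', re.MULTILINE)


def periodic_settings(config_text: str) -> dict[str, str]:
    """Extract APT::Periodic settings from apt configuration text; later entries win."""
    return {name: value for name, value in PERIODIC.findall(config_text)}


def current_periodic_settings() -> dict[str, str]:
    """Ask apt-config for the merged configuration on this machine."""
    dump = subprocess.run(
        ["apt-config", "dump"], capture_output=True, text=True, check=True
    )
    return periodic_settings(dump.stdout)
```

If `current_periodic_settings()` reports `Unattended-Upgrade` as `"1"`, packages are being upgraded in the background daily, which is a good first suspect.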
Also, Ubuntu can churn through kernels. I recommend staying on top of
any automated process that is updating behind the scenes. Knowing when
and how often your computer is updating the kernel and then verifying if
it is installing the nvidia drivers alongside it might help you narrow
down the who. Once you find the who - then you can file a bug report.
It's been too long since I've really poked at the auto-update process
for Ubuntu (besides just flat out turning it off!). I honestly don't
remember the package/process name that does the auto update. Maybe
someone else can tell you exactly the package/process causing a problem.
But I can tell you how I'd start looking for it.
Start with /var/log/dpkg.log and look for clues as to when and what
packages are being installed. For example, let's say you notice a line
that looks something like this (this is fabricated so it may not match
exactly - in fact I'm making up version numbers! :-D ):
2021-04-11 20:40:57 status installed nvidia-driver:all 5.1.2.23
Great! You know that the package was installed at that time. Next take a
look at the file /var/log/syslog to find out what was going on at that
time. Perhaps you might see something that looks similar to this (again,
making up a few numbers but it should be close):
Apr 11 20:34:56 cohen systemd[1]: Started Run anacron jobs.
Apr 11 20:34:56 cohen anacron[1854232]: Anacron 2.3 started on 2021-04-11
Apr 11 20:47:12 cohen systemd[1]: anacron.service: Succeeded.
Apr 11 20:47:12 cohen anacron[1854232]: Normal exit (1 jobs run)
Great! That tells you that anacron had some job that ran for ~13 minutes
which is probably a good indicator that it was busy for a while and that
aligns well to a package upgrade. Maybe it isn't anacron. Maybe it is
/etc/cron.hourly or /etc/cron.daily. But it is probably some cron
service running the script to auto update. At that point, you just track
down what application is installing the nvidia drivers. Take a look in
the cron service to see which looks closest to updating packages.
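The dpkg.log and syslog cross-check described above can be scripted. This is only a sketch under a few assumptions: dpkg.log lines look like `DATE TIME status installed PACKAGE VERSION`, syslog lines start with a classic `Apr 11 20:34:56` style stamp (no year, English month names), and the package name contains "nvidia". The function names and the 15 minute window are my own choices, nothing official.

```python
from datetime import datetime, timedelta


def nvidia_installs(dpkg_log_lines):
    """Yield (timestamp, package, version) for 'status installed' lines naming nvidia."""
    for line in dpkg_log_lines:
        parts = line.split()
        # Expected shape: DATE TIME status installed PACKAGE VERSION
        if len(parts) >= 6 and parts[2:4] == ["status", "installed"] and "nvidia" in parts[4]:
            when = datetime.strptime(f"{parts[0]} {parts[1]}", "%Y-%m-%d %H:%M:%S")
            yield when, parts[4], parts[5]


def syslog_near(syslog_lines, when, minutes=15):
    """Return syslog lines stamped within `minutes` of `when`."""
    window = timedelta(minutes=minutes)
    nearby = []
    for line in syslog_lines:
        try:
            # The stamp carries no year, so borrow it from the dpkg timestamp.
            stamp = datetime.strptime(f"{when.year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # not a line in the expected format
        if abs(stamp - when) <= window:
            nearby.append(line.rstrip("\n"))
    return nearby
```

To use it, open `/var/log/dpkg.log` and `/var/log/syslog`, pass the file objects (or lists of lines) to these two functions, and read through whatever `syslog_near` returns for each install time. Reading syslog may need sudo or membership in the adm group.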
Bonus! If you can actually catch all the logs of this happening then it
helps the bug report. It is much more helpful if you can say "This
system was working perfectly, then this update with these syslog files
installed these nvidia packages from dpkg.log which then broke my
system. I then removed nvidia to make my system work again."
Hope that helps some. And I hope you find the package that is causing
problems so that someone who knows what they are talking about can fix
it! (because if it is causing you problems it's probably causing someone
else problems too!)
:-D
Ed
More information about the Ubuntu-quality
mailing list