[Bug 624877] Re: INFO: task dpkg:23317 blocked for more than 120 seconds.
Theodore Ts'o
tytso at mit.edu
Thu Sep 13 21:33:47 UTC 2012
There are two issues here that interact, which is why they are confusing
people. The first is that the kernel has a potential livelock problem
in the writeback code: if new pages requiring writeback are constantly
being dirtied, the sync(2) system call will never return (at least not
until all of the pages are clean, and on a busy system with lots of
processes writing to the disk, that might never happen).
It doesn't happen every time sync(2) is called, but since dpkg was
calling sync(2) all the time, it tended to happen there. Still, this
problem can happen without dpkg being involved at all, and on many
different file systems, since it's a problem with the generic writeback
code. Trying to backport this fix to the ancient kernel in
10.04 is going to be _hard_. There are people at Red Hat who are paid
the big bucks to do this kind of painful backporting (which in this case
means multiple patches, spread across multiple kernel releases before the
problem was finally fixed, with all sorts of dependencies). Good luck finding
a volunteer willing to figure this out. I wouldn't --- I would much
rather run a 3.x kernel. And if I had a business that needed to use a
stable enterprise kernel, I'd pay the darned Red Hat or SLES support
fees, and get a professionally managed enterprise kernel.

Unfortunately, in my experience Canonical doesn't have paid kernel
engineers with either the skill or the bandwidth (not sure which) to
do this kind of very tricky backporting to ancient LTS kernels, the
way Red Hat does. I've seen this with ext4 bug fixes that never get
backported to 10.04, but that Red Hat has been willing to backport to
their RHEL6 kernel.

Note that this problem is much less likely to hit desktop/laptop
systems, where there generally aren't servers continuously writing to the
file system. So for most Ubuntu systems, which tend not to be production
servers running highly stressful workloads, this won't be an issue.
The people complaining on this Launchpad bug are probably outliers,
which would explain the low priority Canonical's paid engineers give
to this kind of backporting.

The second issue is the fix to dpkg, which significantly improves its
performance and reduces its impact on the system as a whole by using
sync_file_range() instead of sync(). Fixing this also tends to remove
one of the more common ways of tickling the bug above, but that's not
the only reason to backport this dpkg package: it also speeds up
package installs and decreases their overall system impact.
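
To illustrate the difference, here is a minimal sketch in C of flushing
a single file with sync_file_range() and fsync() rather than calling
sync(), which waits on every dirty page in the system. This is not
dpkg's actual code: the helper name write_and_flush_file() is
hypothetical, and the choice to follow up with fsync() is an assumption
made here because sync_file_range() on its own does not write file
metadata and so is not a durability guarantee.

```c
#define _GNU_SOURCE             /* exposes sync_file_range() in <fcntl.h> */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from dpkg): write len bytes from buf to
 * path, then flush only that file instead of calling sync().
 * Returns 0 on success, -1 on error.
 */
int write_and_flush_file(const char *path, const char *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may be partial, so loop until everything is written. */
    size_t off = 0;
    while (off < len) {
        ssize_t n = write(fd, buf + off, len - off);
        if (n < 0) {
            close(fd);
            return -1;
        }
        off += (size_t)n;
    }

    /*
     * Ask the kernel to start writeback of this file's dirty pages.
     * Offset 0 with nbytes 0 means "from the start through end of file".
     * Treated as a best-effort hint here, so a failure is ignored.
     */
    (void)sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE);

    /* Wait for this one file's data and metadata to reach the disk. */
    if (fsync(fd) < 0) {
        close(fd);
        return -1;
    }

    return close(fd);
}
```

A caller would use it as write_and_flush_file("/tmp/demo.txt", "hello\n", 6)
and check for a return value of 0. Because only one file descriptor is
flushed, other processes dirtying pages elsewhere cannot keep this call
waiting the way they can with sync().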
Or people could just upgrade their systems to Ubuntu 12.04 LTS...
--
https://bugs.launchpad.net/bugs/624877
Title:
INFO: task dpkg:23317 blocked for more than 120 seconds.
Status in The Linux Kernel:
Expired
Status in “dpkg” package in Ubuntu:
Fix Released
Status in “linux” package in Ubuntu:
Confirmed
Status in “dpkg” source package in Lucid:
Triaged
Status in “dpkg” package in Debian:
Fix Released
Bug description:
[Impact]
[Fix]
[Test Case]
[Regression Potential]
[Original Report]
Today I tried to update my system with "aptitude update && aptitude dist-upgrade -y"
Every time, it gets stuck on
Preparing to replace language-pack-en-base 1:10.04+20100422 (using .../language-pack-en-base_1%3a10.04+20100714_all.deb) ...
Unpacking replacement language-pack-en-base ...
When I try to kill the task with "kill -9 9440", I still have no
success.
├─screen(22470)─┬─bash(22471)───aptitude(9407)─┬─dpkg(9440)
│ │ └─{aptitude}(9408)
│ └─bash(22500)───pstree(9460)
Only when I kill 9408 can I interrupt the command.
My dmesg is full of curious messages (see the attached file).
ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: linux-image-generic 2.6.32.24.25
Regression: Yes
Reproducible: Yes
ProcVersionSignature: Ubuntu 2.6.32-24.39-generic 2.6.32.15+drm33.5
Uname: Linux 2.6.32-24-generic x86_64
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access /dev/snd/: No such file or directory
AplayDevices: Error: [Errno 2] No such file or directory
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
Date: Thu Aug 26 20:54:17 2010
MachineType: MSI MS-7522
PciMultimedia:
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-24-generic root=UUID=a2eace54-399d-4efe-bbf2-c76c44d2b6ea ro iommu=soft vga=0x317 nomce quiet splash
ProcEnviron:
LANG=de_DE.UTF-8
SHELL=/bin/bash
SourcePackage: linux
dmi.bios.date: 01/07/2010
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: V8.8
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: MSI X58 Pro-E (MS-7522)
dmi.board.vendor: MSI
dmi.board.version: 3.0
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: MICRO-STAR INTERNATIONAL CO.,LTD
dmi.chassis.version: 3.0
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrV8.8:bd01/07/2010:svnMSI:pnMS-7522:pvr3.0:rvnMSI:rnMSIX58Pro-E(MS-7522):rvr3.0:cvnMICRO-STARINTERNATIONALCO.,LTD:ct3:cvr3.0:
dmi.product.name: MS-7522
dmi.product.version: 3.0
dmi.sys.vendor: MSI