[Bug 624877] Re: INFO: task dpkg:23317 blocked for more than 120 seconds.

Theodore Ts'o tytso at mit.edu
Thu Sep 13 21:33:47 UTC 2012


There are two issues here that interact, which is why they are confusing
people.  The first is that the kernel has a potential livelock problem
in the writeback code, such that if new pages that require writeback are
constantly being dirtied, the sync(2) system call will never return (at
least not until all of the pages are clean, and on a busy system with
lots of processes writing to the disk that might never happen).
It doesn't happen every time sync(2) is called, but since dpkg was
calling sync(2) all the time, it tended to happen there.  Still, this
problem can happen without dpkg being involved at all, and on many
different file systems, since it's a problem with the generic writeback
code.   Trying to backport this fix to the ancient kernel which is in
10.04 is going to be _hard_.   There are people at Red Hat who are paid
the big bucks to do this kind of painful backporting (which in this case
is multiple patches spread across multiple kernel releases before it was
finally fixed, and with all sorts of dependencies).   Good luck finding
a volunteer willing to figure this out.   I wouldn't --- I would much
rather run a 3.x kernel.   And if I had a business that needed to use a
stable enterprise kernel, I'd pay the darned Red Hat or SLES support
fees, and get a professionally managed enterprise kernel.
Unfortunately, in my experience Canonical doesn't have paid kernel
engineers who have either the skill or the bandwidth (not sure which) to
do this kind of very tricky backporting to ancient LTS kernels, as
compared to what Red Hat has done.  I've seen this with ext4 bug fixes
which don't get made to 10.04, but which Red Hat has been willing to do
for their RHEL6 kernel.
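
As a rough illustration of the livelock described at the start of this
message, here is a toy model in Python.  It is not kernel code, and it
is not a description of the actual upstream patches; the function names
(naive_sync, snapshot_sync) and the one-page-per-step writer are
assumptions made up for the sketch.  It only shows why "loop until
nothing is dirty" may never terminate under constant write load, while
"write back only what was dirty when sync started" does terminate.

```python
from collections import deque


def naive_sync(dirty, writer, max_steps=1000):
    """Write back until no dirty pages remain.

    Returns the number of steps taken, or None if max_steps was reached
    (which stands in for "sync never returns").
    """
    for step in range(max_steps):
        if not dirty:
            return step
        dirty.popleft()          # write back the oldest dirty page
        dirty.extend(writer())   # meanwhile, other processes dirty more
    return None


def snapshot_sync(dirty, writer, max_steps=1000):
    """Write back only the pages that were dirty when sync started."""
    to_write = len(dirty)        # snapshot taken on entry
    for step in range(max_steps):
        if to_write == 0:
            return step
        dirty.popleft()          # oldest pages are the snapshotted ones
        to_write -= 1
        dirty.extend(writer())   # new pages are left for a later sync
    return None


def busy_writer():
    """Hypothetical load: one new dirty page per writeback step."""
    return ["new page"]
```

With busy_writer as the load, naive_sync gives up after max_steps, while
snapshot_sync finishes after as many steps as there were dirty pages at
the start.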

Note that this problem is much less likely to hit on desktop/laptop
systems where there generally aren't servers continuously writing to the
file system.   So for most Ubuntu systems that tend not to be production
servers running with highly stressful workloads, this won't be an issue.
The people who are complaining on this Launchpad bug are probably
outliers, which probably explains the low priority that paid Canonical
engineers have given to doing this kind of backporting.

The second problem/bugfix is the fix to dpkg, which significantly
improves its performance and reduces its impact on the system as a
whole, by using sync_file_range() instead of sync().    Fixing this also
tends to remove one of the more common ways of tickling the bug above,
but that's not the only reason why backporting this dpkg fix would be a
good idea: it also speeds up package installs and decreases their
overall system impact.
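
To make the difference concrete, here is a minimal Python sketch of
asking the kernel to start writeback for a single file instead of
calling sync() for the whole system.  It is not dpkg's actual code.  The
helper name start_writeback is made up for the example, and the sketch
assumes Linux with a libc that exports sync_file_range(); where the call
is missing or unsupported it falls back to os.fsync().

```python
import ctypes
import errno
import os

SYNC_FILE_RANGE_WRITE = 2  # flag value from <linux/fs.h>


def start_writeback(fd):
    """Ask the kernel to start writing back one file's dirty pages.

    Returns "sync_file_range" or "fsync", depending on which call was
    actually used.
    """
    try:
        fn = ctypes.CDLL(None, use_errno=True).sync_file_range
    except (OSError, AttributeError, TypeError):
        os.fsync(fd)  # no sync_file_range here; flush just this file
        return "fsync"
    # int sync_file_range(int fd, off64_t offset, off64_t nbytes,
    #                     unsigned int flags);
    fn.argtypes = [ctypes.c_int, ctypes.c_int64, ctypes.c_int64,
                   ctypes.c_uint]
    fn.restype = ctypes.c_int
    # offset=0, nbytes=0 covers the file from its start to its end.
    if fn(fd, 0, 0, SYNC_FILE_RANGE_WRITE) == 0:
        return "sync_file_range"
    err = ctypes.get_errno()
    if err in (errno.ENOSYS, errno.EINVAL):
        os.fsync(fd)  # call not supported for this kernel or file
        return "fsync"
    raise OSError(err, os.strerror(err))
```

Either way, only the one file is touched, so the call does not have to
wait for every other process's dirty pages.  Note that sync_file_range()
does not write out file metadata, so a caller that needs durability
still has to fsync() the file at some point.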

Or, people could just upgrade their system to Ubuntu LTS 12.04.....

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to dpkg in Ubuntu.
https://bugs.launchpad.net/bugs/624877

Title:
  INFO: task dpkg:23317 blocked for more than 120 seconds.

Status in The Linux Kernel:
  Expired
Status in “dpkg” package in Ubuntu:
  Fix Released
Status in “linux” package in Ubuntu:
  Confirmed
Status in “dpkg” source package in Lucid:
  Triaged
Status in “dpkg” package in Debian:
  Fix Released

Bug description:
  [Impact]

  [Fix]

  [Test Case]

  [Regression Potential]

  [Original Report]
  Today I tried to update my system with "aptitude update && aptitude dist-upgrade -y"

  Every time, it gets stuck on

  Preparing to replace language-pack-en-base 1:10.04+20100422 (using .../language-pack-en-base_1%3a10.04+20100714_all.deb) ...
  Unpacking replacement language-pack-en-base ...

  When I try to kill the task with "kill -9 9440" I still have no
  success.

          ├─screen(22470)─┬─bash(22471)───aptitude(9407)─┬─dpkg(9440)
          │               │                              └─{aptitude}(9408)
          │               └─bash(22500)───pstree(9460)

  Only when I kill 9408 can I interrupt the command.

  My dmesg is full of curious messages (see the file I attach)

  ProblemType: Bug
  DistroRelease: Ubuntu 10.04
  Package: linux-image-generic 2.6.32.24.25
  Regression: Yes
  Reproducible: Yes
  ProcVersionSignature: Ubuntu 2.6.32-24.39-generic 2.6.32.15+drm33.5
  Uname: Linux 2.6.32-24-generic x86_64
  AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access /dev/snd/: No such file or directory
  AplayDevices: Error: [Errno 2] No such file or directory
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory
  Date: Thu Aug 26 20:54:17 2010
  MachineType: MSI MS-7522
  PciMultimedia:

  ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-24-generic root=UUID=a2eace54-399d-4efe-bbf2-c76c44d2b6ea ro iommu=soft vga=0x317 nomce quiet splash
  ProcEnviron:
   LANG=de_DE.UTF-8
   SHELL=/bin/bash
  SourcePackage: linux
  dmi.bios.date: 01/07/2010
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: V8.8
  dmi.board.asset.tag: To Be Filled By O.E.M.
  dmi.board.name: MSI X58 Pro-E (MS-7522)
  dmi.board.vendor: MSI
  dmi.board.version: 3.0
  dmi.chassis.asset.tag: To Be Filled By O.E.M.
  dmi.chassis.type: 3
  dmi.chassis.vendor: MICRO-STAR INTERNATIONAL CO.,LTD
  dmi.chassis.version: 3.0
  dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrV8.8:bd01/07/2010:svnMSI:pnMS-7522:pvr3.0:rvnMSI:rnMSIX58Pro-E(MS-7522):rvr3.0:cvnMICRO-STARINTERNATIONALCO.,LTD:ct3:cvr3.0:
  dmi.product.name: MS-7522
  dmi.product.version: 3.0
  dmi.sys.vendor: MSI

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/624877/+subscriptions



