[Bug 986654] Re: disk I/O race condition after update

A. Eibach 986654 at bugs.launchpad.net
Tue Nov 5 11:07:16 UTC 2013


Thanks a lot for your hard work in investigating this issue, much
appreciated!

I've been fighting with this IRQ problem for almost 2 years now. Same thing as you reported: those random "soft resetting link" messages and drive resets out of the blue.
Unfortunately, no one wants to tackle this issue, which I believe is also a bug deeply rooted in the 3.x kernels.

The only way out _for me_ was just playing dice with the drives: swapping what is on the external PCI IDE/SATA controller until there are no more errors. It's like human beings: some pairs just don't match. :)
It can, however, get quite time-consuming with lots of drives, and 4 hours of continuous "reswap-reboot" cycles are not rare at all. But once it works, it will keep working, so it pays off after all :)

BTW I'm sick and tired of hearing "your drive is faulty". NO. IT'S NOT! It will sometimes just work solely on the onboard SATA controller and not on _any_ SATA controller card plugged into the PCI port.
Besides, I am pretty sure that people even tossed out their innocent drives just because that kernel or udev bug (or feature?!) drove them insane.

P.S. bug 978384 either does not exist, was removed or you mistyped the
bug number. Gives me a 404 here...

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to udev in Ubuntu.
https://bugs.launchpad.net/bugs/986654

Title:
  disk I/O race condition after update

Status in “udev” package in Ubuntu:
  New

Bug description:
  About 8 times over this cycle, I have installed various versions of 12.04 server edition on a pathetic old test computer.
  In addition to the regular testing, the purpose is to verify minimum system specifications.
  The issue raised herein first appeared with the i386 server ISO of 2012.04.12.
  The issue remains with an up to date system as of 2012.04.21.
  The issue does NOT exist with a fresh install from the i386 server ISO of 2012.03.27, which was the most recent preceding ISO I had.

  The issue: Under very intensive disk I/O situations, the system can
  lock up. Eventually (I think after about 30 seconds, I am actually
  rarely standing beside the computer when this occurs) the system does
  realize it is frozen and manages to resume.  It appears as though the
  computer is waiting for some data from the disk, but the disk doesn't
  think it has anything to do, i.e. they are out of sync. The
  appropriate lines from kern.log will be attached.

  The issue does not appear to be with the kernel itself, because it can
  be created by starting from the fresh install from the 2012.03.27 ISO
  and doing "apt-get update" and "apt-get upgrade" but not "apt-get
  dist-upgrade". I do not know which package introduced the issue, which
  is why I have not been able to run "ubuntu-bug <packagename>" for this
  report. I did list them all before any updates and after, and will
  post both the difference file and my edited difference file, where I
  took my best guess at editing out the ones that I didn't think would
  contain the root cause.
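
  The before/after comparison of installed packages described above
  could be sketched roughly as follows. This is only an illustration,
  not the procedure actually used to produce the attached files: it
  assumes each list was saved as text with one "name version" pair per
  line (for example from "dpkg-query -W"), and the function names
  parse_package_list and changed_packages are made up for this example.

```python
def parse_package_list(text):
    """Parse lines of 'name version' into a dict; other lines are skipped."""
    packages = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            packages[fields[0]] = fields[1]
    return packages


def changed_packages(before_text, after_text):
    """Return {name: (old_version, new_version)} for packages that differ.

    A version of None means the package was absent from that list.
    """
    before = parse_package_list(before_text)
    after = parse_package_list(after_text)
    changes = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes
```

  Feeding it the two saved lists yields only the packages that were
  added, removed or changed by the upgrade, which is the candidate set
  to narrow down by hand.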

  Note also bug number 978384, which seems similar but not the same.
  Regardless, the test kernel page does have the version I would need to
  try.

  For testing for this issue I use "sudo update-apt-xapian-index
  --force", but I have seen the same issue a few times under other heavy
  disk usage conditions.

  This issue has been demonstrated with two older style ATA hard drives.
  Both drives have been health tested with disk test tools and the
  system booted from a freedos ISO.

  The entire sequence (fresh install from the 2012.03.27 ISO, test and
  show no issue, then upgrade, test and show the issue) has been
  repeated several times. This latest test included 8 runs of "sudo
  update-apt-xapian-index --force" without any problem on a fresh
  installation and 9 runs of "sudo update-apt-xapian-index --force"
  after only "apt-get update" and "apt-get upgrade" and rebooting, thus
  running the same kernel.
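
  A repeated-run test like the one above could be scripted along these
  lines. This is a hedged sketch, not the method actually used for
  these tests: the helper names run_repeatedly and slow_runs are
  invented here, and treating an unusually long wall-clock time as a
  sign of the stall is an assumption (kern.log remains the real
  evidence).

```python
import subprocess
import time


def run_repeatedly(command, runs):
    """Run command `runs` times; return a list of (exit_code, seconds)."""
    results = []
    for _ in range(runs):
        start = time.monotonic()
        completed = subprocess.run(command)
        results.append((completed.returncode, time.monotonic() - start))
    return results


def slow_runs(results, threshold_seconds):
    """Return the 1-based indices of runs that took longer than the threshold."""
    return [
        index
        for index, (_, seconds) in enumerate(results, start=1)
        if seconds > threshold_seconds
    ]
```

  For example, run_repeatedly(["sudo", "update-apt-xapian-index",
  "--force"], 8) would mirror the 8 runs on the fresh installation.
  Any threshold passed to slow_runs has to be chosen by comparing
  against the run times seen on the unaffected install.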

  It is possible that my CPU is the problem, being below the minimum
  server edition specifications (200 MHz, whereas minimum spec is 300
  MHz). However, the CPU is largely idle with these tests, as it mostly
  waits for disk I/O. (O.K., it also does have some pretty busy
  periods.)

  Attachments will be added over the next hour.

  doug@test-smy:~/source-temp$ uname -a
  Linux test-smy 3.2.0-23-generic-pae #36-Ubuntu SMP Tue Apr 10 22:19:09 UTC 2012 i686 i686 i386 GNU/Linux
  doug@test-smy:~/source-temp$ cat /proc/version
  Linux version 3.2.0-23-generic-pae (buildd@palmer) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu4) ) #36-Ubuntu SMP Tue Apr 10 22:19:09 UTC 2012
  doug@test-smy:~/source-temp$ lsb_release -rd
  Description:    Ubuntu 12.04 LTS
  Release:        12.04

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/udev/+bug/986654/+subscriptions


