[Bug 627380] [NEW] cfq triggers smbd timeouts

Joshua Coombs josh.coombs at gmail.com
Tue Aug 31 12:28:19 UTC 2010


Public bug reported:

I have Samba installed on an Ubuntu 10.04 amd64 system with a single
share on an ext4 volume.  Multiple Windows systems write ntbackup dumps
to this share nightly.  I've been chasing random backup job failures on
this machine for a while now, suspecting a bug in Samba, but I never
found an error in the logs or saw a crash dump.  If I run the dumps
manually during the day, there is no problem.  The failures occur only
at night, at random times.  When several jobs fail on the same night,
they all fail at the same moment.  Windows reports only a write error.

Digging around, I found discussions online noting that cfq can cause
I/O starvation in some workloads, making processes appear to hang for
two minutes or more.  My Samba logs show pauses in activity of over a
minute right before Windows logs a failure.  Changing the I/O scheduler
to noop has so far cleared up the failures.
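
For anyone trying to reproduce this, here is a minimal Python sketch of
how the scheduler switch can be checked and applied through sysfs.  It
assumes the usual queue/scheduler file, where the active scheduler is
shown in square brackets (for example "noop deadline [cfq]").  The
device name cciss/c0d0 is a hypothetical example, not taken from the
report, and the helper names are illustrative only.

```python
from pathlib import Path


def parse_scheduler(text):
    """Split the contents of a queue/scheduler file into (active, available).

    The kernel marks the active scheduler with square brackets,
    e.g. "noop deadline [cfq]".
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def scheduler_path(device):
    """Build the sysfs path for a block device's scheduler file.

    sysfs replaces "/" in device names with "!", so a cciss device
    such as cciss/c0d0 (hypothetical here) appears as cciss!c0d0.
    """
    return Path("/sys/block") / device.replace("/", "!") / "queue" / "scheduler"


def set_scheduler(device, name):
    """Switch the device to the named scheduler (needs root)."""
    path = scheduler_path(device)
    active, available = parse_scheduler(path.read_text())
    if name not in available:
        raise ValueError(f"{name!r} not offered by kernel: {available}")
    if name != active:
        path.write_text(name)


if __name__ == "__main__":
    # Parsing only; no system files are touched here.
    print(parse_scheduler("noop deadline [cfq]\n"))
```

A change made this way does not survive a reboot, so it suits one-night
tests where the scheduler needs to be switched back afterwards.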

I'm currently running 2.6.35-999-generic #201008021608 (mainline kernel)
due to bug 474089, and have upgraded to the Maverick samba packages and
related libs while trying to track this issue down.  I'm writing to a
2TB SATA drive behind a cciss controller, no RAID.  The problem is
definitely load related, so I can only get one viable test in per night,
and for obvious reasons I can't stick with a broken config for too many
nights in a row.  I'm more than willing to gather whatever test data is
needed to improve things.
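
One piece of test data that can be collected after each night is a list
of the pauses in the smbd log.  The sketch below is only an
illustration: it assumes log header lines that begin with a timestamp
of the form "[2010/08/31 02:00:10," and treats any gap of 60 seconds or
more between consecutive headers as a pause.  The threshold and the
function name are assumptions, not part of Samba.

```python
import re
from datetime import datetime, timedelta

# Assumed header format: "[YYYY/MM/DD HH:MM:SS, <level>] source..."
STAMP = re.compile(r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})")


def find_pauses(lines, threshold=timedelta(seconds=60)):
    """Return (before, after) timestamp pairs separated by >= threshold."""
    pauses = []
    previous = None
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue  # message body lines carry no timestamp
        current = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
        if previous is not None and current - previous >= threshold:
            pauses.append((previous, current))
        previous = current
    return pauses


if __name__ == "__main__":
    # Made-up sample lines for illustration, not real log output.
    sample = [
        "[2010/08/31 02:00:00,  2] example header",
        "  example message body",
        "[2010/08/31 02:00:10,  2] example header",
        "[2010/08/31 02:01:45,  2] example header",
    ]
    for before, after in find_pauses(sample):
        print(before, "->", after, (after - before).total_seconds(), "s")
```

Comparing the reported gaps against the times Windows logs a write
error would show whether each failure lines up with a stall.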

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

-- 
cfq triggers smbd timeouts
https://bugs.launchpad.net/bugs/627380