SRU request for LP#208551

Colin Ian King colin.king at canonical.com
Wed Sep 10 09:41:33 UTC 2008


https://bugs.launchpad.net/ubuntu/hardy/+source/linux/+bug/208551

SRU justification:

Impact: mdadm RAID 5 arrays get stuck in uninterruptible sleep under
heavy I/O load. Copying data to a RAID 5 XFS partition permanently locks
up several related processes, which get stuck in the D(+) state. This
occurs when large quantities of data (10-40 GB) are copied. The affected
processes cannot be killed and the system cannot reboot, so the server
has to be power cycled.

Fix: The patch from commit 6ed3003c19a96fe18edf8179c4be6fe14abbebbc. The
fix is to not make any generic_make_request() calls in raid5
make_request until all waiting has been done.  We do this by simply
setting STRIPE_HANDLE instead of calling handle_stripe(). This causes a
performance hit, so this patch also only calls raid5_activate_delayed()
at unplug time, never in raid5.  This seems to bring back the
performance numbers. [quoting the commit message]
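
To illustrate the idea behind the fix, here is a toy model in Python. It
is not kernel code and not the patch itself. It assumes (my reading of
the upstream commit message) that I/O submitted from inside make_request
is queued and cannot complete until make_request returns, and that a
stripe head stays busy until its I/O completes. The class ToyArray, its
fields and the "defer" flag are hypothetical names made up for this
sketch.

```python
from collections import deque


class ToyArray:
    """Toy model only: a fixed pool of stripe heads plus two queues."""

    def __init__(self, stripe_heads):
        self.free = stripe_heads
        # I/O submitted from inside make_request; in this model it cannot
        # complete (and free its stripe head) until make_request returns.
        self.queued_io = deque()
        # Stripes merely flagged for later handling (stand-in for setting
        # STRIPE_HANDLE instead of calling handle_stripe()).
        self.flagged = deque()

    def run_worker(self):
        # Stand-in for the raid5d thread: it handles flagged stripes from
        # a different context, so their I/O completes and frees the heads.
        while self.flagged:
            self.flagged.popleft()
            self.free += 1

    def make_request(self, n_stripes, defer):
        for stripe in range(n_stripes):
            if self.free == 0:
                if defer:
                    self.run_worker()
                if self.free == 0:
                    # Waiting for a head that only our own queued I/O
                    # could release: nothing can make progress.
                    return "deadlock"
            self.free -= 1
            if defer:
                self.flagged.append(stripe)
            else:
                self.queued_io.append(stripe)
        # make_request has returned: queued I/O may now run and complete.
        self.free += len(self.queued_io)
        self.queued_io.clear()
        self.run_worker()
        return "ok"


if __name__ == "__main__":
    print("inline handling:  ", ToyArray(2).make_request(5, defer=False))
    print("deferred handling:", ToyArray(2).make_request(5, defer=True))
```

With two stripe heads and five stripes, the inline variant runs out of
heads while all of its I/O is still queued behind it, whereas the
deferred variant lets the worker free heads as needed. The performance
side of the patch (raid5_activate_delayed() at unplug time) is not
modelled here.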

Testing: Without the patch, RAID 5 using md on an XFS filesystem
reproducibly locks up under heavy data copying. With the patch, the
lockup does not occur.
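
When repeating the test, one way to check for the lockup is to look for
processes in the D state. The following is a minimal sketch, assuming a
Linux /proc filesystem; the function names are my own, and it only
reports the current state of each process, so a process that is briefly
in D during normal I/O will also show up. Run it a few times and look
for entries that never go away.

```python
import os


def parse_stat_state(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the last ')' instead of splitting on whitespace.
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    pid = int(stat_line[:open_idx])
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    if not os.path.isdir(proc_root):
        return found
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_state(f.read())
        except (OSError, ValueError, IndexError):
            # The process exited while we were scanning, or its stat
            # file could not be read or parsed; skip it.
            continue
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```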

Patch tested from my PPA by Andrew Cholakian on two 64-bit servers:
https://bugs.launchpad.net/ubuntu/hardy/+source/linux/+bug/208551/comments/16

Patch attached.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-UBUNTU-md-fix-an-occasional-deadlock-in-raid5.patch
Type: text/x-vhdl
Size: 1710 bytes
Desc: not available
URL: <https://lists.ubuntu.com/archives/kernel-team/attachments/20080910/d26a6c00/attachment.bin>

