[Jaunty] Proposing some ext4 patches

Theodore Tso tytso at mit.edu
Mon Jun 22 23:54:49 UTC 2009


On Mon, Jun 22, 2009 at 07:39:12PM +0200, Stefan Bader wrote:
> After we had https://bugs.launchpad.net/bugs/389555 I had a look at Ted's 
> 2.6.28-stable repo to figure out, whether there might be some more 
> dangers lurking. Since his 2.6.28.10 tag there are 21 patches difference 
> between our and his ext4 tree. Quite a few of them are in a gray zone of 
> being of little risk but do not seem to be critical enough to qualify for 
> SRU. This would leave 7 (or 8) which we probably should consider...
> And (question to the SRU team) if we do, could we use one tracking report?
>
> Stefan
>
> 05 (suggest skip)
> http://kernel.ubuntu.com/git?p=smb/ubuntu-jaunty.git;a=commitdiff;h=774c43079ddc04a92030dd31109421518e1fcf14
> Modifications to avoid lock contention.

Sorry, the patch commit message is a bit misleading.  This fixes a
lock ordering problem (detected by lockdep) that could potentially
lead to a system lockup.  The patch was originally designed to fix a
performance problem, but then we discovered it also fixed a lockdep
warning, and we dropped a reference to a kernel bugzilla entry w/o
updating the commit description:

http://bugzilla.kernel.org/show_bug.cgi?id=12787

> 11 (?)
> http://kernel.ubuntu.com/git?p=smb/ubuntu-jaunty.git;a=commitdiff;h=d9ec01eafda7ec7b5fd63b623c86bd95dbd8349a
> Fix to not discard preallocations on close. Not sure of the impact here.

Previously, we could not discard preallocations on close if there were
any delayed allocation blocks; this led to preallocations not getting
discarded until much, much later, which could prevent the block
allocator from allocating blocks efficiently.

The patch makes sure that once an inode has all of its delayed
allocation blocks allocated, and there are no open r/w file
descriptors, we discard the preallocated blocks so they can be
used by another file.

This helps to promote a better (less fragmented) layout of block
allocations on disk.  It doesn't fix a critical bug, so this patch is
one that you can decide to skip.

> 20 (maybe skip)
> http://kernel.ubuntu.com/git?p=smb/ubuntu-jaunty.git;a=commitdiff;h=2742d4833ba07f06a18ad2df750b9f3a712864a4
> Use a large (non-zero) block number for delayed allocation buffers. Those 
> should never be written but if this is tried it is more obvious where 
> this comes from.

.... instead of blowing away the boot block / partition table, which
could lead to all sorts of user complaints.  :-)

At the time we weren't completely convinced that the code was
bug-free, but we haven't had any reports of people triggering an
attempted write with the very large non-zero block number, so it's
probably safe to skip this.


I suspect that the hard-to-debug hang described in Launchpad #330824
(Soft lockups when deleting files from ext4 partitions) may very well
be caused by either (a) a failed backport, or (b) a subtle patch
dependency that triggered a bug due to a skipped patch.  I would
therefore encourage you to use the xfstests suite to test the
resulting ext4 filesystem.  The xfstests can be found here:

	url = git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git

I believe you will need the following packages for it to build and
run: xfsprogs, xfslibs-dev, libaio1, libaio-dev.  I might be missing
one or two, but those seem to be the critical ones.  Hopefully any
others will be obvious.  Cd to the top-level of the xfstests
directory, and run ./configure; make.
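
The build steps above can be sketched as a small shell function.  The
package names and the repository URL are the ones given in this mail;
the apt-get invocation and the function name build_xfstests are
assumptions for illustration, and the package list may be incomplete
as noted above.

```shell
# Sketch only: install the packages listed above, fetch xfstests, build it.
# Run as root; build_xfstests is a hypothetical helper name.
build_xfstests() {
    # Build and runtime dependencies named in the mail (may not be exhaustive).
    apt-get install xfsprogs xfslibs-dev libaio1 libaio-dev || return 1

    # Fetch the test suite from the URL given above.
    git clone git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git || return 1

    # Build from the top level of the checkout; the subshell keeps the
    # caller's working directory unchanged.
    ( cd xfstests-dev && ./configure && make )
}
```

Nothing runs until the function is called, e.g. "build_xfstests" from
the directory where the checkout should live.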

To run the XFS tests, you will need to have one (and preferably two)
partitions, called TEST and SCRATCH.  TEST should be a mounted ext4
partition that can be mounted and unmounted.  SCRATCH should be an
empty device that you don't mind getting reformatted (many of the
tests will reformat the SCRATCH partition).  SCRATCH is optional; so
if you don't have a 2nd partition, you can simply omit setting the
SCRATCH_DEV and SCRATCH_MNT environment variables.

Set the environment variables:

TEST_DEV       device file (e.g., /dev/sda1) containing the TEST partition
TEST_DIR       mount point of the TEST partition
SCRATCH_DEV    device file (e.g., /dev/sda2) containing the SCRATCH partition
SCRATCH_MNT    mount point of the SCRATCH partition

(note TEST_DIR vs. SCRATCH_MNT; don't blame me, blame the SGI
engineers.  :-)
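
Since a wrong SCRATCH_DEV gets reformatted, it may be worth checking
the variables before running anything.  A minimal sketch follows; the
function name check_xfstests_env and the specific checks are
assumptions for illustration, not part of xfstests.

```shell
# Sketch only: sanity-check the four variables described above.
# check_xfstests_env is a hypothetical helper, not an xfstests command.
check_xfstests_env() {
    # TEST_DEV and TEST_DIR are mandatory.
    if [ -z "${TEST_DEV:-}" ] || [ -z "${TEST_DIR:-}" ]; then
        echo "TEST_DEV and TEST_DIR must both be set" >&2
        return 1
    fi
    # SCRATCH is optional, but the two variables only make sense as a pair.
    if [ -n "${SCRATCH_DEV:-}" ] && [ -z "${SCRATCH_MNT:-}" ]; then
        echo "SCRATCH_DEV is set but SCRATCH_MNT is not" >&2
        return 1
    fi
    if [ -z "${SCRATCH_DEV:-}" ] && [ -n "${SCRATCH_MNT:-}" ]; then
        echo "SCRATCH_MNT is set but SCRATCH_DEV is not" >&2
        return 1
    fi
    # SCRATCH gets reformatted, so it must not be the TEST device.
    if [ -n "${SCRATCH_DEV:-}" ] && [ "$SCRATCH_DEV" = "$TEST_DEV" ]; then
        echo "SCRATCH_DEV must differ from TEST_DEV" >&2
        return 1
    fi
    return 0
}
```

For example, after "export TEST_DEV=/dev/sda1 TEST_DIR=/mnt/test"
(placeholder values), "check_xfstests_env && echo ok" should print ok.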

Then run "./check -ext4 -g auto" as root in the top-level xfstests
directory.  Everything should pass; if not, then there's probably
something wrong with the Ubuntu backports.  Bug reports with mainline
kernels should be sent to linux-ext4 at vger.kernel.org.  If the mainline
kernel passes, and the Ubuntu backports don't, the sooner it is
detected, the easier it will be to try to find the problem with
bisection searches.
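
To compare a mainline kernel against a backport kernel, or to record
good/bad results while bisecting, it helps to keep one log per kernel.
A minimal sketch follows; the function name run_ext4_auto and the log
file name are assumptions, and it assumes ./check exits non-zero when
a test fails, which is worth confirming for the xfstests version in
use.

```shell
# Sketch only: run the ext4 "auto" group and keep a per-kernel log.
# run_ext4_auto and the default log name are hypothetical.
run_ext4_auto() {
    xfstests_dir=$1                        # top-level xfstests directory
    log=${2:-xfstests-$(uname -r).log}     # one log per running kernel

    # Run the command from the mail; capture stdout and stderr in the log.
    ( cd "$xfstests_dir" && ./check -ext4 -g auto ) >"$log" 2>&1
    status=$?

    if [ "$status" -eq 0 ]; then
        echo "PASS $(uname -r): log in $log"
    else
        echo "FAIL $(uname -r): log in $log"
    fi
    return "$status"
}
```

Run it as root after booting the kernel under test, e.g.
"run_ext4_auto /path/to/xfstests", where the path is a placeholder.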

						- Ted





