Jaunty SRU #348836 - ext4: fix bb_prealloc_list corruption due to wrong group locking

Tim Gardner timg at tpi.com
Tue Apr 14 16:43:41 UTC 2009


The following changes since commit 1901c1b176e0d8c62aeb1fd9512c45bc6010b180:
  Eric Sandeen (1):
        ext4: fix bb_prealloc_list corruption due to wrong group locking

are available in the git repository at:

  git://kernel.ubuntu.com/rtg/ubuntu-jaunty lp348836

>From 1901c1b176e0d8c62aeb1fd9512c45bc6010b180 Mon Sep 17 00:00:00 2001
From: Eric Sandeen <sandeen at redhat.com>
Date: Mon, 16 Mar 2009 23:25:40 -0400
Subject: [PATCH] ext4: fix bb_prealloc_list corruption due to wrong group locking

Bug: #348836

This is for Red Hat bug 490026: EXT4 panic, list corruption in
ext4_mb_new_inode_pa

ext4_lock_group(sb, group) is supposed to protect this list for
each group, and a common code flow to remove an album is like
this:

    ext4_get_group_no_and_offset(sb, pa->pa_pstart, &grp, NULL);
    ext4_lock_group(sb, grp);
    list_del(&pa->pa_group_list);
    ext4_unlock_group(sb, grp);

so it's critical that we get the right group number back for
this prealloc context, to lock the right group (the one
associated with this pa) and prevent concurrent list manipulation.

however, ext4_mb_put_pa() passes in (pa->pa_pstart - 1) with a
comment, "-1 is to protect from crossing allocation group".

This makes sense for the group_pa, where pa_pstart is advanced
by the length which has been used (in ext4_mb_release_context()),
and when the entire length has been used, pa_pstart has been
advanced to the first block of the next group.

However, for inode_pa, pa_pstart is never advanced; it's just
set once to the first block in the group and not moved after
that.  So in this case, if we subtract one in ext4_mb_put_pa(),
we are actually locking the *previous* group, and opening the
race with the other threads which do not subtract off the extra
block.

Signed-off-by: Eric Sandeen <sandeen at redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso at mit.edu>
Signed-off-by: Tim Gardner <tim.gardner at canonical.com>
---
 fs/ext4/mballoc.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index f8e923f..add854a 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3586,6 +3586,7 @@ static void ext4_mb_put_pa(struct ext4_allocation_context *ac,
 			struct super_block *sb, struct ext4_prealloc_space *pa)
 {
 	unsigned long grp;
+	ext4_fsblk_t grp_blk;
 
 	if (!atomic_dec_and_test(&pa->pa_count) || pa->pa_free != 0)
 		return;
@@ -3600,8 +3601,12 @@ static void ext4_mb_put_pa(struct ext4_allocation_context *ac,
 	pa->pa_deleted = 1;
 	spin_unlock(&pa->pa_lock);
 
-	/* -1 is to protect from crossing allocation group */
-	ext4_get_group_no_and_offset(sb, pa->pa_pstart - 1, &grp, NULL);
+	grp_blk = pa->pa_pstart;
+	/* If linear, pa_pstart may be in the next group when pa is used up */
+	if (pa->pa_linear)
+		grp_blk--;
+
+	ext4_get_group_no_and_offset(sb, grp_blk, &grp, NULL);
 
 	/*
 	 * possible race:
-- 
1.5.6.3





More information about the kernel-team mailing list