[Lucid] SRU: Fix LVM snapshot regression

Fri Aug 6 14:04:02 UTC 2010

On 08/06/2010 03:36 AM, Stefan Bader wrote:
> SRU Justification:
>
> Impact: Changes to ext4 which are part of 2.6.32.17 and we took in advance to
> fix other issues are causing a lockup regression when working with lvm snap-
> shots.
>
> Fix: This patch, which is slowly making its way upstream, was verified to fix
> this problem and is also small and contained enough to be reasonably save.
> Under normal circumstances we would wait for this patch to appear upstream
> but as this is a more serious regression and we are planning to do a limited
> change upload to Lucid to address some priority problems I think we should
> consider this patch even if it does not make it upstream in time.
> For Maverick the same problem exists but it is probably enough time to wait
> and this is marked to be reviewed before beta as well.
> Should the patch go upstream before doing the Lucid upload, references will
> be updated.
>
> Testcase:
> Use lvm to create a LV, mount it and have it actively doing IO, then try
> to create a snapshot will hang without further IO being possible.
>
> -Stefan
>
>  From 437f88cc031ffe7f37f3e705367f4fe1f4be8b0f Mon Sep 17 00:00:00 2001
> From: Eric Sandeen<sandeen at sandeen.net>
> Date: Sun, 1 Aug 2010 17:33:29 -0400
> Subject: [PATCH] (pre-stable) ext4: fix freeze deadlock under IO
>
> BugLink: http://bugs.launchpad.net/bugs/595489
>
> Commit 6b0310fbf087ad6 caused a regression resulting in deadlocks
> when freezing a filesystem which had active IO; the vfs_check_frozen
> level (SB_FREEZE_WRITE) did not let the freeze-related IO syncing
> through.  Duh.
>
> Changing the test to FREEZE_TRANS should let the normal freeze
> syncing get through the fs, but still block any transactions from
> starting once the fs is completely frozen.
>
> I tested this by running fsstress in the background while periodically
> snapshotting the fs and running fsck on the result.  I ran into
> occasional deadlocks, but different ones.  I think this is a
> fine fix for the problem at hand, and the other deadlocky things
> will need more investigation.
>
> Reported-by: Phillip Susi<psusi at cfl.rr.com>
> Signed-off-by: Eric Sandeen<sandeen at redhat.com>
> Signed-off-by: "Theodore Ts'o"<tytso at mit.edu>
> (cherry-picked from commit 437f88cc031ffe7f37f3e705367f4fe1f4be8b0f linux-next)
> Signed-off-by: Stefan Bader<stefan.bader at canonical.com>
> ---
>   fs/ext4/super.c |    4 ++--
>   1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index e046eba..282a270 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -241,7 +241,7 @@ handle_t *ext4_journal_start_sb(struct super_block *sb, int nblocks)
>   	if (sb->s_flags&  MS_RDONLY)
>   		return ERR_PTR(-EROFS);
>
> -	vfs_check_frozen(sb, SB_FREEZE_WRITE);
> +	vfs_check_frozen(sb, SB_FREEZE_TRANS);
>   	/* Special case here: if the journal has aborted behind our
>   	 * backs (eg. EIO in the commit thread), then we still need to
>   	 * take the FS itself readonly cleanly. */
> @@ -3608,7 +3608,7 @@ int ext4_force_commit(struct super_block *sb)
>
>   	journal = EXT4_SB(sb)->s_journal;
>   	if (journal) {
> -		vfs_check_frozen(sb, SB_FREEZE_WRITE);
> +		vfs_check_frozen(sb, SB_FREEZE_TRANS);
>   		ret = ext4_journal_force_commit(journal);
>   	}
>

Acked-by: Tim Gardner <tim.gardner at canonical;.com>

-- 
Tim Gardner tim.gardner at canonical.com