[3.13.y.z extended stable] Patch "xfs: block allocation work needs to be kswapd aware" has been added to staging queue

Tetsuo Handa penguin-kernel at I-love.SAKURA.ne.jp
Tue Jul 15 21:44:39 UTC 2014


Kamal Mostafa wrote:
> This is a note to let you know that I have just added a patch titled
> 
>     xfs: block allocation work needs to be kswapd aware
> 
> to the linux-3.13.y-queue branch of the 3.13.y.z extended stable tree 
> which can be found at:
> 
>  http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=shortlog;h=refs/heads/linux-3.13.y-queue
> 
> This patch is scheduled to be released in version 3.13.11.5.
> 
> If you, or anyone else, feels it should not be added to this tree, please 
> reply to this email.
> 

This patch should not be added to this tree. Dave Chinner has already asked
for it to be dropped from the 3.14-stable tree, as forwarded below.

------- Forwarded Message
From: Dave Chinner <david at fromorbit.com>
To: gregkh at linuxfoundation.org
Cc: dchinner at redhat.com, hch at lst.de, penguin-kernel at I-love.SAKURA.ne.jp, stable at vger.kernel.org, stable-commits at vger.kernel.org
Subject: Re: Patch "xfs: block allocation work needs to be kswapd aware" has been added to the 3.14-stable tree
Date: Thu, 3 Jul 2014 11:52:21 +1000

On Wed, Jul 02, 2014 at 04:54:41PM -0700, gregkh at linuxfoundation.org wrote:
> 
> This is a note to let you know that I've just added the patch titled
> 
>     xfs: block allocation work needs to be kswapd aware
> 
> to the 3.14-stable tree which can be found at:
>     http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

Please drop it - this patch is due to be reverted because of
severe performance regressions in low memory situations.

Cheers,

Dave.
-- 
Dave Chinner
david at fromorbit.com


------- End of Forwarded Message

> For more information about the 3.13.y.z tree, see
> https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable
> 
> Thanks.
> -Kamal
> 
> ------
> 
> From 3b5d56a67fde780a205831a4e1438b8208d670c3 Mon Sep 17 00:00:00 2001
> From: Dave Chinner <dchinner at redhat.com>
> Date: Fri, 6 Jun 2014 15:59:59 +1000
> Subject: xfs: block allocation work needs to be kswapd aware
> 
> commit 1f6d64829db78a7e1d63e15c9f48f0a5d2b5a679 upstream.
> 
> Upon memory pressure, kswapd calls xfs_vm_writepage() from
> shrink_page_list(). This can result in delayed allocation occurring
> and that gets deferred to the allocation workqueue.
> 
> The allocation then runs outside kswapd context, which means if it
> needs memory (and it does, to demand page metadata from disk) it can
> block in shrink_inactive_list() waiting for IO congestion. These
> blocking waits are normally avoided in kswapd context, so under
> memory pressure writeback from kswapd can be arbitrarily delayed by
> memory reclaim.
> 
> To avoid this, pass the kswapd context to the allocation being done
> by the workqueue, so that memory reclaim understands correctly that
> the work is being done for kswapd and therefore it is not blocked
> and does not delay memory reclaim.
> 
> To avoid issues with int->char conversion of flag fields (as noticed
> in v1 of this patch) convert the flag fields in the struct
> xfs_bmalloca to bool types. pahole indicates these variables are
> still single byte variables, so no extra space is consumed by this
> change.
> 
> Reported-by: Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp>
> Signed-off-by: Dave Chinner <dchinner at redhat.com>
> Reviewed-by: Christoph Hellwig <hch at lst.de>
> Signed-off-by: Dave Chinner <david at fromorbit.com>
> Signed-off-by: Kamal Mostafa <kamal at canonical.com>
> ---
>  fs/xfs/xfs_bmap_util.c | 16 +++++++++++++---
>  fs/xfs/xfs_bmap_util.h | 13 +++++++------
>  2 files changed, 20 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index 82e0dab..035d06a 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -258,14 +258,23 @@ xfs_bmapi_allocate_worker(
>  	struct xfs_bmalloca	*args = container_of(work,
>  						struct xfs_bmalloca, work);
>  	unsigned long		pflags;
> +	unsigned long		new_pflags = PF_FSTRANS;
> 
> -	/* we are in a transaction context here */
> -	current_set_flags_nested(&pflags, PF_FSTRANS);
> +	/*
> +	 * we are in a transaction context here, but may also be doing work
> +	 * in kswapd context, and hence we may need to inherit that state
> +	 * temporarily to ensure that we don't block waiting for memory reclaim
> +	 * in any way.
> +	 */
> +	if (args->kswapd)
> +		new_pflags |= PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD;
> +
> +	current_set_flags_nested(&pflags, new_pflags);
> 
>  	args->result = __xfs_bmapi_allocate(args);
>  	complete(args->done);
> 
> -	current_restore_flags_nested(&pflags, PF_FSTRANS);
> +	current_restore_flags_nested(&pflags, new_pflags);
>  }
> 
>  /*
> @@ -284,6 +293,7 @@ xfs_bmapi_allocate(
> 
> 
>  	args->done = &done;
> +	args->kswapd = current_is_kswapd();
>  	INIT_WORK_ONSTACK(&args->work, xfs_bmapi_allocate_worker);
>  	queue_work(xfs_alloc_wq, &args->work);
>  	wait_for_completion(&done);
> diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
> index 900747b..f33711d 100644
> --- a/fs/xfs/xfs_bmap_util.h
> +++ b/fs/xfs/xfs_bmap_util.h
> @@ -50,12 +50,13 @@ struct xfs_bmalloca {
>  	xfs_extlen_t		total;	/* total blocks needed for xaction */
>  	xfs_extlen_t		minlen;	/* minimum allocation size (blocks) */
>  	xfs_extlen_t		minleft; /* amount must be left after alloc */
> -	char			eof;	/* set if allocating past last extent */
> -	char			wasdel;	/* replacing a delayed allocation */
> -	char			userdata;/* set if is user data */
> -	char			aeof;	/* allocated space at eof */
> -	char			conv;	/* overwriting unwritten extents */
> -	char			stack_switch;
> +	bool			eof;	/* set if allocating past last extent */
> +	bool			wasdel;	/* replacing a delayed allocation */
> +	bool			userdata;/* set if is user data */
> +	bool			aeof;	/* allocated space at eof */
> +	bool			conv;	/* overwriting unwritten extents */
> +	bool			stack_switch;
> +	bool			kswapd;	/* allocation in kswapd context */
>  	int			flags;
>  	struct completion	*done;
>  	struct work_struct	work;
> --
> 1.9.1
> 
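For readers following the quoted diff, the save/set/restore handling of the task flags in the worker can be modelled in userspace C roughly as below. This is only a sketch under stated assumptions: `struct ex_task`, the `EX_*` bit values and the `ex_*` helpers are hypothetical stand-ins invented for illustration, not the kernel's `current`, `PF_*` flags or the XFS `current_set_flags_nested()` / `current_restore_flags_nested()` helpers.

```c
#include <assert.h>

/* Hypothetical flag bits for illustration; NOT the kernel's PF_* values. */
#define EX_FSTRANS	0x1UL
#define EX_MEMALLOC	0x2UL
#define EX_SWAPWRITE	0x4UL
#define EX_KSWAPD	0x8UL

/* Hypothetical stand-in for the task whose flags are being changed. */
struct ex_task {
	unsigned long flags;
};

/* Remember the current flags, then set the requested bits. */
static void ex_set_flags_nested(struct ex_task *t, unsigned long *saved,
				unsigned long f)
{
	*saved = t->flags;
	t->flags |= f;
}

/* Put only the bits in f back to their saved state; leave other bits alone. */
static void ex_restore_flags_nested(struct ex_task *t,
				    const unsigned long *saved,
				    unsigned long f)
{
	t->flags = (t->flags & ~f) | (*saved & f);
}

/*
 * Model of the worker: returns the flags that were in effect while the
 * "allocation" ran, so a caller can inspect them.
 */
static unsigned long ex_worker(struct ex_task *t, int from_kswapd)
{
	unsigned long saved;
	unsigned long new_flags = EX_FSTRANS;
	unsigned long during;

	/* Inherit the reclaim-related bits only when the submitter was kswapd. */
	if (from_kswapd)
		new_flags |= EX_MEMALLOC | EX_SWAPWRITE | EX_KSWAPD;

	ex_set_flags_nested(t, &saved, new_flags);
	during = t->flags;	/* the allocation would run here */
	ex_restore_flags_nested(t, &saved, new_flags);

	return during;
}
```

The same mask has to be passed to both the set and the restore step, which is why the diff introduces the new_pflags variable instead of repeating PF_FSTRANS in the restore call.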
> 
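The commit message's remark about int->char conversion of flag fields can be illustrated with a small userspace sketch. It shows one way such a conversion may lose information; whether this is exactly what went wrong in v1 of the patch is an assumption. `EX_FLAG` and both helper functions are hypothetical names, and the sketch assumes an 8-bit char on a common compiler where the out-of-range conversion keeps only the low byte.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit above bit 7; NOT taken from the kernel sources. */
#define EX_FLAG	0x100

/*
 * Store the masked flag in a char. With an 8-bit char, only the low byte
 * survives, so a set flag in bit 8 reads back as 0 (for signed char the
 * conversion is implementation-defined; common compilers truncate).
 */
static int ex_flag_as_char(int flags)
{
	char c = (char)(flags & EX_FLAG);

	return c;
}

/* Store the masked flag in a bool. Any nonzero value converts to true. */
static int ex_flag_as_bool(int flags)
{
	bool b = flags & EX_FLAG;

	return b;
}
```

With flags set to EX_FLAG, the char variant is expected to return 0 under the assumptions above while the bool variant returns 1, which is the motivation for converting the fields in struct xfs_bmalloca to bool.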
