[3.13.y.z extended stable] Patch "xfs: block allocation work needs to be kswapd aware" has been added to staging queue
Tetsuo Handa
penguin-kernel at I-love.SAKURA.ne.jp
Tue Jul 15 21:44:39 UTC 2014
Kamal Mostafa wrote:
> This is a note to let you know that I have just added a patch titled
>
> xfs: block allocation work needs to be kswapd aware
>
> to the linux-3.13.y-queue branch of the 3.13.y.z extended stable tree
> which can be found at:
>
> http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=shortlog;h=refs/heads/linux-3.13.y-queue
>
> This patch is scheduled to be released in version 3.13.11.5.
>
> If you, or anyone else, feels it should not be added to this tree, please
> reply to this email.
>
This patch should not be added to this tree.
------- Forwarded Message
From: Dave Chinner <david at fromorbit.com>
To: gregkh at linuxfoundation.org
Cc: dchinner at redhat.com, hch at lst.de, penguin-kernel at I-love.SAKURA.ne.jp, stable at vger.kernel.org, stable-commits at vger.kernel.org
Subject: Re: Patch "xfs: block allocation work needs to be kswapd aware" has been added to the 3.14-stable tree
Date: Thu, 3 Jul 2014 11:52:21 +1000
On Wed, Jul 02, 2014 at 04:54:41PM -0700, gregkh at linuxfoundation.org wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> xfs: block allocation work needs to be kswapd aware
>
> to the 3.14-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
Please drop it - this patch is due to be reverted because of
severe performance regressions in low memory situations.
Cheers,
Dave.
--
Dave Chinner
david at fromorbit.com
------- End of Forwarded Message
> For more information about the 3.13.y.z tree, see
> https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable
>
> Thanks.
> -Kamal
>
> ------
>
> From 3b5d56a67fde780a205831a4e1438b8208d670c3 Mon Sep 17 00:00:00 2001
> From: Dave Chinner <dchinner at redhat.com>
> Date: Fri, 6 Jun 2014 15:59:59 +1000
> Subject: xfs: block allocation work needs to be kswapd aware
>
> commit 1f6d64829db78a7e1d63e15c9f48f0a5d2b5a679 upstream.
>
> Upon memory pressure, kswapd calls xfs_vm_writepage() from
> shrink_page_list(). This can result in delayed allocation occurring
> and that gets deferred to the allocation workqueue.
>
> The allocation then runs outside kswapd context, which means if it
> needs memory (and it does to demand page metadata from disk) it can
> block in shrink_inactive_list() waiting for IO congestion. These
> blocking waits are normally avoided in kswapd context, so under
> memory pressure writeback from kswapd can be arbitrarily delayed by
> memory reclaim.
>
> To avoid this, pass the kswapd context to the allocation being done
> by the workqueue, so that memory reclaim understands correctly that
> the work is being done for kswapd and therefore it is not blocked
> and does not delay memory reclaim.
>
> To avoid issues with int->char conversion of flag fields (as noticed
> in v1 of this patch) convert the flag fields in the struct
> xfs_bmalloca to bool types. pahole indicates these variables are
> still single byte variables, so no extra space is consumed by this
> change.
>
> Reported-by: Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp>
> Signed-off-by: Dave Chinner <dchinner at redhat.com>
> Reviewed-by: Christoph Hellwig <hch at lst.de>
> Signed-off-by: Dave Chinner <david at fromorbit.com>
> Signed-off-by: Kamal Mostafa <kamal at canonical.com>
> ---
> fs/xfs/xfs_bmap_util.c | 16 +++++++++++++---
> fs/xfs/xfs_bmap_util.h | 13 +++++++------
> 2 files changed, 20 insertions(+), 9 deletions(-)
>
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index 82e0dab..035d06a 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -258,14 +258,23 @@ xfs_bmapi_allocate_worker(
>          struct xfs_bmalloca *args = container_of(work,
>                                          struct xfs_bmalloca, work);
>          unsigned long pflags;
> +        unsigned long new_pflags = PF_FSTRANS;
>
> -        /* we are in a transaction context here */
> -        current_set_flags_nested(&pflags, PF_FSTRANS);
> +        /*
> +         * we are in a transaction context here, but may also be doing work
> +         * in kswapd context, and hence we may need to inherit that state
> +         * temporarily to ensure that we don't block waiting for memory reclaim
> +         * in any way.
> +         */
> +        if (args->kswapd)
> +                new_pflags |= PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD;
> +
> +        current_set_flags_nested(&pflags, new_pflags);
>
>          args->result = __xfs_bmapi_allocate(args);
>          complete(args->done);
>
> -        current_restore_flags_nested(&pflags, PF_FSTRANS);
> +        current_restore_flags_nested(&pflags, new_pflags);
>  }
>
> /*
> @@ -284,6 +293,7 @@ xfs_bmapi_allocate(
>
>
>          args->done = &done;
> +        args->kswapd = current_is_kswapd();
>          INIT_WORK_ONSTACK(&args->work, xfs_bmapi_allocate_worker);
>          queue_work(xfs_alloc_wq, &args->work);
>          wait_for_completion(&done);
> diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
> index 900747b..f33711d 100644
> --- a/fs/xfs/xfs_bmap_util.h
> +++ b/fs/xfs/xfs_bmap_util.h
> @@ -50,12 +50,13 @@ struct xfs_bmalloca {
>          xfs_extlen_t total; /* total blocks needed for xaction */
>          xfs_extlen_t minlen; /* minimum allocation size (blocks) */
>          xfs_extlen_t minleft; /* amount must be left after alloc */
> -        char eof; /* set if allocating past last extent */
> -        char wasdel; /* replacing a delayed allocation */
> -        char userdata;/* set if is user data */
> -        char aeof; /* allocated space at eof */
> -        char conv; /* overwriting unwritten extents */
> -        char stack_switch;
> +        bool eof; /* set if allocating past last extent */
> +        bool wasdel; /* replacing a delayed allocation */
> +        bool userdata;/* set if is user data */
> +        bool aeof; /* allocated space at eof */
> +        bool conv; /* overwriting unwritten extents */
> +        bool stack_switch;
> +        bool kswapd; /* allocation in kswapd context */
>          int flags;
>          struct completion *done;
>          struct work_struct work;
> --
> 1.9.1
>
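For anyone following the worker hunk quoted above, the save/set/restore flag handling can be sketched in user space roughly as follows. This is only a model, not the kernel code: struct task_model, set_flags_nested(), restore_flags_nested() and worker_model() are hypothetical stand-ins for current, current_set_flags_nested(), current_restore_flags_nested() and xfs_bmapi_allocate_worker(), and the PF_* bit values are made up for the model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for this model; the kernel's real PF_* values differ. */
#define PF_FSTRANS   0x1UL
#define PF_MEMALLOC  0x2UL
#define PF_SWAPWRITE 0x4UL
#define PF_KSWAPD    0x8UL

/* Stand-in for the per-task state the kernel reaches through "current". */
struct task_model {
    unsigned long flags;
};

/* Save the task's flags, then OR in the requested bits. */
static void set_flags_nested(struct task_model *t, unsigned long *saved,
                             unsigned long f)
{
    *saved = t->flags;
    t->flags |= f;
}

/* Put only the bits in f back to their saved values; leave other bits alone. */
static void restore_flags_nested(struct task_model *t,
                                 const unsigned long *saved, unsigned long f)
{
    t->flags = (t->flags & ~f) | (*saved & f);
}

/*
 * Model of the patched worker: returns the flags that were in effect while
 * the allocation would have run. "kswapd" plays the role of args->kswapd.
 */
static unsigned long worker_model(struct task_model *worker, bool kswapd)
{
    unsigned long pflags;
    unsigned long new_pflags = PF_FSTRANS;
    unsigned long during;

    if (kswapd)
        new_pflags |= PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD;

    set_flags_nested(worker, &pflags, new_pflags);
    during = worker->flags;        /* the allocation would run here */
    assert(during & PF_FSTRANS);   /* always in transaction context */
    restore_flags_nested(worker, &pflags, new_pflags);

    return during;
}
```

Restoring with the same mask that was set is the point of the new_pflags variable: only the bits the worker added are returned to their saved values, so a worker thread that already had one of them set keeps it afterwards.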
>
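The int->char remark in the quoted commit message can be illustrated with a small sketch. It assumes a flag test whose mask sits above bit 7; MODEL_FLAG is a hypothetical value chosen for the example, and the helper names are made up. Storing the masked result in a char drops the set bit on common compilers, while a bool records whether any bit was set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit just above the low byte; chosen only for this example. */
#define MODEL_FLAG 0x100UL

/* Store a masked flag test in a char, as a char-typed struct field would. */
static char flag_as_char(unsigned long flags)
{
    /*
     * Out-of-range conversion to char is implementation-defined in C;
     * gcc and clang keep only the low 8 bits, so the flag is lost.
     */
    return (char)(flags & MODEL_FLAG);
}

/* Store the same test in a bool: any nonzero value converts to true. */
static bool flag_as_bool(unsigned long flags)
{
    return (bool)(flags & MODEL_FLAG);
}

/* Usage: with gcc or clang the char copy loses the flag, the bool keeps it. */
static void demo(void)
{
    assert(flag_as_char(MODEL_FLAG) == 0);
    assert(flag_as_bool(MODEL_FLAG));
}
```

This is one way such a conversion problem can show up; whether it matches the exact issue seen in v1 of the patch is an assumption.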