[3.13.y.z extended stable] Patch "xfs: block allocation work needs to be kswapd aware" has been added to staging queue
Kamal Mostafa
kamal at canonical.com
Tue Jul 15 21:29:34 UTC 2014
This is a note to let you know that I have just added a patch titled
xfs: block allocation work needs to be kswapd aware
to the linux-3.13.y-queue branch of the 3.13.y.z extended stable tree
which can be found at:
http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=shortlog;h=refs/heads/linux-3.13.y-queue
This patch is scheduled to be released in version 3.13.11.5.
If you, or anyone else, feels it should not be added to this tree, please
reply to this email.
For more information about the 3.13.y.z tree, see
https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable
Thanks.
-Kamal
------
>From 3b5d56a67fde780a205831a4e1438b8208d670c3 Mon Sep 17 00:00:00 2001
From: Dave Chinner <dchinner at redhat.com>
Date: Fri, 6 Jun 2014 15:59:59 +1000
Subject: xfs: block allocation work needs to be kswapd aware
commit 1f6d64829db78a7e1d63e15c9f48f0a5d2b5a679 upstream.
Upon memory pressure, kswapd calls xfs_vm_writepage() from
shrink_page_list(). This can result in delayed allocation occurring
and that gets deferred to the the allocation workqueue.
The allocation then runs outside kswapd context, which means if it
needs memory (and it does to demand page metadata from disk) it can
block in shrink_inactive_list() waiting for IO congestion. These
blocking waits are normally avoiding in kswapd context, so under
memory pressure writeback from kswapd can be arbitrarily delayed by
memory reclaim.
To avoid this, pass the kswapd context to the allocation being done
by the workqueue, so that memory reclaim understands correctly that
the work is being done for kswapd and therefore it is not blocked
and does not delay memory reclaim.
To avoid issues with int->char conversion of flag fields (as noticed
in v1 of this patch) convert the flag fields in the struct
xfs_bmalloca to bool types. pahole indicates these variables are
still single byte variables, so no extra space is consumed by this
change.
Reported-by: Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp>
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Dave Chinner <david at fromorbit.com>
Signed-off-by: Kamal Mostafa <kamal at canonical.com>
---
fs/xfs/xfs_bmap_util.c | 16 +++++++++++++---
fs/xfs/xfs_bmap_util.h | 13 +++++++------
2 files changed, 20 insertions(+), 9 deletions(-)
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 82e0dab..035d06a 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -258,14 +258,23 @@ xfs_bmapi_allocate_worker(
struct xfs_bmalloca *args = container_of(work,
struct xfs_bmalloca, work);
unsigned long pflags;
+ unsigned long new_pflags = PF_FSTRANS;
- /* we are in a transaction context here */
- current_set_flags_nested(&pflags, PF_FSTRANS);
+ /*
+ * we are in a transaction context here, but may also be doing work
+ * in kswapd context, and hence we may need to inherit that state
+ * temporarily to ensure that we don't block waiting for memory reclaim
+ * in any way.
+ */
+ if (args->kswapd)
+ new_pflags |= PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD;
+
+ current_set_flags_nested(&pflags, new_pflags);
args->result = __xfs_bmapi_allocate(args);
complete(args->done);
- current_restore_flags_nested(&pflags, PF_FSTRANS);
+ current_restore_flags_nested(&pflags, new_pflags);
}
/*
@@ -284,6 +293,7 @@ xfs_bmapi_allocate(
args->done = &done;
+ args->kswapd = current_is_kswapd();
INIT_WORK_ONSTACK(&args->work, xfs_bmapi_allocate_worker);
queue_work(xfs_alloc_wq, &args->work);
wait_for_completion(&done);
diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
index 900747b..f33711d 100644
--- a/fs/xfs/xfs_bmap_util.h
+++ b/fs/xfs/xfs_bmap_util.h
@@ -50,12 +50,13 @@ struct xfs_bmalloca {
xfs_extlen_t total; /* total blocks needed for xaction */
xfs_extlen_t minlen; /* minimum allocation size (blocks) */
xfs_extlen_t minleft; /* amount must be left after alloc */
- char eof; /* set if allocating past last extent */
- char wasdel; /* replacing a delayed allocation */
- char userdata;/* set if is user data */
- char aeof; /* allocated space at eof */
- char conv; /* overwriting unwritten extents */
- char stack_switch;
+ bool eof; /* set if allocating past last extent */
+ bool wasdel; /* replacing a delayed allocation */
+ bool userdata;/* set if is user data */
+ bool aeof; /* allocated space at eof */
+ bool conv; /* overwriting unwritten extents */
+ bool stack_switch;
+ bool kswapd; /* allocation in kswapd context */
int flags;
struct completion *done;
struct work_struct work;
--
1.9.1
More information about the kernel-team
mailing list