[SRU B][C][PATCH 1/1 v2] UBUNTU: SAUCE: cachefiles: Page leaking in cachefiles_read_backing_file while vmscan is active

Daniel Axtens daniel.axtens at canonical.com
Fri Sep 21 02:59:40 UTC 2018


From: Daniel Axtens <dja at axtens.net>

FYI, Kiran's response hasn't hit the list for some reason, so I've kept
it in full below. Perhaps it's stuck in moderation.


I found I was getting confused between the
cachefiles_read_backing_file_one() function, which was the first hunk of
the original patch, and cachefiles_read_backing_file(), which has the
other 3 hunks of the original patch.

(For clarity, I'm referring to this bit:

--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -274,6 +274,8 @@ static int cachefiles_read_backing_file_one(struct cachefiles_object *object,
                        goto installed_new_backing_page;
                if (ret != -EEXIST)
                        goto nomem_page;
+               put_page(newpage);
+               newpage = NULL;
        }

)

In cachefiles_read_backing_file(), there is an equivalent section to
this already - this just brings the two functions in to alignment.

Seth's question was whether or not that would make a difference. The
case we are concerned about is what happens if you reach the end of the
loop, the find a backing page and goto backing_page_already_present.

In cachefiles_read_backing_file_one(), backing_page_already_present
starts around line 317, and one of the first things it does is check if
newpage is not NULL and puts it if so. So we don't have a leak.

On the other hand, the parallel backing_page_already_present in
cachefiles_read_backing_file() (around line 577) does *not* free newpage
if it exists. However, we also don't have a leak here as the loop in
cachefiles_read_backing_file() *does* put_page(newpage) unconditionally
at the end of the loop. Confusing!

While it would be desirable to bring these functions into alignment, we
can do that upstream rather than in a stable release. So I think we can
safely drop the changes to cachefiles_read_backing_file_one() and just
focus on cachefiles_read_backing_file().

If we're just focussing on that function, we can safely leave all 3 of
the original hunks from patch v1. I agree that some of the sets of NULL
are not strictly necessary and some are missing where you'd expect
them. Ideally I'd like to keep the delta between what we have here and
what we're hoping will go upstream to a minimum, so how does the
following look as both a patch for Ubuntu and a patch for upstream? 


diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index e09438612607..4d03a0c93eda 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -511,6 +511,8 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
                                goto installed_new_backing_page;
                        if (ret != -EEXIST)
                                goto nomem;
+                       put_page(newpage);
+                       newpage = NULL;
                }
 
                /* we've installed a new backing page, so now we need
@@ -535,7 +537,10 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
                                            netpage->index, cachefiles_gfp);
                if (ret < 0) {
                        if (ret == -EEXIST) {
+                               put_page(backpage);
+                               backpage = NULL;
                                put_page(netpage);
+                               netpage = NULL;
                                fscache_retrieval_complete(op, 1);
                                continue;
                        }
@@ -608,7 +613,10 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
                                            netpage->index, cachefiles_gfp);
                if (ret < 0) {
                        if (ret == -EEXIST) {
+                               put_page(backpage);
+                               backpage = NULL;
                                put_page(netpage);
+                               netpage = NULL;
                                fscache_retrieval_complete(op, 1);
                                continue;
                        }

If it looks OK to both of you I'll do another backport and submission.

Regards,
Daniel

kiran.modukuri at gmail.com writes:

> From: kmodukuri <kmodukuri at nvidia.com>
>
> BugLink: https://bugs.launchpad.net/bugs/1793430
>
> v2 update: Addressed Seth, Forshee's comment about the unecessary freeing of newpage in the for loop, by restricting the freeing to the case when the
> code jumps to "backing_page_already_present" in function cachefiles_read_backing_file.
>
> [Description]
> In a heavily loaded system where the system pagecache is nearing memory limits and fscache is enabled,
> pages can be leaked by fscache while trying read pages from cachefiles backend.
> This can happen because two applications can be reading same page from a single mount,
> two threads can be trying to read the backing page at same time. This results in one of the thread
> finding that a page for the backing file or netfs file is already in the radix tree. During the error
> handling cachefiles does not cleanup the reference on backing page, leading to page leak.
>
> [Fix]
> The fix is straightforward, to decrement the reference when error is encounterd.
>
> [Testing]
> I have tested the fix using following method for 12+ hrs.
>
> 1) mkdir -p /mnt/nfs ; mount -o vers=3,fsc <server_ip>:/export /mnt/nfs
> 2) create 10000 files of 2.8MB in a NFS mount.
> 3) start a thread to simulate heavy VM presssure
>    (while true ; do echo 3 > /proc/sys/vm/drop_caches ; sleep 1 ; done)&
> 4) start multiple parallel reader for data set at same time
>    find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
>    find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
>    find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
>    ..
>    ..
>    find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
>    find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
> 5) finally check using cat /proc/fs/fscache/stats | grep -i pages ;
>    free -h , cat /proc/meminfo and page-types -r -b lru
>    to ensure all pages are freed.
>
> Signed-off-by: Shantanu Goel <sgoel01 at yahoo.com>
> Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri at gmail.com>
> [dja: forward ported to current upstream]
> Signed-off-by: Daniel Axtens <dja at axtens.net>
> ---
>  fs/cachefiles/rdwr.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
> index 40f7595..b33a4b6 100644
> --- a/fs/cachefiles/rdwr.c
> +++ b/fs/cachefiles/rdwr.c
> @@ -535,7 +535,10 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
>  					    netpage->index, cachefiles_gfp);
>  		if (ret < 0) {
>  			if (ret == -EEXIST) {
> +				put_page(backpage);
> +				backpage = NULL;
>  				put_page(netpage);
> +				netpage = NULL;
>  				fscache_retrieval_complete(op, 1);
>  				continue;
>  			}
> @@ -570,6 +573,10 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
>  		/* if the backing page is already present, it can be in one of
>  		 * three states: read in progress, read failed or read okay */
>  	backing_page_already_present:
> +		if (newpage) {
> +			put_page(newpage);
> +			newpage = NULL;
> +		}
>  		_debug("- present %p", backpage);
>  
>  		if (PageError(backpage))
> @@ -608,6 +615,8 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
>  					    netpage->index, cachefiles_gfp);
>  		if (ret < 0) {
>  			if (ret == -EEXIST) {
> +				put_page(backpage);
> +				backpage = NULL;
>  				put_page(netpage);
>  				fscache_retrieval_complete(op, 1);
>  				continue;
> -- 
> 2.7.4




More information about the kernel-team mailing list