[SRU B][C][PATCH 1/1] UBUNTU: SAUCE: cachefiles: Page leaking in cachefiles_read_backing_file while vmscan is active
Seth Forshee
seth.forshee at canonical.com
Thu Sep 20 08:24:39 UTC 2018
On Thu, Sep 20, 2018 at 03:50:57PM +1000, Daniel Axtens wrote:
> From: Kiran Kumar Modukuri <kiran.modukuri at gmail.com>
>
> BugLink: https://bugs.launchpad.net/bugs/1793430
>
> [Description]
> In a heavily loaded system where the system pagecache is nearing
> memory limits and fscache is enabled, pages can be leaked by fscache
> while trying read pages from cachefiles backend. This can happen
> because two applications can be reading same page from a single mount,
> two threads can be trying to read the backing page at same time. This
> results in one of the thread finding that a page for the backing file
> or netfs file is already in the radix tree. During the error handling
> cachefiles does not cleanup the reference on backing page, leading to
> page leak.
>
> [Fix]
> The fix is straightforward, to decrement the reference when error is
> encounterd.
>
> [Testing]
> I have tested the fix using following method for 12+ hrs.
>
> 1) mkdir -p /mnt/nfs ; mount -o vers=3,fsc <server_ip>:/export /mnt/nfs
> 2) create 10000 files of 2.8MB in a NFS mount.
> 3) start a thread to simulate heavy VM presssure
> (while true ; do echo 3 > /proc/sys/vm/drop_caches ; sleep 1 ; done)&
> 4) start multiple parallel reader for data set at same time
> find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
> find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
> find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
> ..
> ..
> find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
> find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
> 5) finally check using cat /proc/fs/fscache/stats | grep -i pages ;
> free -h , cat /proc/meminfo and page-types -r -b lru
> to ensure all pages are freed.
>
> Reviewed-by: Daniel Axtens <dja at axtens.net>
> Signed-off-by: Shantanu Goel <sgoel01 at yahoo.com>
> Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri at gmail.com>
> [dja: forward ported to current upstream]
> Signed-off-by: Daniel Axtens <dja at axtens.net>
> [applied from
> https://www.redhat.com/archives/linux-cachefs/2018-August/msg00007.html
> This is v2 of the patch. It has sat on the list for weeks without
> any response or forward progress. v1 was first posted in 2014 and
> was reposted this August.]
> Signed-off-by: Daniel Axtens <daniel.axtens at canonical.com>
I'm on the fence about this patch. On the one hand I don't think it's
harmful and does look to legitimately fix some leaks. On the other hand
it also makes changes that don't fix any leaks, afaict. Maybe you can
convince me that I'm incorrect.
> ---
> fs/cachefiles/rdwr.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
> index e09438612607..54d11248910c 100644
> --- a/fs/cachefiles/rdwr.c
> +++ b/fs/cachefiles/rdwr.c
> @@ -274,6 +274,8 @@ static int cachefiles_read_backing_file_one(struct cachefiles_object *object,
> goto installed_new_backing_page;
> if (ret != -EEXIST)
> goto nomem_page;
> + put_page(newpage);
> + newpage = NULL;
> }
I don't think this actually fixes any leaks. On the next loop iteration
it either finds a backing page and goes to backing_page_already_present,
where the reference is dropped, or else it tries again to install
newpage as the backing page. I don't see how a reference is dropped.
On the other hand, I don't see it as being harmful, except that maybe
it as to allocate a page again when it could have used the page from the
previous iteration.
>
> /* we've installed a new backing page, so now we need to start
> @@ -511,6 +513,8 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
> goto installed_new_backing_page;
> if (ret != -EEXIST)
> goto nomem;
> + put_page(newpage);
> + newpage = NULL;
> }
Wow, this has to be the most goto-happy function I have ever seen.
This one does look to prevent leaks.
>
> /* we've installed a new backing page, so now we need
> @@ -535,7 +539,10 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
> netpage->index, cachefiles_gfp);
> if (ret < 0) {
> if (ret == -EEXIST) {
> + put_page(backpage);
> + backpage = NULL;
This seems right.
> put_page(netpage);
> + netpage = NULL;
Unnecessary but not harmful, possibly good for clarity's sake.
> fscache_retrieval_complete(op, 1);
> continue;
> }
> @@ -608,6 +615,8 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
> netpage->index, cachefiles_gfp);
> if (ret < 0) {
> if (ret == -EEXIST) {
> + put_page(backpage);
> + backpage = NULL;
Seems correct.
> put_page(netpage);
Funny that netpage isn't also assigned NULL here, as it's no different
from the case above where it was.
> fscache_retrieval_complete(op, 1);
> continue;
> --
> 2.17.1
>
>
> --
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
More information about the kernel-team
mailing list