APPLIED: [SRU][X][PATCH 0/1] LP: #1750038 - stuck in D state in NFS op
Kleber Souza
kleber.souza at canonical.com
Tue May 15 10:42:02 UTC 2018
On 05/08/18 08:34, Daniel Axtens wrote:
> From: Daniel Axtens <daniel.axtens at canonical.com>
>
> == SRU Justification ==
>
> [Impact]
>
> Occasionally an application gets stuck in "D" state on NFS reads/sync
> and close system calls. All the subsequent operations on the NFS
> mounts are stuck and a reboot is required to rectify the situation.
>
> [Fix]
>
> Use GFP_NOIO for some allocations in writeback to avoid a
> deadlock. This is upstream in commit ae97aa524ef4 ("NFS: Use
> GFP_NOIO for two allocations in writeback").
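>
> For reference, the general shape of the change is sketched below. This
> is illustrative only, not the actual hunk from fs/nfs/pagelist.c; the
> helper name is hypothetical:
>
>     #include <linux/slab.h>
>     #include <linux/gfp.h>
>
>     /*
>      * Hypothetical allocation helper on the NFS writeback path.
>      * GFP_KERNEL here can recurse into direct reclaim, and reclaim
>      * may then wait on writeback of the very dirty pages this
>      * allocation is servicing, which deadlocks. GFP_NOIO tells the
>      * allocator not to start any I/O while reclaiming.
>      */
>     static void *nfs_wb_alloc(size_t len)
>     {
>             return kmalloc(len, GFP_NOIO);  /* was GFP_KERNEL */
>     }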
>
> [Testcase]
> See "Test scenario" in the previous description below.
>
> A test kernel with this patch was tested heavily (more than 100 hours
> of test-suite runs) without issue.
>
> [Regression Potential]
> This changes memory allocation in NFS to use a different policy. This
> could potentially affect NFS, or increase the general risk of complex
> OOMs.
>
> However, the patch is already in Artful and Bionic without issue.
>
> The patch does not apply to Trusty.
>
> == Previous Description ==
>
> A user running Ubuntu Xenial reports processes hanging in "D" state,
> waiting for disk I/O.
>
> Occasionally one of the applications gets stuck in "D" state on NFS
> read/sync and close system calls. Based on the kernel backtraces, it
> appears to be stuck in a kmalloc allocation during cleanup of dirty
> NFS pages.
>
> All the subsequent operations on the NFS mounts are stuck and reboot
> is required to rectify the situation.
>
> [Test scenario]
>
> 1) Applications run in a Docker environment
> 2) Applications have cgroup limits: --cpu-shares, --memory, -shm-limit
> 3) Python and C++ based applications (torch and caffe)
> 4) Applications read big lmdb files and write results to NFS shares
>    (see the reproducer sketch below)
> 5) NFS v3 is used with the "hard" mount option and fscache enabled
> 6) No swap space is configured
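>
> A minimal reproducer sketch for step 4 is below. It is a hypothetical
> stand-in for the real torch/caffe jobs; the paths and chunk size are
> made up:
>
>     /* repro.c: read a large local file and write results to an NFS
>      * share, syncing after each chunk, inside a memory-limited
>      * container. */
>     #include <fcntl.h>
>     #include <stdlib.h>
>     #include <unistd.h>
>
>     int main(void)
>     {
>             size_t chunk = 1 << 20;              /* 1 MiB */
>             char *buf = malloc(chunk);
>             int in  = open("/data/big.lmdb", O_RDONLY);
>             int out = open("/mnt/nfs/results.bin",
>                            O_WRONLY | O_CREAT | O_TRUNC, 0644);
>             ssize_t n;
>
>             if (!buf || in < 0 || out < 0)
>                     return 1;
>             while ((n = read(in, buf, chunk)) > 0) {
>                     if (write(out, buf, n) != n)
>                             return 1;
>                     fsync(out);   /* forces NFS writeback while the
>                                      cgroup memory limit is hit */
>             }
>             close(in);
>             close(out);
>             free(buf);
>             return 0;
>     }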
>
> This causes all other I/O activity on that mount to hang.
>
> We are running into this issue more frequently and have identified a
> few applications that trigger the problem.
>
> As noted in the description, the problem seems to happen when
> exercising this path in the stack trace:
>
> try_to_free_mem_cgroup_pages+0xba/0x1a0
>
> We see this with Docker containers started with the cgroup option
> --memory <USER_SPECIFIED_MEM>.
>
> Whenever there is a deadlock, we see that the hung process has hit
> the maximum cgroup limit multiple times and typically cleans up dirty
> data and caches to bring usage back under the limit.
>
> This reclaim path is taken many times, and eventually we probably hit
> a race and end up in the deadlock.
>
> Benjamin Coddington (1):
> NFS: Use GFP_NOIO for two allocations in writeback
>
> fs/nfs/pagelist.c | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)
>
Applied to xenial/master-next branch.
Thanks,
Kleber