ACK: [SRU][N][O][PATCH 0/1] MGLRU: kswapd uses 100% CPU when MGLRU is enabled and under memory pressure

Fri Nov 22 09:56:29 UTC 2024

2024-11-22 00:06 CET, Matthew Ruffell:
> BugLink: https://bugs.launchpad.net/bugs/2087886
> 
> [Impact]
> 
> On systems with MGLRU enabled (the default), kswapd can spin at 100% CPU
> endlessly, causing severe performance issues. This happens when the system is
> under memory pressure and an allocation wakes up kswapd to attempt page
> reclaim, but there are no pages to actually reclaim, and the system still has
> enough memory that kswapd doesn't get OOM killed.
> 
> What's happening is that lru_gen_shrink_node() unconditionally clears 
> kswapd_failures, which can prevent kswapd from sleeping and cause 100% kswapd 
> cpu usage even when kswapd repeatedly fails to make progress in reclaim.
> 
> A workaround is to disable MGLRU, and the issue does not occur.
> 
> [Fix]
> 
> The fix is to only clear kswapd_failures in lru_gen_shrink_node() if reclaim
> makes some progress, similar to shrink_node().
> 
> This was fixed in 6.12-rc4 by the commit:
> 
> commit b130ba4a6259f6b64d8af15e9e7ab1e912bcb7ad
> Author: Wei Xu <weixugc at google.com>
> Date:   Mon Oct 14 22:12:11 2024 +0000
> Subject: mm/mglru: only clear kswapd_failures if reclaimable
> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b130ba4a6259f6b64d8af15e9e7ab1e912bcb7ad
> 
> Both Noble and Oracular need this fix.
> 
> [Testcase]
> 
> The systems with this issue are DPDK compute nodes running OpenStack Yoga on
> Jammy. If you leave them for several days, they will hit this issue and their
> kswapd processes will go to 100% CPU and never drop.
> 
>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>     846 root      20   0       0      0      0 R 100.0   0.0   1915:16 kswapd0
>     846 root      20   0       0      0      0 R 100.0   0.0   1915:16 kswapd0
>     846 root      20   0       0      0      0 R 100.0   0.0   1915:17 kswapd0
>     846 root      20   0       0      0      0 R  99.0   0.0   1915:18 kswapd0
> 
> If you disable MGLRU, however, kswapd drops to 0% CPU and sleeps immediately.
> 
> There is a test kernel available in the following ppa:
> 
> https://launchpad.net/~mruffell/+archive/ubuntu/sf390959-test
> 
> If you install the test kernel, the system should remain stable, and kswapd will
> not go to 100% cpu after some time.
> 
> [Where problems could occur]
> 
> We are adding a small requirement to clearing kswapd_failures: that we actually
> reclaim some memory. If we do reclaim some memory, the behaviour is the same as
> what we have now: we clear kswapd_failures and continue.
> 
> If we don't manage to reclaim any memory, we leave kswapd_failures as is, and
> let kswapd sleep, to try again at a later time.
> 
> If a regression were to occur, it would affect MGLRU users, which is the default.
> A regression would look like issues with kswapd memory reclaim, or lead to high
> cpu usage with kswapd.
> 
> Wei Xu (1):
>   mm/mglru: only clear kswapd_failures if reclaimable
> 
>  mm/vmscan.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
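
For anyone reviewing the logic, the behaviour change described above can be
illustrated with a minimal user-space sketch. This is a toy model and not the
kernel code or the patch itself: struct node_state, shrink_pass() and
passes_until_sleep() are hypothetical names made up for illustration, and the
retry limit of 16 is assumed to mirror MAX_RECLAIM_RETRIES in mainline. It only
models the case where every pass finds nothing to reclaim.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: mirrors the kernel's retry limit before kswapd gives up. */
#define MAX_RECLAIM_RETRIES 16

/* Hypothetical stand-in for the per-node state kswapd consults. */
struct node_state {
	int kswapd_failures;
};

/*
 * Toy model of one shrink pass.
 * clear_unconditionally == true  models the behaviour before the fix.
 * clear_unconditionally == false models the behaviour after the fix:
 * the failure counter is only reset when the pass reclaimed something.
 */
static void shrink_pass(struct node_state *node, unsigned long reclaimed_pages,
			bool clear_unconditionally)
{
	if (clear_unconditionally || reclaimed_pages > 0)
		node->kswapd_failures = 0;
}

/*
 * Simulate repeated passes that reclaim nothing. Returns the pass on which
 * the failure counter reaches the retry limit (kswapd may then sleep), or
 * -1 if that never happens within max_passes.
 */
static int passes_until_sleep(bool clear_unconditionally, int max_passes)
{
	struct node_state node = { .kswapd_failures = 0 };

	assert(max_passes > 0);

	for (int pass = 1; pass <= max_passes; pass++) {
		/* Nothing is reclaimable in this scenario. */
		shrink_pass(&node, 0, clear_unconditionally);

		/* The pass made no progress, so it is counted as a failure. */
		node.kswapd_failures++;

		if (node.kswapd_failures >= MAX_RECLAIM_RETRIES)
			return pass;
	}
	return -1;
}
```

In this model, passes_until_sleep(true, 1000) returns -1, because the counter
is reset on every pass and never reaches the limit, which corresponds to the
endless spin. passes_until_sleep(false, 1000) returns 16, because the failures
accumulate and kswapd is allowed to back off.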

Acked-by: Agathe Porte <agathe.porte at canonical.com>