ACK: [SRU][N][O][PATCH 0/1] MGLRU: kswapd uses 100% CPU when MGLRU is enabled and under memory pressure
Agathe Porte
agathe.porte at canonical.com
Fri Nov 22 09:56:29 UTC 2024
2024-11-22 00:06 CET, Matthew Ruffell:
> BugLink: https://bugs.launchpad.net/bugs/2087886
>
> [Impact]
>
> On systems with MGLRU enabled, which is the default, kswapd can spin at 100%
> CPU endlessly, causing severe performance issues. This happens when the system
> is under memory pressure and an allocation wakes up kswapd to attempt page
> reclaim, but the system has enough memory that kswapd doesn't get OOM killed,
> and there are no pages to actually reclaim.
>
> What's happening is that lru_gen_shrink_node() unconditionally clears
> kswapd_failures, which can prevent kswapd from sleeping and cause 100% kswapd
> cpu usage even when kswapd repeatedly fails to make progress in reclaim.
>
> A workaround is to disable MGLRU, and the issue does not occur.
>
> [Fix]
>
> The fix is to only clear kswapd_failures in lru_gen_shrink_node() if reclaim
> makes some progress, similar to shrink_node().
>
> This was fixed in 6.12-rc4 by the commit:
>
> commit b130ba4a6259f6b64d8af15e9e7ab1e912bcb7ad
> Author: Wei Xu <weixugc at google.com>
> Date: Mon Oct 14 22:12:11 2024 +0000
> Subject: mm/mglru: only clear kswapd_failures if reclaimable
> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b130ba4a6259f6b64d8af15e9e7ab1e912bcb7ad
>
> Both Noble and Oracular need this fix.
>
> [Testcase]
>
> The systems with this issue are DPDK compute nodes running OpenStack Yoga on
> Jammy. If you leave them for several days, they will hit this issue and their
> kswapd processes will go to 100% CPU and never drop.
>
>   PID USER  PR  NI  VIRT  RES  SHR  S   %CPU  %MEM    TIME+  COMMAND
>   846 root  20   0     0    0    0  R  100.0   0.0  1915:16  kswapd0
>   846 root  20   0     0    0    0  R  100.0   0.0  1915:16  kswapd0
>   846 root  20   0     0    0    0  R  100.0   0.0  1915:17  kswapd0
>   846 root  20   0     0    0    0  R   99.0   0.0  1915:18  kswapd0
>
> If you disable MGLRU however, they drop to 0% and sleep immediately.
>
> There is a test kernel available in the following ppa:
>
> https://launchpad.net/~mruffell/+archive/ubuntu/sf390959-test
>
> If you install the test kernel, the system should remain stable, and kswapd will
> not go to 100% cpu after some time.
>
> [Where problems could occur]
>
> We are adding a small requirement to clearing kswapd_failures: that we actually
> reclaim some memory. If we do reclaim some memory, the behaviour is the same as
> what we have now: we clear kswapd_failures and continue.
>
> If we don't manage to reclaim any memory, we leave kswapd_failures as is, and
> let kswapd sleep, to try again at a later time.
>
> If a regression were to occur, it would affect systems with MGLRU enabled,
> which is the default. A regression would show up as problems with kswapd
> memory reclaim, or as high CPU usage by kswapd.
>
> Wei Xu (1):
> mm/mglru: only clear kswapd_failures if reclaimable
>
> mm/vmscan.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
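
For anyone reading along, the behaviour change described in [Fix] can be
sketched as a small user-space model. This is not the kernel code: the names
node_model, shrink_node_model() and passes_until_sleep(), the "fixed" flag,
and the retry threshold of 16 are all assumptions made for illustration. The
model only shows why resetting the failure counter unconditionally keeps
kswapd from ever giving up, while resetting it only on progress lets the
counter reach the threshold.

```c
#include <stdbool.h>

/* Assumed threshold for this model: failed passes before kswapd gives up. */
#define MAX_RECLAIM_RETRIES 16

/* Hypothetical stand-in for the per-node state that holds kswapd_failures. */
struct node_model {
    int kswapd_failures;
};

/*
 * Model of one MGLRU shrink pass.
 * fixed == false: clear the counter unconditionally (behaviour described
 *                 in [Impact]).
 * fixed == true:  clear it only if this pass reclaimed something
 *                 (behaviour described in [Fix]).
 */
static void shrink_node_model(struct node_model *n, unsigned long reclaimed,
                              bool fixed)
{
    if (!fixed || reclaimed > 0)
        n->kswapd_failures = 0;
}

/*
 * Model of the kswapd loop on a node with nothing reclaimable.
 * Returns how many passes ran before the failure counter reached the
 * threshold, or 'limit' if it never did (kswapd keeps spinning).
 */
static int passes_until_sleep(bool fixed, int limit)
{
    struct node_model n = { 0 };
    int passes;

    for (passes = 0; passes < limit; passes++) {
        if (n.kswapd_failures >= MAX_RECLAIM_RETRIES)
            break;                      /* kswapd gives up and sleeps */

        unsigned long reclaimed = 0;    /* no pages to reclaim */

        shrink_node_model(&n, reclaimed, fixed);
        if (reclaimed == 0)
            n.kswapd_failures++;        /* pass made no progress */
    }
    return passes;
}
```

In this model, passes_until_sleep(false, 1000) runs all 1000 passes because
the counter is reset on every pass, whereas passes_until_sleep(true, 1000)
stops once the counter reaches the threshold.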
Acked-by: Agathe Porte <agathe.porte at canonical.com>