ACK: [SRU][X/B/D/E] [PATCH 0/1] PM / hibernate: fix potential memory corruption
Colin Ian King
colin.king at canonical.com
Mon Oct 7 16:45:35 UTC 2019
On 07/10/2019 17:35, Andrea Righi wrote:
> BugLink: https://bugs.launchpad.net/bugs/1847118
>
> [Impact]
>
> A caching bug in the hibernation code can lead to potential memory
> corruptions on resume.
>
> The hibernation code is representing all the allocated pages in memory
> (pfn) using a list of extents, inside each extent it uses a radix tree
> and each node in the tree contains a bitmap. This structure is used to
> save the memory image to disk.
>
> To speed up lookups in this structure the kernel is caching the position
> of the previous lookup in the form (current_extent, current_node).
> However, if two consecutive lookups are distant enough from each other,
> the extent can change, but the kernel can still use the cached node
> (current_node), accessing the wrong bitmap and ending up saving to disk
> the wrong pfn's.
>
> [Test Case]
>
> Bug has been reproduced in Xenial and Bionic trying to hibernate a large
> instance with a lot of RAM (100GB+).
>
> But we also wrote a custom kernel module to better isolate the code that
> triggers the problem: https://code.launchpad.net/~arighi/+git/mybitmap
>
> This module has exactly the same code as the hibernation code, but it
> can be used as a fast test case to reproduce the problem without
> actually triggering a real hibernation/resume cycle.
>
> [Fix]
>
> This bug can be fixed by properly invalidating the cached pair (extent,
> node) when the next lookup falls in a different extent or a different
> node.
>
> [Regression Potential]
>
> The fix has been sent to the LKML for review/feedback
> (https://lkml.org/lkml/2019/9/25/393), we have not received any feedback
> so far, but the bug is pretty clear and well tested on the affected
> platforms. Moreover, the code is isolated to the hibernation area, so
> the overall regression potential is minimal.
>
> ----------------------------------------------------------------
> Andy Whitcroft (1):
> PM / hibernate: memory_bm_find_bit -- tighten node optimisation
>
> kernel/power/snapshot.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
>
I've been following this issue, the fix looks good to me. Perhaps we
need to ping upstream again on this before the fix gets lost.
Acked-by: Colin Ian King <colin.king at canonical.com>
Colin
More information about the kernel-team
mailing list