ACK: [SRU][X/B/D/E] [PATCH 0/1] PM / hibernate: fix potential memory corruption

Colin Ian King colin.king at canonical.com
Mon Oct 7 16:45:35 UTC 2019


On 07/10/2019 17:35, Andrea Righi wrote:
> BugLink: https://bugs.launchpad.net/bugs/1847118
> 
> [Impact]
> 
> A caching bug in the hibernation code can lead to memory corruption
> on resume.
> 
> The hibernation code represents all the allocated pages in memory
> (by pfn) using a list of extents. Inside each extent it uses a radix
> tree, and each node in the tree contains a bitmap. This structure is
> used to save the memory image to disk.
> 
> To speed up lookups in this structure the kernel caches the position
> of the previous lookup in the form (current_extent, current_node).
> However, if two consecutive lookups are distant enough from each other,
> the extent can change, but the kernel can still use the cached node
> (current_node), accessing the wrong bitmap and ending up saving the
> wrong pfns to disk.
> 
> [Test Case]
> 
> The bug has been reproduced in Xenial and Bionic by hibernating a large
> instance with a lot of RAM (100GB+).
> 
> We also wrote a custom kernel module to better isolate the code that
> triggers the problem: https://code.launchpad.net/~arighi/+git/mybitmap
> 
> This module has exactly the same code as the hibernation code, but it
> can be used as a fast test case to reproduce the problem without
> actually triggering a real hibernation/resume cycle.
> 
> [Fix]
> 
> This bug can be fixed by properly invalidating the cached pair (extent,
> node) when the next lookup falls in a different extent or a different
> node.
> 
> [Regression Potential]
> 
> The fix has been sent to the LKML for review/feedback
> (https://lkml.org/lkml/2019/9/25/393). We have not received any
> feedback so far, but the bug is pretty clear and the fix is well tested
> on the affected platforms. Moreover, the code is isolated to the
> hibernation area, so the overall regression potential is minimal.
> 
> ----------------------------------------------------------------
> Andy Whitcroft (1):
>       PM / hibernate: memory_bm_find_bit -- tighten node optimisation
> 
>  kernel/power/snapshot.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> 

I've been following this issue and the fix looks good to me.  Perhaps we
need to ping upstream again on this before the fix gets lost.
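
For anyone reading this in the archive, here is a minimal user-space
sketch of the stale-cache problem described under [Impact]. It is not the
code from kernel/power/snapshot.c: all struct and function names are
made up for illustration, the per-extent radix tree is flattened into a
small array, and the check_extent flag exists only to switch between an
unguarded fast path and a guarded one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes, chosen only to keep the example small. */
#define BITS_PER_NODE    64UL   /* pfns covered by one leaf bitmap */
#define NODES_PER_EXTENT 4

struct node { unsigned long bitmap; };

struct extent {
    unsigned long start_pfn, end_pfn;     /* covers [start_pfn, end_pfn) */
    struct node nodes[NODES_PER_EXTENT];  /* stand-in for the radix tree */
};

struct bm {
    struct extent *extents;
    size_t nr_extents;
    /* Cached position of the previous lookup. */
    struct extent *cur_extent;
    struct node *cur_node;
    unsigned long cur_node_off;  /* offset of cur_node inside cur_extent */
};

static struct extent *find_extent(struct bm *bm, unsigned long pfn)
{
    for (size_t i = 0; i < bm->nr_extents; i++) {
        struct extent *e = &bm->extents[i];
        if (pfn >= e->start_pfn && pfn < e->end_pfn)
            return e;
    }
    return NULL;
}

/*
 * Return the leaf node holding the bit for pfn, or NULL if no extent
 * covers it. With check_extent == 0 the fast path compares only the
 * node offset, so a cached node from another extent can be reused.
 * With check_extent != 0 the cached node is used only when the extent
 * matches as well.
 */
struct node *find_node(struct bm *bm, unsigned long pfn, int check_extent)
{
    struct extent *e = find_extent(bm, pfn);
    unsigned long off;

    if (!e)
        return NULL;

    /* Offset of the node's first pfn, relative to the extent start. */
    off = (pfn - e->start_pfn) & ~(BITS_PER_NODE - 1);

    if (bm->cur_node && off == bm->cur_node_off &&
        (!check_extent || e == bm->cur_extent))
        return bm->cur_node;              /* fast path: reuse the cache */

    /* Slow path: locate the node and refresh the whole cached pair. */
    bm->cur_extent = e;
    bm->cur_node = &e->nodes[off / BITS_PER_NODE];
    bm->cur_node_off = off;
    return bm->cur_node;
}

/* Returns 1 if a lookup in the second extent resolves to its own node. */
int second_lookup_is_correct(int check_extent)
{
    struct extent ext[2] = {
        { .start_pfn = 0,    .end_pfn = 256 },
        { .start_pfn = 1024, .end_pfn = 1280 },
    };
    struct bm bm = { .extents = ext, .nr_extents = 2 };

    /* First lookup primes the cache with extent 0, node offset 0. */
    struct node *first = find_node(&bm, 10, check_extent);
    assert(first == &ext[0].nodes[0]);

    /* pfn 1030 is in extent 1 but has the same node offset (0). */
    return find_node(&bm, 1030, check_extent) == &ext[1].nodes[0];
}
```

With the two extents above, second_lookup_is_correct(0) returns 0
because the second lookup reuses the node cached from the first extent,
while second_lookup_is_correct(1) returns 1.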

Acked-by: Colin Ian King <colin.king at canonical.com>

Colin
