[PATCH 3.16.y-ckt 179/183] proc/pagemap: walk page tables under pte lock

Luis Henriques luis.henriques at canonical.com
Fri Mar 6 09:57:50 UTC 2015

3.16.7-ckt8 -stable review patch.  If anyone has any objections, please let me know.


From: Konstantin Khlebnikov <khlebnikov at yandex-team.ru>

commit 05fbf357d94152171bc50f8a369390f1f16efd89 upstream.

Lockless access to pte in pagemap_pte_range() might race with page
migration and trigger BUG_ON(!PageLocked()) in migration_entry_to_page():

CPU A (pagemap)                           CPU B (migration)
                                          try_to_unmap(page, TTU_MIGRATION...)
<read *pte>

Also, a lockless read might be non-atomic if a pte is larger than the word size.
Other pte walkers (smaps, numa_maps, clear_refs) already lock ptes.

Fixes: 052fb0d635df ("proc: report file/anon bit in /proc/pid/pagemap")
Signed-off-by: Konstantin Khlebnikov <khlebnikov at yandex-team.ru>
Reported-by: Andrey Ryabinin <a.ryabinin at samsung.com>
Reviewed-by: Cyrill Gorcunov <gorcunov at openvz.org>
Acked-by: Naoya Horiguchi <n-horiguchi at ah.jp.nec.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov at linux.intel.com>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques at canonical.com>
---
 fs/proc/task_mmu.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)
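
The control flow this patch introduces (take the lock once, walk every
entry, break on error, unlock on the single exit path) can be sketched in
userspace. This is only an analogy under stated assumptions: struct table,
struct out_buf, table_lock(), table_unlock(), emit() and walk_locked() are
hypothetical names, a C11 atomic_flag stands in for the pte spinlock, and
none of this is kernel code.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
#define NR_SLOTS 8

struct table {
	atomic_flag lock;		/* plays the role of the pte lock (ptl) */
	unsigned long slot[NR_SLOTS];	/* plays the role of the ptes */
};

struct out_buf {
	unsigned long *data;
	size_t pos, len;
};

static void table_lock(struct table *t)
{
	/* Spin until the flag was previously clear, i.e. we now own it. */
	while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
		;
}

static void table_unlock(struct table *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Analogue of add_to_pagemap(): fails once the output buffer is full. */
static int emit(struct out_buf *out, unsigned long val)
{
	if (out->pos >= out->len)
		return -1;
	out->data[out->pos++] = val;
	return 0;
}

/*
 * Walk slots [start, end) with the lock held for the whole loop.
 * On error we break instead of returning, so the single unlock below
 * runs on every path before the error is propagated.
 */
int walk_locked(struct table *t, size_t start, size_t end,
		struct out_buf *out)
{
	int err = 0;
	size_t i;

	table_lock(t);
	for (i = start; i < end && i < NR_SLOTS; i++) {
		err = emit(out, t->slot[i]);
		if (err)
			break;
	}
	table_unlock(t);

	return err;
}
```

Calling walk_locked() with an output buffer shorter than the requested
range returns -1 after filling the buffer, and the lock is still released.
A "return err" inside the loop would have skipped the unlock, which is why
the patch turns it into a break.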

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 143674aef97c..a6b314919d9d 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1003,7 +1003,7 @@ static int pagemap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 	struct vm_area_struct *vma;
 	struct pagemapread *pm = walk->private;
 	spinlock_t *ptl;
-	pte_t *pte;
+	pte_t *pte, *orig_pte;
 	int err = 0;
 	/* find the first VMA at or above 'addr' */
@@ -1064,15 +1064,19 @@ static int pagemap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 		/* Addresses in the VMA. */
-		for (; addr < min(end, vma->vm_end); addr += PAGE_SIZE) {
+		orig_pte = pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+		for (; addr < min(end, vma->vm_end); pte++, addr += PAGE_SIZE) {
 			pagemap_entry_t pme;
-			pte = pte_offset_map(pmd, addr);
 			pte_to_pagemap_entry(&pme, pm, vma, addr, *pte);
-			pte_unmap(pte);
 			err = add_to_pagemap(addr, &pme, pm);
 			if (err)
-				return err;
+				break;
 		}
+		pte_unmap_unlock(orig_pte, ptl);
+		if (err)
+			return err;
 		if (addr == end)
