[PATCH 5/5] userfaultfd: shmem: UFFDIO_COPY: set the page dirty if VM_WRITE is not set

Tyler Hicks tyhicks at canonical.com
Fri Jan 25 02:01:26 UTC 2019


From: Andrea Arcangeli <aarcange at redhat.com>

Set the page dirty if VM_WRITE is not set, because in that case the pte
won't be marked dirty and the page would be reclaimed without writepage
(i.e. without swapout in the shmem case).
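
The failure can be sketched from the reclaim side. The snippet below is
a condensed paraphrase of the relevant branch (pageout() vs.
__remove_mapping() in mm/vmscan.c's shrink_page_list()); it is an
illustration only, not the literal kernel code:

	/*
	 * try_to_unmap() only transfers the pte dirty bit to the
	 * struct page; if the pte was never dirtied (no VM_WRITE, so
	 * no pte_mkdirty() at UFFDIO_COPY time), the page stays clean
	 * and the else branch below frees the UFFDIO_COPY'd data
	 * without ever reaching shmem_writepage().
	 */
	if (PageDirty(page)) {
		/* dirty shmem page: written to swap via writepage */
		pageout(page, mapping, sc);
	} else {
		/* "clean" page: dropped outright, contents lost */
		__remove_mapping(mapping, page, true);
	}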

This was found by source review.  Most apps (certainly including QEMU)
only use UFFDIO_COPY on PROT_READ|PROT_WRITE mappings, or else the app
couldn't modify the memory in the first place.  This is a correctness
fix, and it could help the non-cooperative use case avoid unexpected
data loss.
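
For concreteness, here is a minimal userspace sketch of the affected
scenario: UFFDIO_COPY pre-filling a missing page in a shmem mapping
that lacks PROT_WRITE.  It assumes a memfd-backed mapping, trims all
error handling, and is only an illustration of the case described
above, not the reproducer behind this report:

	#define _GNU_SOURCE
	#include <fcntl.h>
	#include <linux/userfaultfd.h>
	#include <string.h>
	#include <sys/ioctl.h>
	#include <sys/mman.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	int main(void)
	{
		long page = sysconf(_SC_PAGESIZE);

		/* shmem file; the O_RDWR fd keeps VM_MAYWRITE on the vma */
		int fd = memfd_create("uffd-shmem", 0);
		ftruncate(fd, page);

		/* read-only mapping: dst_vma->vm_flags has no VM_WRITE */
		char *dst = mmap(NULL, page, PROT_READ, MAP_SHARED, fd, 0);

		int uffd = syscall(__NR_userfaultfd, O_CLOEXEC);
		struct uffdio_api api = { .api = UFFD_API };
		ioctl(uffd, UFFDIO_API, &api);

		struct uffdio_register reg = {
			.range = { .start = (unsigned long)dst, .len = page },
			.mode = UFFDIO_REGISTER_MODE_MISSING,
		};
		ioctl(uffd, UFFDIO_REGISTER, &reg);

		/* source buffer with recognizable contents */
		char *src = mmap(NULL, page, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		memset(src, 0x5a, page);

		/* fill the missing page: this exercises the path patched below */
		struct uffdio_copy copy = {
			.dst = (unsigned long)dst,
			.src = (unsigned long)src,
			.len = page,
		};
		ioctl(uffd, UFFDIO_COPY, &copy);

		/*
		 * Without the fix, neither the pte nor the page is dirty
		 * here, so under memory pressure reclaim may drop the page
		 * without swapping it out and this read would see zeroes,
		 * not 0x5a.
		 */
		return dst[0] == 0x5a ? 0 : 1;
	}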

Link: http://lkml.kernel.org/r/20181126173452.26955-6-aarcange@redhat.com
Reviewed-by: Hugh Dickins <hughd at google.com>
Cc: stable at vger.kernel.org
Fixes: 4c27fe4c4c84 ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
Reported-by: Hugh Dickins <hughd at google.com>
Signed-off-by: Andrea Arcangeli <aarcange at redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert at redhat.com>
Cc: Jann Horn <jannh at google.com>
Cc: Mike Kravetz <mike.kravetz at oracle.com>
Cc: Mike Rapoport <rppt at linux.ibm.com>
Cc: Peter Xu <peterx at redhat.com>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>

CVE-2018-18397

(cherry picked from commit dcf7fe9d89763a28e0f43975b422ff141fe79e43)
Signed-off-by: Tyler Hicks <tyhicks at canonical.com>
---
 mm/shmem.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/mm/shmem.c b/mm/shmem.c
index fb19d2e2fa11..67b7fa08036a 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2346,6 +2346,16 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
 	_dst_pte = mk_pte(page, dst_vma->vm_page_prot);
 	if (dst_vma->vm_flags & VM_WRITE)
 		_dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte));
+	else {
+		/*
+		 * We don't set the pte dirty if the vma has no
+		 * VM_WRITE permission, so mark the page dirty or it
+		 * could be freed from under us. We could do it
+		 * unconditionally before unlock_page(), but doing it
+		 * only if VM_WRITE is not set is faster.
+		 */
+		set_page_dirty(page);
+	}
 
 	dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
 
@@ -2379,6 +2389,7 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
 	return ret;
 out_release_uncharge_unlock:
 	pte_unmap_unlock(dst_pte, ptl);
+	ClearPageDirty(page);
 	delete_from_page_cache(page);
 out_release_uncharge:
 	mem_cgroup_cancel_charge(page, memcg, false);
-- 
2.7.4



