[SRU][kinetic:master][focal:hwe-5.13-next/hwe-5.15-next][PATCH] UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files

Alexander Mikhalitsyn alexander.mikhalitsyn at virtuozzo.com
Wed Aug 3 12:19:10 UTC 2022


From: Alexander Mikhalitsyn <alexander at mihalicyn.com>

BugLink: https://bugs.launchpad.net/bugs/1967924
BugLink: https://bugs.launchpad.net/bugs/1857257

The hack was introduced in ("UBUNTU: SAUCE: overlayfs: allow with
shiftfs as underlay") and it broke checkpoint/restore of docker
contains:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1857257

The following script can be used to trigger the issue:
  #!/bin/bash

  cat > test.py << EOF
  import sys

  f = open("/proc/self/maps")

  for l in f.readlines():
    if "python" not in l:
      continue
    print(l)
    s = l.split()
    start, end = s[0].split("-")
    fname = s[-1]
    print(start, end, fname)
    break
  else:
    sys.exit(1)

  test_file1 = open(fname)
  test_file2 = open("/proc/self/map_files/%s-%s" % (start, end))

  fdinfo1 = open("/proc/self/fdinfo/%d" % test_file1.fileno()).read()
  fdinfo2 = open("/proc/self/fdinfo/%d" % test_file2.fileno()).read()

  if fdinfo1 != fdinfo2:
    print("FAIL")
    print(test_file1)
    print(fdinfo1)
    print(test_file2)
    print(fdinfo2)
    sys.exit(1)
  print("PASS")
  EOF
  sudo docker run -it --privileged --rm -v `pwd`:/mnt python python /mnt/test.py

Thanks to Andrei Vagin for the reproducer and investigation of this problem.

Cc: Andrei Vagin <avagin at gmail.com>
Cc: Adrian Reber <areber at redhat.com>
Cc: Christian Brauner <christian.brauner at ubuntu.com>
Cc: Stefan Bader <stefan.bader at canonical.com>
Cc: Kleber Sacilotto de Souza <kleber.souza at canonical.com>

Fixes: d24b8a5 ("UBUNTU: SAUCE: overlayfs: allow with shiftfs as underlay")
Signed-off-by: Alexander Mikhalitsyn <alexander at mihalicyn.com>
Acked-by: Seth Forshee <seth.forshee at canonical.com>
Acked-by: Stefan Bader <stefan.bader at canonical.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski at canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg at canonical.com>
[small refactoring to add the dependency of AUFS for vma->vm_prfile]
Signed-off-by: Andrea Righi <andrea.righi at canonical.com>
Signed-off-by: Paolo Pisati <paolo.pisati at canonical.com>

UBUNTU: SAUCE: overlayfs: prevent dereferencing struct file in ovl_vm_prfile_set()

BugLink: https://bugs.launchpad.net/bugs/1973620

With the following commit we re-introduced a SAUCE patch that has been
dropped starting with 5.13:

 37e9bac9203b ("UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files")

However the forward-ported patch introduced a potential NULL pointer
dereference bug:

BUG: kernel NULL pointer dereference, address: 0000000000000008
[  447.039738] #PF: supervisor read access in kernel mode
[  447.040369] #PF: error_code(0x0000) - not-present page
[  447.041002] PGD 0 P4D 0
[  447.041325] Oops: 0000 [#1] SMP NOPTI
[  447.041798] CPU: 0 PID: 73766 Comm: sudo Not tainted 5.15.0-28-generic #29~20.04.1-Ubuntu
[  447.042800] Hardware name: OpenStack Foundation OpenStack Nova, BIOS Ubuntu-1.8.2-1ubuntu1+esm1 04/01/2014
[  447.043979] RIP: 0010:aa_file_perm+0x3a/0x470
[  447.044565] Code: 54 53 48 83 ec 68 48 89 7d 80 89 4d 8c 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 48 63 05 01 0a 19 01 48 03 82 c0 00 00 00 <4c> 8b 68 08 f6 46 40 02 0f 85 d0 00 00 00 41 f6 45 40 02 0f 85 c5
[  447.046837] RSP: 0018:ffffaefe80a4bca8 EFLAGS: 00010246
[  447.047481] RAX: 0000000000000000 RBX: ffff96e4038abd01 RCX: 0000000000000004
[  447.048351] RDX: ffff96e4038abd00 RSI: ffff96e401215eb8 RDI: ffffffff9c22a2ac
[  447.049241] RBP: ffffaefe80a4bd38 R08: 0000000000000000 R09: 0000000000000000
[  447.050121] R10: 0000000000000000 R11: 0000000000000000 R12: ffff96e401215eb8
[  447.051040] R13: ffff96e4038abd00 R14: ffffffff9c22a2ac R15: 0000000000000004
[  447.051942] FS:  00007eff3c0f8c80(0000) GS:ffff96e45e400000(0000) knlGS:0000000000000000
[  447.052981] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  447.053696] CR2: 0000000000000008 CR3: 0000000002be2000 CR4: 00000000003506f0
[  447.054571] Call Trace:
[  447.054883]  <TASK>
[  447.055154]  ? unlock_page_memcg+0x2f/0x40
[  447.055668]  ? page_remove_rmap+0x4b/0x320
[  447.056180]  common_file_perm+0x72/0x170
[  447.056669]  apparmor_file_permission+0x1c/0x20
[  447.057237]  security_file_permission+0x30/0x1a0
[  447.057898]  rw_verify_area+0x35/0x60
[  447.058392]  vfs_read+0x6d/0x1a0
[  447.058842]  ksys_read+0xb1/0xe0
[  447.059276]  __x64_sys_read+0x1a/0x20
[  447.059732]  do_syscall_64+0x5c/0xc0
[  447.060183]  ? __set_current_blocked+0x3b/0x60
[  447.060738]  ? exit_to_user_mode_prepare+0x3d/0x1c0
[  447.061434]  ? syscall_exit_to_user_mode+0x27/0x50
[  447.062099]  ? do_syscall_64+0x69/0xc0
[  447.062603]  ? irqentry_exit_to_user_mode+0x9/0x20
[  447.063210]  ? irqentry_exit+0x19/0x30
[  447.063678]  ? exc_page_fault+0x89/0x160
[  447.064165]  ? asm_exc_page_fault+0x8/0x30
[  447.064675]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  447.065298] RIP: 0033:0x7eff3c2cb002

This panic happens only when AUFS is enabled (that is required to
"activates" this feature).

This bug happens because we don't need to decrement anymore the refcount
for the previous vm_file value in ovl_vm_prfile_set(). So make sure to
drop the offending fput() to prevent the kernel panic above.

Signed-off-by: Andrea Righi <andrea.righi at canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza at canonical.com>
Acked-by: Stefan Bader <stefan.bader at canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza at canonical.com>

UBUNTU: SAUCE: overlayfs: remove CONFIG_AUFS_FS dependency

BugLink: https://bugs.launchpad.net/bugs/1967924
BugLink: https://bugs.launchpad.net/bugs/1857257

BugLink: https://bugs.launchpad.net/bugs/1967924

Right now we have a fix:
b07bc17b ("UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files")
in master branch of Jammy kernel, but only formaly. Because these kernels compiled without
CONFIG_AUFS_FS set, so this fix just disabled. There is no need to make it dependent on
CONFIG_AUFS_FS option, because in all cases we have mm/prfile.c compiled-in.

Cc: Andrei Vagin <avagin at gmail.com>
Cc: Adrian Reber <areber at redhat.com>
Cc: Christian Brauner <christian at brauner.io>
Cc: Stefan Bader <stefan.bader at canonical.com>
Cc: Kleber Sacilotto de Souza <kleber.souza at canonical.com>
Cc: Andrea Righi <andrea.righi at canonical.com>

Fixes: b07bc17b ("UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files")

Fixups for ("UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files"):
- ("UBUNTU: SAUCE: overlayfs: remove CONFIG_AUFS_FS dependency")
- ("UBUNTU: SAUCE: overlayfs: prevent dereferencing struct file in ovl_vm_prfile_set()")
was squashed. It should make ubuntu kernel maintainers work a little bit easier :-)

I believe that we need to make something similar for jammy/linux too.

Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn at virtuozzo.com>
---
 fs/overlayfs/file.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index d0c96ca0202e..7024771dcae8 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -490,6 +490,32 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
 	return ret;
 }
 
+/*
+ * In map_files_get_link() (fs/proc/base.c)
+ * we need to determine correct path from overlayfs.
+ * But real_mount(realfile->f_path.mnt) may be not
+ * equal to real_mount(file->f_path.mnt). In such case
+ * fdinfo of the same file which was opened from
+ * /proc/<pid>/map_files/... and "usual" path
+ * will show different mnt_id.
+ *
+ * We solve issue like in aufs by using additional
+ * field on struct vm_area_struct called "vm_prfile"
+ * which is used only for fdinfo/"printing" needs.
+ *
+ * See also mm/prfile.c
+ */
+static void ovl_vm_prfile_set(struct vm_area_struct *vma,
+			      struct file *file)
+{
+	get_file(file);
+	vma->vm_prfile = file;
+#ifndef CONFIG_MMU
+	get_file(file);
+	vma->vm_region->vm_prfile = file;
+#endif
+}
+
 static int ovl_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	struct file *realfile = file->private_data;
@@ -507,6 +533,10 @@ static int ovl_mmap(struct file *file, struct vm_area_struct *vma)
 	old_cred = ovl_override_creds(file_inode(file)->i_sb);
 	ret = call_mmap(vma->vm_file, vma);
 	revert_creds(old_cred);
+
+	if (!ret)
+		ovl_vm_prfile_set(vma, file);
+
 	ovl_file_accessed(file);
 
 	return ret;
-- 
2.36.1




More information about the kernel-team mailing list