NACK/Cmnt: [SRU][kinetic:master][focal:hwe-5.13-next/hwe-5.15-next][PATCH] UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files

Stefan Bader stefan.bader at canonical.com
Thu Aug 4 07:53:35 UTC 2022


On 03.08.22 18:52, Alexander Mikhalitsyn wrote:
> On Wed, 3 Aug 2022 16:48:17 +0200
> Stefan Bader <stefan.bader at canonical.com> wrote:
> 
>> On 03.08.22 14:19, Alexander Mikhalitsyn wrote:
>>> From: Alexander Mikhalitsyn <alexander at mihalicyn.com>
>>>
>>> BugLink: https://bugs.launchpad.net/bugs/1967924
>>> BugLink: https://bugs.launchpad.net/bugs/1857257
>>>
>>> The hack was introduced in ("UBUNTU: SAUCE: overlayfs: allow with
>>> shiftfs as underlay") and it broke checkpoint/restore of docker
>>> contains:
>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1857257
>>>
>>> The following script can be used to trigger the issue:
>>>     #!/bin/bash
>>>
>>>     cat > test.py << EOF
>>>     import sys
>>>
>>>     f = open("/proc/self/maps")
>>>
>>>     for l in f.readlines():
>>>       if "python" not in l:
>>>         continue
>>>       print(l)
>>>       s = l.split()
>>>       start, end = s[0].split("-")
>>>       fname = s[-1]
>>>       print(start, end, fname)
>>>       break
>>>     else:
>>>       sys.exit(1)
>>>
>>>     test_file1 = open(fname)
>>>     test_file2 = open("/proc/self/map_files/%s-%s" % (start, end))
>>>
>>>     fdinfo1 = open("/proc/self/fdinfo/%d" % test_file1.fileno()).read()
>>>     fdinfo2 = open("/proc/self/fdinfo/%d" % test_file2.fileno()).read()
>>>
>>>     if fdinfo1 != fdinfo2:
>>>       print("FAIL")
>>>       print(test_file1)
>>>       print(fdinfo1)
>>>       print(test_file2)
>>>       print(fdinfo2)
>>>       sys.exit(1)
>>>     print("PASS")
>>>     EOF
>>>     sudo docker run -it --privileged --rm -v `pwd`:/mnt python python /mnt/test.py
>>>
>>> Thanks to Andrei Vagin for the reproducer and investigation of this problem.
>>>
>>> Cc: Andrei Vagin <avagin at gmail.com>
>>> Cc: Adrian Reber <areber at redhat.com>
>>> Cc: Christian Brauner <christian.brauner at ubuntu.com>
>>> Cc: Stefan Bader <stefan.bader at canonical.com>
>>> Cc: Kleber Sacilotto de Souza <kleber.souza at canonical.com>
>>>
>>> Fixes: d24b8a5 ("UBUNTU: SAUCE: overlayfs: allow with shiftfs as underlay")
>>> Signed-off-by: Alexander Mikhalitsyn <alexander at mihalicyn.com>
>>> Acked-by: Seth Forshee <seth.forshee at canonical.com>
>>> Acked-by: Stefan Bader <stefan.bader at canonical.com>
>>> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski at canonical.com>
>>> Signed-off-by: Kelsey Skunberg <kelsey.skunberg at canonical.com>
>>> [small refactoring to add the dependency of AUFS for vma->vm_prfile]
>>> Signed-off-by: Andrea Righi <andrea.righi at canonical.com>
>>> Signed-off-by: Paolo Pisati <paolo.pisati at canonical.com>
>>>
>>> UBUNTU: SAUCE: overlayfs: prevent dereferencing struct file in ovl_vm_prfile_set()
>>>
>>> BugLink: https://bugs.launchpad.net/bugs/1973620
>>>
>>> With the following commit we re-introduced a SAUCE patch that has been
>>> dropped starting with 5.13:
>>>
>>>    37e9bac9203b ("UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files")
>>>
>>> However the forward-ported patch introduced a potential NULL pointer
>>> dereference bug:
>>>
>>> BUG: kernel NULL pointer dereference, address: 0000000000000008
>>> [  447.039738] #PF: supervisor read access in kernel mode
>>> [  447.040369] #PF: error_code(0x0000) - not-present page
>>> [  447.041002] PGD 0 P4D 0
>>> [  447.041325] Oops: 0000 [#1] SMP NOPTI
>>> [  447.041798] CPU: 0 PID: 73766 Comm: sudo Not tainted 5.15.0-28-generic #29~20.04.1-Ubuntu
>>> [  447.042800] Hardware name: OpenStack Foundation OpenStack Nova, BIOS Ubuntu-1.8.2-1ubuntu1+esm1 04/01/2014
>>> [  447.043979] RIP: 0010:aa_file_perm+0x3a/0x470
>>> [  447.044565] Code: 54 53 48 83 ec 68 48 89 7d 80 89 4d 8c 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 48 63 05 01 0a 19 01 48 03 82 c0 00 00 00 <4c> 8b 68 08 f6 46 40 02 0f 85 d0 00 00 00 41 f6 45 40 02 0f 85 c5
>>> [  447.046837] RSP: 0018:ffffaefe80a4bca8 EFLAGS: 00010246
>>> [  447.047481] RAX: 0000000000000000 RBX: ffff96e4038abd01 RCX: 0000000000000004
>>> [  447.048351] RDX: ffff96e4038abd00 RSI: ffff96e401215eb8 RDI: ffffffff9c22a2ac
>>> [  447.049241] RBP: ffffaefe80a4bd38 R08: 0000000000000000 R09: 0000000000000000
>>> [  447.050121] R10: 0000000000000000 R11: 0000000000000000 R12: ffff96e401215eb8
>>> [  447.051040] R13: ffff96e4038abd00 R14: ffffffff9c22a2ac R15: 0000000000000004
>>> [  447.051942] FS:  00007eff3c0f8c80(0000) GS:ffff96e45e400000(0000) knlGS:0000000000000000
>>> [  447.052981] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [  447.053696] CR2: 0000000000000008 CR3: 0000000002be2000 CR4: 00000000003506f0
>>> [  447.054571] Call Trace:
>>> [  447.054883]  <TASK>
>>> [  447.055154]  ? unlock_page_memcg+0x2f/0x40
>>> [  447.055668]  ? page_remove_rmap+0x4b/0x320
>>> [  447.056180]  common_file_perm+0x72/0x170
>>> [  447.056669]  apparmor_file_permission+0x1c/0x20
>>> [  447.057237]  security_file_permission+0x30/0x1a0
>>> [  447.057898]  rw_verify_area+0x35/0x60
>>> [  447.058392]  vfs_read+0x6d/0x1a0
>>> [  447.058842]  ksys_read+0xb1/0xe0
>>> [  447.059276]  __x64_sys_read+0x1a/0x20
>>> [  447.059732]  do_syscall_64+0x5c/0xc0
>>> [  447.060183]  ? __set_current_blocked+0x3b/0x60
>>> [  447.060738]  ? exit_to_user_mode_prepare+0x3d/0x1c0
>>> [  447.061434]  ? syscall_exit_to_user_mode+0x27/0x50
>>> [  447.062099]  ? do_syscall_64+0x69/0xc0
>>> [  447.062603]  ? irqentry_exit_to_user_mode+0x9/0x20
>>> [  447.063210]  ? irqentry_exit+0x19/0x30
>>> [  447.063678]  ? exc_page_fault+0x89/0x160
>>> [  447.064165]  ? asm_exc_page_fault+0x8/0x30
>>> [  447.064675]  entry_SYSCALL_64_after_hwframe+0x44/0xae
>>> [  447.065298] RIP: 0033:0x7eff3c2cb002
>>>
>>> This panic happens only when AUFS is enabled (that is required to
>>> "activates" this feature).
>>>
>>> This bug happens because we don't need to decrement anymore the refcount
>>> for the previous vm_file value in ovl_vm_prfile_set(). So make sure to
>>> drop the offending fput() to prevent the kernel panic above.
>>>
>>> Signed-off-by: Andrea Righi <andrea.righi at canonical.com>
>>> Acked-by: Kleber Sacilotto de Souza <kleber.souza at canonical.com>
>>> Acked-by: Stefan Bader <stefan.bader at canonical.com>
>>> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza at canonical.com>
>>>
>>> UBUNTU: SAUCE: overlayfs: remove CONFIG_AUFS_FS dependency
>>>
>>> BugLink: https://bugs.launchpad.net/bugs/1967924
>>> BugLink: https://bugs.launchpad.net/bugs/1857257
>>>
>>> BugLink: https://bugs.launchpad.net/bugs/1967924
>>>
>>> Right now we have a fix:
>>> b07bc17b ("UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files")
>>> in master branch of Jammy kernel, but only formaly. Because these kernels compiled without
>>> CONFIG_AUFS_FS set, so this fix just disabled. There is no need to make it dependent on
>>> CONFIG_AUFS_FS option, because in all cases we have mm/prfile.c compiled-in.
>>>
>>> Cc: Andrei Vagin <avagin at gmail.com>
>>> Cc: Adrian Reber <areber at redhat.com>
>>> Cc: Christian Brauner <christian at brauner.io>
>>> Cc: Stefan Bader <stefan.bader at canonical.com>
>>> Cc: Kleber Sacilotto de Souza <kleber.souza at canonical.com>
>>> Cc: Andrea Righi <andrea.righi at canonical.com>
>>>
>>> Fixes: b07bc17b ("UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files")
>>>
>>> Fixups for ("UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files"):
>>> - ("UBUNTU: SAUCE: overlayfs: remove CONFIG_AUFS_FS dependency")
>>> - ("UBUNTU: SAUCE: overlayfs: prevent dereferencing struct file in ovl_vm_prfile_set()")
>>> was squashed. It should make ubuntu kernel maintainers work a little bit easier :-)
>>>
>>> I believe that we need to make something similar for jammy/linux too.
>>>
>>> Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn at virtuozzo.com>
>>
>> For one this commit message seems to get out of bounds and includes multiple
>> signed-off-by areas completely. I am getting lost there. Second hwe-5.13 is EOL
> 
> Got it, let's omit hwe-5.13.
> 
> Understand your point, it's because it's a squash of a few fixups. From my point of view
> we can just remove all commit descriptions from fixup patches and keep only the original one.
> 
>> and neither should hwe-5.15 get patched if something should be done directly to
>> jammy. And lastly master/-next are branch names or part of it and are irrelevant
>> for submissions. This should be kinetic/jammy nothing else.
> 
> Yep, I understand from your and Andrea explanations that hwe-5.15 branch is inherited from the jammy. But for some reason at this point of time hwe-5.15 branch doesn't contain
> ("UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files")
> patch at all. But jammy kernel (which is the parent for this branch AFAIU) has this patch!
> 
> I'm not sure how to handle it properly. All that I want is to be fully clear about which patches should be applied to which kernels to prevent any possible degradations or problems.
> 
> Sorry for inconvenience and wrong patch headers :)

No worries. Just things need to be arranged in a way that is sensible when 
looking at it tomorrow. I think all the pieces required technically are there 
and Andrea did volunteer to try getting this into the formal form we need. That 
would include all parts missing in jammy so hwe-5.15 gets fixed up naturally. IT 
could be a little time depending on other tasks.

Thanks,
Stefan
> 
> Regards,
> Alex
> 
>>
>> -Stefan
>>
>> PS: Not sure all on cc: really are interested in seeing this. And some never
>> will because the are no longer working where they used to work.
>>
>>> ---
>>>    fs/overlayfs/file.c | 30 ++++++++++++++++++++++++++++++
>>>    1 file changed, 30 insertions(+)
>>>
>>> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
>>> index d0c96ca0202e..7024771dcae8 100644
>>> --- a/fs/overlayfs/file.c
>>> +++ b/fs/overlayfs/file.c
>>> @@ -490,6 +490,32 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
>>>    	return ret;
>>>    }
>>>    
>>> +/*
>>> + * In map_files_get_link() (fs/proc/base.c)
>>> + * we need to determine correct path from overlayfs.
>>> + * But real_mount(realfile->f_path.mnt) may be not
>>> + * equal to real_mount(file->f_path.mnt). In such case
>>> + * fdinfo of the same file which was opened from
>>> + * /proc/<pid>/map_files/... and "usual" path
>>> + * will show different mnt_id.
>>> + *
>>> + * We solve issue like in aufs by using additional
>>> + * field on struct vm_area_struct called "vm_prfile"
>>> + * which is used only for fdinfo/"printing" needs.
>>> + *
>>> + * See also mm/prfile.c
>>> + */
>>> +static void ovl_vm_prfile_set(struct vm_area_struct *vma,
>>> +			      struct file *file)
>>> +{
>>> +	get_file(file);
>>> +	vma->vm_prfile = file;
>>> +#ifndef CONFIG_MMU
>>> +	get_file(file);
>>> +	vma->vm_region->vm_prfile = file;
>>> +#endif
>>> +}
>>> +
>>>    static int ovl_mmap(struct file *file, struct vm_area_struct *vma)
>>>    {
>>>    	struct file *realfile = file->private_data;
>>> @@ -507,6 +533,10 @@ static int ovl_mmap(struct file *file, struct vm_area_struct *vma)
>>>    	old_cred = ovl_override_creds(file_inode(file)->i_sb);
>>>    	ret = call_mmap(vma->vm_file, vma);
>>>    	revert_creds(old_cred);
>>> +
>>> +	if (!ret)
>>> +		ovl_vm_prfile_set(vma, file);
>>> +
>>>    	ovl_file_accessed(file);
>>>    
>>>    	return ret;
>>
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://lists.ubuntu.com/archives/kernel-team/attachments/20220804/a93427da/attachment-0001.sig>


More information about the kernel-team mailing list