[PATCH 0/2][SRU][UNSTABLE] UBUNTU: SAUCE: seccomp: backport SECCOMP_USER_NOTIF_FLAG_CONTINUE

Christian Brauner christian at brauner.io
Wed Oct 16 14:20:04 UTC 2019


From: Christian Brauner <christian.brauner at ubuntu.com>

Hey everyone,

BugLink: https://bugs.launchpad.net/bugs/1847744

Recently we landed seccomp support for SECCOMP_RET_USER_NOTIF (cf. [4])
which enables a process (watchee) to retrieve an fd for its seccomp
filter. This fd can then be handed to another (usually more privileged)
process (watcher). The watcher will then be able to receive seccomp
messages about the syscalls the watchee performs.

This feature is heavily used by LXD, but currently with limited
usability, which is why we urgently need this series.
For example, it is currently used to intercept mknod() syscalls in
unprivileged containers. The mknod() syscall can be easily filtered
based on dev_t. This allows us to only intercept a very specific subset
of mknod() syscalls. Furthermore, mknod() is not possible in user
namespaces toto coelo, so accidentally intercepting and denying syscalls
that are not in the whitelist is not a big deal. The watchee won't
notice a difference.
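
To illustrate why mknod() is easy to filter, here is a minimal sketch of a
classic BPF program that only raises a user notification when one syscall
is called with one specific device number. The helper name
build_dev_filter() and its parameters are made up for this example, and the
sketch makes several simplifying assumptions: it skips the architecture
check a real filter needs, it compares only the low 32 bits of the
argument, and it assumes a little-endian machine. It is not the filter LXD
installs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <linux/types.h>

/* Value from the upstream uapi header; only needed with pre-5.0 headers. */
#ifndef SECCOMP_RET_USER_NOTIF
#define SECCOMP_RET_USER_NOTIF 0x7fc00000U
#endif

/*
 * Hypothetical helper: write a 6-instruction filter into out[] that
 * notifies the watcher only if syscall number `nr` is called with
 * argument `arg_idx` equal to `dev`; everything else is allowed.
 * Returns the number of instructions written.
 */
static size_t build_dev_filter(struct sock_filter *out, __u32 nr,
			       unsigned int arg_idx, __u32 dev)
{
	/* Offset of the low 32 bits of args[arg_idx] (little-endian only). */
	__u32 arg_off = offsetof(struct seccomp_data, args) +
			arg_idx * sizeof(__u64);
	const struct sock_filter prog[] = {
		/* 0: load the syscall number */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, nr)),
		/* 1: not our syscall -> skip ahead to ALLOW */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, nr, 0, 3),
		/* 2: load the device number argument */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS, arg_off),
		/* 3: not the device we care about -> ALLOW */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, dev, 0, 1),
		/* 4: matching call: hand it to the watcher */
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_USER_NOTIF),
		/* 5: everything else runs normally */
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};

	assert(out != NULL && arg_idx < 6);
	memcpy(out, prog, sizeof(prog));
	return sizeof(prog) / sizeof(prog[0]);
}
```

For mknod(path, mode, dev) the device number is argument index 2. A filter
like this cannot be written for setxattr() or mount(), because classic BPF
only sees the pointer values, not the memory they point to.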

In contrast to mknod(), a lot of other syscalls we intercept (e.g.
setxattr(), and soon mount()) cannot be easily filtered like mknod()
because they have pointer arguments. Additionally, some of them might
actually succeed in user namespaces (e.g. setxattr() for all "user.*"
xattrs). Since we currently cannot tell seccomp to continue from a user
notifier we are stuck with performing all of the syscalls on behalf of
the container. This is a huge security liability since it is extremely
difficult to correctly assume all of the necessary privileges of the
calling task such that the syscall can be successfully emulated without
escaping other additional security restrictions (think missing CAP_MKNOD
for mknod(), or MS_NODEV on a filesystem etc.). This can
be solved by telling seccomp to resume the syscall.
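
As a rough sketch of what this looks like from the watcher's side, the two
helpers below fill in a struct seccomp_notif_resp either to let the kernel
continue the syscall or to fail it with an errno. The helper names are
invented for this example. The flag value matches the upstream uapi header
and is only defined here as a fallback for older headers. With this series
the kernel rejects a response that sets the flag together with a non-zero
error or val, so the "continue" helper leaves both at 0.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <linux/seccomp.h>
#include <linux/types.h>

/* Fallback for headers that predate this series. */
#ifndef SECCOMP_USER_NOTIF_FLAG_CONTINUE
#define SECCOMP_USER_NOTIF_FLAG_CONTINUE (1UL << 0)
#endif

/*
 * Hypothetical helper: build a response asking the kernel to resume
 * the intercepted syscall in the watchee's own context.
 */
static void fill_continue_resp(struct seccomp_notif_resp *resp, __u64 id)
{
	assert(resp != NULL);
	memset(resp, 0, sizeof(*resp));	/* error and val must stay 0 */
	resp->id = id;			/* id taken from the notification */
	resp->flags = SECCOMP_USER_NOTIF_FLAG_CONTINUE;
}

/*
 * Hypothetical helper: build a response that makes the syscall fail
 * in the watchee with the given (positive) errno value.
 */
static void fill_error_resp(struct seccomp_notif_resp *resp, __u64 id,
			    int err)
{
	assert(resp != NULL);
	memset(resp, 0, sizeof(*resp));
	resp->id = id;
	resp->error = -err;		/* the kernel expects a negative errno */
}
```

The watcher would then pass the filled-in struct to
ioctl(notify_fd, SECCOMP_IOCTL_NOTIF_SEND, &resp), where notify_fd stands
for the fd it received from the watchee.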

Until we have backported this patch we are blocked on intercepting the
mount() syscall. It would be excellent if we could backport this patch.

I've also backported the selftests since they are worth running!
Please note that these patches are up for the v5.5 merge window and will
not be carried as Ubuntu-specific patches indefinitely!

Thanks!
Christian

Christian Brauner (2):
  UBUNTU: SAUCE: seccomp: add SECCOMP_USER_NOTIF_FLAG_CONTINUE
  UBUNTU: SAUCE: seccomp: test SECCOMP_USER_NOTIF_FLAG_CONTINUE

 include/uapi/linux/seccomp.h                  |  29 +++++
 kernel/seccomp.c                              |  28 ++++-
 tools/testing/selftests/seccomp/seccomp_bpf.c | 107 ++++++++++++++++++
 3 files changed, 158 insertions(+), 6 deletions(-)

-- 
2.23.0



