LD_PRELOAD work in progress

Stéphane Graber stgraber at ubuntu.com
Wed Feb 25 15:51:17 UTC 2015

On Wed, Feb 25, 2015 at 10:07:30AM +0100, Martin Pitt wrote:
> Oliver Grawert [2015-02-25  9:54 +0100]:
> > we also aim to make it easy for vendors to port their tree themselves.
> > how realistic is it to offer an easy to apply patchset for all possible
> > base versions between 3.4 and today to bring the overlayfs bits into
> > such a kernel ?
> It's a tangent, but just for the record: snappy uses systemd, and
> systemd requires Linux >= 3.8.
> Martin

Yeah, and it does so for the same reason we're thinking of moving LXC's
base supported version to 3.8 for LXC 2.0. 3.8 is basically when
namespaces started to not suck. That is, it's where support for unshare
and setns was introduced for all namespaces.
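
To make that 3.8 floor concrete, here's a rough sketch of a version
check in Python. The helper names (parse_kernel_release, meets_minimum)
are made up for illustration and aren't taken from snappy, systemd or
LXC. Note that a version number is only a heuristic: a vendor kernel
with backports may report 3.4 and still carry newer features.

```python
import platform
import re

# Minimum version discussed in this thread (assumption: major.minor is enough).
MIN_KERNEL = (3, 8)


def parse_kernel_release(release):
    """Extract (major, minor) from a release string like '3.4.0-1-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def meets_minimum(release, minimum=MIN_KERNEL):
    """Return True if the release string is at least the minimum version."""
    return parse_kernel_release(release) >= minimum


def running_kernel_ok():
    """Check the kernel we are currently running on."""
    return meets_minimum(platform.release())
```

Calling running_kernel_ok() on a device tells you whether its reported
version clears the bar; it says nothing about what has been backported.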

So FUSE should work, though I expect it to be pretty terrible from a
performance point of view, leading to quite a bit of power drain. That's
because any VFS operation will move from a single context switch per
syscall to at least two and quite possibly 3-4 (due to FUSE internals
and whatever we need to do to know which file is the right one to open).

"Easy to apply" is pretty easy to do so long as people don't come from
a heavily customized kernel (say an Android kernel) and stick to
something close to upstream.

The problem on our side is "easy to maintain" as we presumably don't
want people building systems relying on a completely outdated kernel
that's full of security holes, so that'd essentially boil down to
maintaining every kernel from 3.4 to 4.0 and that seems pretty
unrealistic to me.

For a 3.4 kernel, I expect we at the very least need to backport:
 - Current AppArmor (which doesn't fit any definition of small)
 - Recent Seccomp that supports BPF filtering (introduced around 3.5)
 - setns/unshare namespace patches from 3.8 (so we can easily run things
   in a separate PID namespace, mount namespace and offer a custom view of
   the network if needed, say for a firewall/IDS/VPN/... kind of use case)
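
Since a backported kernel can't be judged by its version number alone,
one option is to probe for the features themselves. Below is a minimal
sketch of that idea, assuming the usual procfs/sysfs layout; the
function name and the keys of the returned dict are hypothetical. It
only shows what the kernel exposes: a "Seccomp:" line in
/proc/self/status doesn't prove BPF filter mode works, and AppArmor
being enabled says nothing about how current it is.

```python
import os


def _read_text(path):
    """Return the file's content, or an empty string if it can't be read."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


def probe_kernel_features(proc_root="/proc", sys_root="/sys"):
    """Report which of the features listed above the kernel appears to expose."""
    features = {}

    # /proc/self/ns has one entry per namespace type usable with setns().
    try:
        ns_entries = set(os.listdir(os.path.join(proc_root, "self", "ns")))
    except OSError:
        ns_entries = set()
    for name in ("pid", "mnt", "net", "user"):
        features["ns_" + name] = name in ns_entries

    # Kernels that report seccomp state have a "Seccomp:" status line.
    status = _read_text(os.path.join(proc_root, "self", "status"))
    features["seccomp_reported"] = any(
        line.startswith("Seccomp:") for line in status.splitlines()
    )

    # The AppArmor module exposes "Y" here when it is enabled.
    enabled = _read_text(
        os.path.join(sys_root, "module", "apparmor", "parameters", "enabled")
    )
    features["apparmor_enabled"] = enabled.strip() == "Y"
    return features
```

The roots are parameters only so the probe can be pointed at a test
tree; on a real device you'd call probe_kernel_features() as is.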

That's assuming we make things slow and use FUSE for the overlay,
otherwise 3.19 overlayfs should be backported (we need a very recent one
as we need support for multiple branches which was merged just a few
weeks ago).
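
For reference, "multiple branches" means stacking several read-only
lower directories in a single mount. Here's a small sketch that builds
the option string for such a mount, following the lowerdir=a:b syntax
from the kernel's overlayfs documentation. The helper name and the
paths in the usage note are hypothetical, and this only builds a
string; it doesn't mount anything.

```python
def overlay_mount_options(lower_dirs, upper_dir=None, work_dir=None):
    """Build the -o option string for a multi-layer overlay mount.

    lower_dirs is ordered top-most layer first, as overlayfs expects.
    """
    if not lower_dirs:
        raise ValueError("at least one lower directory is required")
    for path in lower_dirs:
        # ':' separates layers and ',' separates options, so neither can
        # appear unescaped in a path; this sketch simply rejects them.
        if ":" in path or "," in path:
            raise ValueError("unsupported character in path: %r" % path)

    options = ["lowerdir=" + ":".join(lower_dirs)]
    if upper_dir is not None:
        # A writable overlay needs a work directory alongside the upper one.
        if work_dir is None:
            raise ValueError("upperdir requires workdir")
        options.append("upperdir=" + upper_dir)
        options.append("workdir=" + work_dir)
    elif len(lower_dirs) < 2:
        raise ValueError("a read-only overlay needs at least two lower layers")
    return ",".join(options)
```

With two made-up layers, overlay_mount_options(["/app", "/base"],
"/rw", "/work") gives a string you'd pass as
"mount -t overlay overlay -o <options> /merged".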

And then we've mentioned using user namespaces down the line to add an
extra layer of security for applications which need to think they're
root or need to own some privileged resources but do not require real
root. To support that, we'd have to backport a few hundred patches from
the 3.13 kernel.
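
To illustrate the "thinks it's root but isn't" part: a user namespace
carries a uid map (readable at /proc/<pid>/uid_map) where each line is
"inside-id outside-id length". The sketch below just parses that format
and translates an id; the function names are made up and the sample
range in the usage note is only an example, not what snappy would use.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map content into (inside, outside, length) tuples."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside, outside, length = (int(field) for field in fields)
        ranges.append((inside, outside, length))
    return ranges


def map_to_host_id(ranges, inside_id):
    """Translate an id seen inside the namespace to the id the host sees.

    Returns None when the id isn't covered by any mapped range.
    """
    for inside, outside, length in ranges:
        if inside <= inside_id < inside + length:
            return outside + (inside_id - inside)
    return None
```

With a map of "0 100000 65536", uid 0 inside the namespace is uid
100000 on the host, so the application sees root while the host sees an
unprivileged user.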

We'd also need to review systemd and figure out what exactly it is it
needs from a 3.8 kernel. My suspicion is that it's mostly what I've
listed above already, but there may be a few more things that it needs
and should be included in that patchset.

I just want to make sure we won't be compromising the security of that
new platform because of Android's inability to keep up with currently
supported kernel versions.
It may be that the answer to that is to mark those devices as "tainted",
run them without the fancy new security features and fall back on slow
alternatives (like FUSE). That would also mean that any such "tainted"
device wouldn't be able to access paid content, as we can't possibly
make any security guarantee to the app author.

Stéphane Graber
Ubuntu developer