user namespace delta over 3.7

Colin Ian King colin.king at canonical.com
Mon Nov 19 14:49:48 UTC 2012


On 14/11/12 20:55, Serge Hallyn wrote:
> Quoting Tim Gardner (tim.gardner at canonical.com):
>> On 11/06/2012 09:36 AM, Serge Hallyn wrote:
>>> Hi,
>>>
>>> the core of user namespace code has landed upstream, however some more
>>> is needed to run full ubuntu containers in a user namespace.  Some of
>>> this will land in 3.8, but probably not all.  Eric's development tree
>>> is at http://git.kernel.org/?p=linux/kernel/git/ebiederm/user-namespace.git;a=summary
>>>
>>> I have pushed that tree on top of a recent raring tree at
>>> git://kernel.ubuntu.com/serge/quantal-userns.git in branch
>>> master.oct25.userns-v70.  It consists of 84 patches (including 5 just
>>> updating under debian/, one by me fix to account for ubuntu delta, and
>>> one not (yet) in Eric's tree to allow tmpfs mounts in a container),
>>> which I can git-email if desired.  The built kernel is in
>>> ppa:serge-hallyn/userns-natty and does allow me to boot a full ubuntu
>>> container in a user namespace - meaning every root owned process and
>>> file is actually owned by userid 100000 on the host and contained.
>>>
>>> I'm sending this now in the hopes that whatever bits don't land in
>>> 3.8 can be pushed onto the raring kernel.  Our goal this cycle is to
>>> support user namespaces, and next cycle to support completely
>>> unprivileged creation and starting of containers.
>>>
>>> -serge
>>>
>>
>> Serge - how about a pull request for a branch that has been rebased
>> on Raring master-next ? I took a quick stab at it and quickly ran
>> into uapi transition conflicts (I think).
>
> A successfully built kernel is at
> git://kernel.ubuntu.com/serge/quantal-userns.git (branch
> master-next.nov14.userns which should be the default).
>
> -serge
>

I've got some questions and/or observations about the following commits:

b3f4f523c8c20f2ca2ac031900f1a252d750ec1d
debian changes to build in ppa

	..this fiddles with skipabi and skipmodules to allow building in a
PPA; we should not pull that into the raring kernel.

1c428901dcae93832f13a01492539cb77fea6c85
net: Allow opening an af_unix socket

sock_open() has:
         /* file->f_flags??? */
         //file->f_flags = O_RDWR | (flags & O_NONBLOCK);

..the comment suggests we're not sure whether file->f_flags should be
set here, and that the assignment was added during development (for
testing?) and then commented out.  As it stands it's confusing and I'm
not sure what it is meant to do.  Should this be removed?


8b16d00119a210ad9ed6d62fd0addb37f23c683b
fuse: Teach fuse how to handle the pid namespace.

convert_fuse_file_lock() has:

	fl->fl_pid = pid_vnr(find_pid_ns(ffl->pid, fc->pid_ns));

It seems possible (but unlikely) for find_pid_ns() to return NULL,
which is then passed into pid_vnr(), which in turn passes NULL into
pid_nr_ns(), which returns 0.  Is zero a valid value for fl->fl_pid?
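
To illustrate the concern, here is a minimal user-space sketch.  All of
the names below (struct pid, lookup_pid(), pid_to_nr(),
convert_lock_pid()) are hypothetical stand-ins that only model the
behaviour described above; they are not the kernel API and this is not
a proposed patch.  It shows one way the failed lookup could be reported
explicitly rather than silently turning into pid 0:

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct pid. */
struct pid {
	int nr;
};

static struct pid known_pid = { 1234 };

/* Models find_pid_ns(): NULL when the pid is unknown in the namespace. */
static struct pid *lookup_pid(int nr)
{
	return nr == known_pid.nr ? &known_pid : NULL;
}

/* Models pid_vnr()/pid_nr_ns(): a NULL pid is mapped to 0. */
static int pid_to_nr(const struct pid *pid)
{
	return pid ? pid->nr : 0;
}

/*
 * Explicit variant: fail the conversion when the lookup fails, so the
 * caller never stores an ambiguous pid of 0.  Returns 0 on success and
 * -1 when the owner is unknown.
 */
static int convert_lock_pid(int wire_pid, int *out_pid)
{
	struct pid *pid = lookup_pid(wire_pid);

	if (!pid)
		return -1;
	*out_pid = pid_to_nr(pid);
	return 0;
}
```

Whether failing the conversion is the right policy, or whether 0 is an
acceptable "unknown owner" value, is exactly the open question.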


05ee1be568bf321167fe93219aaac029a49b3d3a
devpts: Remove the devpty cleanup special case.

in ptmx_open():

	/* Find the devpts instance we are working with */
	mnt = devpts_mntget(filp);

devpts_mntget() can return ERR_PTR(-ENODEV), so mnt probably needs
checking for this kind of unlikely failure case.
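
Something along these lines is what I have in mind.  This is only a
user-space sketch: ERR_PTR(), IS_ERR() and PTR_ERR() are re-implemented
here to mirror the kernel's error-pointer convention, and
fake_mntget() and open_ptmx_sketch() are hypothetical stand-ins for the
real lookup and for ptmx_open():

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* User-space re-implementations of the kernel error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error values occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for struct vfsmount. */
struct vfsmount {
	int id;
};

static struct vfsmount the_mount = { 1 };

/* Hypothetical stand-in for the lookup: may return ERR_PTR(-ENODEV). */
static struct vfsmount *fake_mntget(int have_instance)
{
	return have_instance ? &the_mount : ERR_PTR(-ENODEV);
}

/* Sketch of the caller: bail out before the mount is ever dereferenced. */
static long open_ptmx_sketch(int have_instance)
{
	struct vfsmount *mnt = fake_mntget(have_instance);

	if (IS_ERR(mnt))
		return PTR_ERR(mnt);
	return mnt->id;
}
```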


ebb713c0e5b2bbedcb4c94715a3973c0b084e723
devpts: Make the newinstance option historical

parse_mount_options():
	
case Opt_newinstance: this is now a historical mount option and
silently does nothing.  Perhaps we should print a warning or info
message so users know it is being ignored.  However, this is documented
in the changes to Documentation/filesystems/devpts.txt, so maybe that
is unnecessary.


7350816a4fea9a9368f94a81cd44c3eb31fbbf3d
net: Push capable(CAP_NET_ADMIN) into the rtnl methods

Comment in this patch:
     "Later patches will remove the extra capable calls from methods
     that are safe for unprivilged users."

..are these later patches in this patch set? If so, which ones are they?


375cb2956423a172e017c1bfbc0c32d96250f41f
net: Don't export sysctls to unprivileged users

__ip_vs_lblc_init():

the following change added a '\' which looks like a typo:

-                       kfree(ipvs->lblc_ctl_table);
+                       kfree(ipvs->lblc_ctl_table);\
                 return -ENOMEM;
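
For what it's worth, the stray backslash should be harmless to the
compiler: a backslash immediately followed by a newline is a line
splice, so the two statements are simply joined onto one logical line.
A minimal user-space sketch of the same shape (alloc_fail_path() is a
made-up name, and free() stands in for kfree()):

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Mirrors the shape of the error path in the diff.  The trailing
 * backslash splices the next line onto this one, so the compiler sees
 * "free(table); return -ENOMEM;" and the behaviour is unchanged.
 */
static int alloc_fail_path(void *table)
{
	if (table) {
		free(table);\
		return -ENOMEM;
	}
	return 0;
}
```

So it is cosmetic rather than a functional bug, but it is still worth
dropping.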


d4856b6128ec5735e654259407fd8573faec4b88
userns: Convert nfs and nfsd to use kuid/kgid where appropriate

find_guid():

When a gid is not found then a new one is added to the aces[] array. I 
don't see any bounds checking on this, so can this potentially fall off 
the end of the aces[] array at some point?
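
To make the question concrete, this is the kind of guard I would expect
to see if the array size is not already guaranteed by the caller.  It
is a user-space sketch only; struct ace_array, its cap field and
find_or_add_id() are hypothetical and do not reflect the real nfsd
structures:

```c
#include <stddef.h>

/* Hypothetical fixed-capacity array of ids. */
struct ace_array {
	size_t n;		/* entries in use */
	size_t cap;		/* entries available in ids[] */
	unsigned int ids[4];
};

/*
 * Return the index of id, appending it if it is not present.
 * Returns -1 instead of writing past the end when the array is full.
 */
static int find_or_add_id(struct ace_array *a, unsigned int id)
{
	size_t i;

	for (i = 0; i < a->n; i++)
		if (a->ids[i] == id)
			return (int)i;

	if (a->n >= a->cap)
		return -1;

	a->ids[a->n] = id;
	return (int)a->n++;
}
```

If the allocation is already sized for the worst case then no such
check is needed, but a comment saying so would help.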


Colin



	










