[APPLIED] [ebiederm at xmission.com: [REVIEW][PATCH 2/3] fork: Allow CLONE_PARENT after setns(CLONE_NEWPID)]
Andy Whitcroft
apw at canonical.com
Mon Dec 2 11:15:40 UTC 2013
On Fri, Nov 29, 2013 at 04:07:09PM -0600, Serge Hallyn wrote:
> Hi,
>
> this patch is in
> https://git.kernel.org/cgit/linux/kernel/git/ebiederm/user-namespace.git/log/?h=for-next
> (and I assume in linux-next). It's one of two patches needed in trusty
> for containers. This one enables lxc-attach to work in trusty.
>
> ----- Forwarded message from "Eric W. Biederman" <ebiederm at xmission.com> -----
>
> Date: Tue, 26 Nov 2013 16:16:57 -0800
> From: "Eric W. Biederman" <ebiederm at xmission.com>
> To: "Serge E. Hallyn" <serge at hallyn.com>
> Cc: Gao feng <gaofeng at cn.fujitsu.com>, Containers <containers at lists.linux-foundation.org>,
> linux-fsdevel at vger.kernel.org, Aditya Kali <adityakali at google.com>, Oleg Nesterov
> <oleg at redhat.com>, Andy Lutomirski <luto at amacapital.net>
> Subject: [REVIEW][PATCH 2/3] fork: Allow CLONE_PARENT after setns(CLONE_NEWPID)
>
>
> Serge Hallyn <serge.hallyn at ubuntu.com> writes:
> > Hi Oleg,
> >
> > commit 40a0d32d1eaffe6aac7324ca92604b6b3977eb0e :
> > "fork: unify and tighten up CLONE_NEWUSER/CLONE_NEWPID checks"
> > breaks lxc-attach in 3.12. That code forks a child which does
> > setns() and then does a clone(CLONE_PARENT). That way the
> > grandchild can be in the right namespaces (which the child was
> > not) and be a child of the original task, which is the monitor.
> >
> > lxc-attach in 3.11 was working fine with no side effects that I
> > could see. Is there a real danger in allowing CLONE_PARENT
> > when current->nsproxy->pidns_for_children is not our pidns,
> > or was this done out of an "over-abundance of caution"? Can we
> > safely revert that new extra check?
>
> The two fundamental things I know we can not allow are:
> - A shared signal queue aka CLONE_THREAD. Because we compute the pid
> and uid of the signal when we place it in the queue.
>
> - Changing the pid and by extention pid_namespace of an existing
> process.
>
> >From a parents perspective there is nothing special about the pid
> namespace, to deny CLONE_PARENT, because the parent simply won't know or
> care.
>
> >From the childs perspective all that is special really are shared signal
> queues.
>
> User mode threading with CLONE_PARENT|CLONE_VM|CLONE_SIGHAND and tasks
> in different pid namespaces is almost certainly going to break because
> it is complicated. But shared signal handlers can look at per thread
> information to know which pid namespace a process is in, so I don't know
> of any reason not to support CLONE_PARENT|CLONE_VM|CLONE_SIGHAND threads
> at the kernel level. It would be absolutely stupid to implement but
> that is a different thing.
>
> So hmm.
>
> Because it can do no harm, and because it is a regression let's remove
> the CLONE_PARENT check and send it stable.
>
> Cc: stable at vger.kernel.org
> Acked-by: Oleg Nesterov <oleg at redhat.com>
> Acked-by: Andy Lutomirski <luto at amacapital.net>
> Acked-by: Serge E. Hallyn <serge.hallyn at ubuntu.com>
> Signed-off-by: "Eric W. Biederman" <ebiederm at xmission.com>
> ---
> kernel/fork.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 728d5be9548c..f82fa2ee7581 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1171,7 +1171,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> * do not allow it to share a thread group or signal handlers or
> * parent with the forking task.
> */
> - if (clone_flags & (CLONE_SIGHAND | CLONE_PARENT)) {
> + if (clone_flags & CLONE_SIGHAND) {
> if ((clone_flags & (CLONE_NEWUSER | CLONE_NEWPID)) ||
> (task_active_pid_ns(current) !=
> current->nsproxy->pid_ns_for_children))
> --
This patch is the upstream fix for which we are already carrying a
revert. I will unrevert that and replace same with this patch.
Applied to Trusty unstable.
-apw
More information about the kernel-team
mailing list