[RFC] [PATCH] notify init daemon when children are reparented to it

Scott James Remnant scott at canonical.com
Wed Dec 17 13:32:22 UTC 2008

On Tue, 2008-12-16 at 14:52 -0800, Jim Lieb wrote:

> On Tuesday 16 December 2008 14:09:10 Scott James Remnant wrote:
> > Agree, but I found that I had to do it as a signal to get the guaranteed
> > behaviour mix with SIGCHLD (see my reply to Andy).  Doing it in a
> > different band meant there were fundamental race conditions between
> > delivery of the adoption notification, and delivery of the child death
> > notification.
> add an event for the child death too.  parent died -> child "adopted".
> Even with out of order (which I doubt here) you have ppid and pid to sort
> it out.

We need two basic events from the kernel.

 1) an event to say that a child process of ours has died, which tells
    us we need to loop on waitid().  We have this today with SIGCHLD.

 2) an event to say that we've adopted a new child process (we know
    about children we spawned ourselves)
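
As a minimal sketch of the first event's handling, assuming init simply
drains every exited child each time SIGCHLD arrives (the helper name
reap_children() is illustrative, not taken from the patch or from
upstart):

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reap every child that has already exited and return how many were
 * reaped.  Meant to run after SIGCHLD: one signal may stand for several
 * child deaths, so we loop until nothing more is waiting. */
static int reap_children(void)
{
    int reaped = 0;

    for (;;) {
        siginfo_t info;

        info.si_pid = 0;  /* stays 0 if WNOHANG finds nothing to reap */
        if (waitid(P_ALL, 0, &info, WEXITED | WNOHANG) < 0) {
            if (errno == EINTR)
                continue;
            break;        /* typically ECHILD: no children left at all */
        }
        if (info.si_pid == 0)
            break;        /* children exist, but none has exited yet */
        reaped++;         /* info.si_pid / info.si_status describe it */
    }
    return reaped;
}
```

The loop matters because a single SIGCHLD does not promise a single
dead child; the signal only says "go and look".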

The reason for implementing the second as a signal in the patch was
because we already have the first as a signal; we need both to be
delivered by the same mechanism to avoid races.

If we had the died event as a signal, but the adopted event over a
socket, the timing differences between the two could mean that there's
no way to guarantee that if we have one we will also have the other.

And that's important.

> >
> > > 3. Hijacking an RT signal interferes with glibc.  The pthreads code has
> > >    already done so btw.
> >
> > glibc already takes care of incrementing SIGRTMIN appropriately; one of
> > the main reasons that the prctl() takes a signal number, rather than
> > hardcoding it in the kernel.
> yea, but that is a compile time for the app event, not the kernel.  This is
> gcc+glibc+pthreads voodoo.

I'm not sure what you mean here, or whether you're sure what I mean ;)

The kernel defines fewer than 32 standard signals of its own.  It also
permits at least a further 8 "real-time" signals.  The range of
real-time signals is that between the SIGRTMIN and SIGRTMAX macros, and
we actually have 32 of them on Linux.

glibc does eat a couple of these for its threading handling.  However it
also redefines SIGRTMIN to be the first real-time signal *not* used by
glibc.

Thus for an application, using SIGRTMIN..SIGRTMAX is entirely safe.
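
A small sketch of what that means for application code, assuming glibc
on Linux, where SIGRTMIN is evaluated when the program runs rather than
being a fixed number baked in at build time.  The helper name
pick_rt_signal() is illustrative only:

```c
#include <assert.h>
#include <signal.h>

/* Return the real-time signal `offset` places above SIGRTMIN, or -1 if
 * that would fall outside the range left to the application.  The
 * signal is always expressed relative to SIGRTMIN, never as a literal
 * number, so whatever the C library reserves for itself is skipped. */
static int pick_rt_signal(int offset)
{
    int lo = SIGRTMIN;  /* first real-time signal free for our use */
    int hi = SIGRTMAX;

    if (offset < 0 || offset > hi - lo)
        return -1;
    return lo + offset;
}
```

pick_rt_signal(0) is simply SIGRTMIN; an offset beyond SIGRTMAX is
refused rather than silently wrapping onto some other signal.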

I don't hardcode or assume any particular signal in the patch; instead
it's passed in the prctl() to the kernel, thus init can have the code:


And the signal used will be the first real-time signal not used by
glibc.
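
As a rough sketch of the shape such a request might take: the option
name and number below are made-up placeholders, not the ones from the
patch, so a kernel without the patch is expected to reject the call.
Only the pattern of handing SIGRTMIN to prctl() at run time is the
point here.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/prctl.h>

/* HYPOTHETICAL placeholder: not a real prctl option, and not the name
 * or value used by the patch under discussion. */
#define PR_SET_ADOPT_SIGNAL_HYPOTHETICAL 0x41445054

/* Ask the kernel to send us a signal when we adopt a child.  Returns
 * the signal number on success, or a negative errno value on failure
 * (an unpatched kernel does not know the option). */
static int request_adoption_signal(void)
{
    int signo = SIGRTMIN;  /* resolved at run time, past glibc's own */

    if (prctl(PR_SET_ADOPT_SIGNAL_HYPOTHETICAL,
              (unsigned long)signo, 0UL, 0UL, 0UL) < 0)
        return -errno;
    return signo;
}
```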

> > > Use netlink to send upstart messages on any transition you want.
> > > You can now stuff anything you want including the whole fork/exec/setsid
> > > chain.  Netlink has the advantage of asynchronous notify but synchronous
> > > reception.  You also don't lose events where you could with back-to-back
> > > signals (unless you constructed some threadsafe queue).  Netlink may be
> > > out of fashion with some folks but it would not be as bad as an RT signal
> > > overlay and task_struct bloat.  You also have the advantage of either
> > > restricting the netlink to only send to pid 1 or allow other procs to
> > > listen in as well for debugging/auditing.
> >
> > The primary issue here is that the netlink socket would have to be used
> > for SIGCHLD as well; which makes semantics mixed with the delivery of
> > SIGCHLD (and behaviour when not delivered), and when exactly a process
> > is reaped.
> See above.  the netlink packet is not limited to a few fields like siginfo.
> Both events occur when parent dies anyway.  In fact, you could dispense
> with the SIGCHLD signal overhead by dumping its info into the netlink
> payload as well (if you keep things like rusage bits).
> >
> > Netlink can also overflow, which means important events get lost.  At
> > least signals can't do that.
> The RT signal queue has limits to.  the netlink runs out of skb's which is a
> bad thing for everybody.  udev is already doing this ok btw.  Also, you
> can trim traffic by disabling it for things you aren't interested in such as
> gdm/kdm and children, one shot services etc.  You are still in upstart
> code (pre-exec) when you decide this so this is easy.

My concern with netlink, and any other fd-based protocol, is that we'd
end up with non-reliable delivery of SIGCHLD (which would need to be
passed over the same protocol).

udev at least has the ability to re-request the netlink events, because
they match data in /sys -- we wouldn't have the same ability here.

(It only occurred to me last night *why* the standard signals have the
behaviour they do - it means they are absolutely reliable, since you can
pre-reserve the space in memory needed to mark them pending).
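
A small sketch of that behaviour, assuming Linux and a single-threaded
process: a standard signal sent several times while blocked is only
marked pending once, whereas a real-time signal is queued once per
send (and that queue is what can hit a limit).  The helper names are
illustrative.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

static volatile sig_atomic_t deliveries;

static void count_delivery(int signo)
{
    (void)signo;
    deliveries++;
}

/* Block signo, send it to ourselves `times` times, unblock it, and
 * return how many times the handler actually ran. */
static int deliveries_after_burst(int signo, int times)
{
    struct sigaction sa;
    sigset_t set;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_delivery;
    sigemptyset(&sa.sa_mask);
    sigaction(signo, &sa, NULL);

    sigemptyset(&set);
    sigaddset(&set, signo);
    sigprocmask(SIG_BLOCK, &set, NULL);

    deliveries = 0;
    for (int i = 0; i < times; i++)
        kill(getpid(), signo);  /* stays pending while blocked */

    sigprocmask(SIG_UNBLOCK, &set, NULL);  /* pending ones delivered */
    return deliveries;
}
```

Under those assumptions deliveries_after_burst(SIGUSR1, 3) comes back
as 1 and deliveries_after_burst(SIGRTMIN, 3) as 3.  The single pending
bit is why a standard signal can never be lost for lack of memory, and
also why the SIGCHLD handler has to loop on waitid() rather than
assume one death per signal.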

Scott James Remnant
scott at canonical.com