[RFC] [PATCH] notify init daemon when children are reparented to it
Scott James Remnant
scott at canonical.com
Wed Dec 17 13:32:22 UTC 2008
On Tue, 2008-12-16 at 14:52 -0800, Jim Lieb wrote:
> On Tuesday 16 December 2008 14:09:10 Scott James Remnant wrote:
> > Agree, but I found that I had to do it as a signal to get the guaranteed
> > behaviour mix with SIGCHLD (see my reply to Andy). Doing it in a
> > different band meant there were fundamental race conditions between
> > delivery of the adoption notification, and delivery of the child death
> > notification.
> add an event for the child death too. parent died -> child "adopted".
> Even with out of order (which I doubt here) you have ppid and pid to sort
> it out.
>
Right.
We need two basic events from the kernel.
1) an event to say that a child process of ours has died, which tells
us we need to loop on waitid(). We have this today with SIGCHLD.
2) an event to say that we've adopted a new child process (we know
about children we spawned ourselves)
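To make the first event concrete, here is a minimal sketch of the kind of loop I mean when I say SIGCHLD "tells us we need to loop on waitid()". The function name reap_children() is made up for illustration and is not taken from upstart or the patch. Because several child deaths can be folded into a single SIGCHLD, the loop keeps going until nothing more is ready.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: reap every child that has already exited,
 * without blocking.  Returns the number of children reaped. */
static int reap_children(void)
{
    int reaped = 0;

    for (;;) {
        siginfo_t info;

        /* With WNOHANG, waitid() returns 0 both when it reaped a child
         * and when no child was ready; zeroing si_pid first is the
         * portable way to tell the two cases apart. */
        info.si_pid = 0;

        if (waitid(P_ALL, 0, &info, WEXITED | WNOHANG) < 0) {
            if (errno == EINTR)
                continue;
            break;              /* typically ECHILD: no children at all */
        }
        if (info.si_pid == 0)
            break;              /* children exist, none has exited yet */

        assert(info.si_pid > 0);
        reaped++;
    }
    return reaped;
}
```

An init daemon would call something like this each time it notices SIGCHLD, rather than assuming one signal means one dead child.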
The reason for implementing the second as a signal in the patch was
because we already have the first as a signal; we need both to be
delivered by the same mechanism to avoid races.
If we had the died event as a signal, but the adopted event over a
socket, the timing differences between the two could mean that there's
no way to guarantee that if we have one we will also have the other
pending.
And that's important.
> >
> > > 3. Hijacking an RT signal interferes with glibc. The pthreads code has
> > > already done so btw.
> >
> > glibc already takes care of incrementing SIGRTMIN appropriately; one of
> > the main reasons that the prctl() takes a signal number, rather than
> > hardcoding it in the kernel.
> yea, but that is a compile time for the app event, not the kernel. This is
> gcc+glibc+pthreads voodoo.
>
I'm not sure what you mean here, or whether you're sure what I mean ;)
The kernel defines fewer than 32 standard signals of its own. POSIX also
requires at least a further 8 "real-time" signals. The range of real-time
signals is that between the SIGRTMIN and SIGRTMAX macros, and we actually
have 32 of them on Linux.
glibc does eat a couple of these for its threading handling. However it
also redefines SIGRTMIN to be the first real-time signal *not* used by
glibc.
Thus for an application, using SIGRTMIN..SIGRTMAX is entirely safe.
I don't hardcode or assume any particular signal in the patch; instead it's
passed in the prctl() to the kernel, so init can have the code:
prctl (PR_SET_ADOPTSIG, SIGRTMIN);
And the signal used will be the first real-time signal not used by
glibc.
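As a slightly fuller sketch of that call: in glibc, SIGRTMIN expands to a function call rather than a compile-time constant, so the value is picked up at run time. PR_SET_ADOPTSIG exists only in the patch under discussion and not in mainline headers, so the prctl() is guarded here. The name request_adoption_signal() is made up for illustration.

```c
#include <assert.h>
#include <signal.h>
#include <sys/prctl.h>

/* Hypothetical helper: pick the first real-time signal that glibc has
 * not kept for itself and, if the patched kernel headers are present,
 * ask the kernel to use it for adoption notifications.
 * Returns the signal number, or -1 if the prctl() fails. */
static int request_adoption_signal(void)
{
    int signum = SIGRTMIN;      /* evaluated at run time under glibc */

    assert(signum <= SIGRTMAX);

#ifdef PR_SET_ADOPTSIG          /* only defined by the proposed patch */
    if (prctl(PR_SET_ADOPTSIG, signum) < 0)
        return -1;
#endif
    return signum;
}
```

On an unpatched system this only reports which signal number would be used; the request to the kernel is compiled out.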
> > > Use netlink to send upstart messages on any transition you want.
> > > You can now stuff anything you want including the whole fork/exec/setsid
> > > chain. Netlink has the advantage of asynchronous notify but synchronous
> > > reception. You also don't lose events where you could with back-to-back
> > > signals (unless you constructed some threadsafe queue). Netlink may be
> > > out of fashion with some folks but it would not be as bad as an RT signal
> > > overlay and task_struct bloat. You also have the advantage of either
> > > restricting the netlink to only send to pid 1 or allow other procs to
> > > listen in as well for debugging/auditing.
> >
> > The primary issue here is that the netlink socket would have to be used
> > for SIGCHLD as well; which makes semantics mixed with the delivery of
> > SIGCHLD (and behaviour when not delivered), and when exactly a process
> > is reaped.
> See above. the netlink packet is not limited to a few fields like siginfo.
> Both events occur when parent dies anyway. In fact, you could dispense
> with the SIGCHLD signal overhead by dumping its info into the netlink
> payload as well (if you keep things like rusage bits).
> >
> > Netlink can also overflow, which means important events get lost. At
> > least signals can't do that.
> The RT signal queue has limits too. The netlink runs out of skb's, which is a
> bad thing for everybody. udev is already doing this ok btw. Also, you
> can trim traffic by disabling it for things you aren't interested in such as
> gdm/kdm and children, one shot services etc. You are still in upstart
> code (pre-exec) when you decide this so this is easy.
>
My concern with netlink, and any other fd-based protocol, is that we'd
end up with non-reliable delivery of SIGCHLD (which would need to be
passed over the same protocol).
udev at least has the ability to re-request the netlink events, because
they match data in /sys -- we wouldn't have the same ability here.
(It only occurred to me last night *why* the standard signals have the
behaviour they do - it means they are absolutely reliable, since you can
pre-reserve the space in memory needed to mark them pending).
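A small sketch of what that behaviour looks like from userspace, assuming Linux and a single-threaded process. A standard signal raised several times while blocked is only marked pending once, whereas a real-time signal is queued per instance (and that queue is the part that has a limit). The helper name deliveries_after_burst() is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t deliveries;

static void count_delivery(int signum)
{
    (void)signum;
    deliveries++;
}

/* Hypothetical helper: raise `signum` `times` times while it is blocked,
 * then unblock it and return how many times the handler ran
 * (or -1 if setting things up failed). */
static int deliveries_after_burst(int signum, int times)
{
    struct sigaction sa;
    sigset_t block, old;

    assert(times > 0);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_delivery;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signum, &sa, NULL) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, signum);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    deliveries = 0;
    for (int i = 0; i < times; i++)
        raise(signum);          /* stays pending while blocked */

    /* Restoring the old mask lets the pending signal(s) through. */
    if (sigprocmask(SIG_SETMASK, &old, NULL) != 0)
        return -1;

    return deliveries;
}
```

Calling this with SIGUSR1 and a burst of three should see the handler run once, while the same burst of SIGRTMIN should see it run three times. That single pending mark per standard signal is why a SIGCHLD handler has to loop rather than count signals.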
Scott
--
Scott James Remnant
scott at canonical.com