[RFC] [PATCH] notify init daemon when children are reparented to it

Jim Lieb jim.lieb at canonical.com
Tue Dec 16 22:52:17 UTC 2008


On Tuesday 16 December 2008 14:09:10 Scott James Remnant wrote:
> On Tue, 2008-12-16 at 12:38 -0800, Jim Lieb wrote:
> > This is a good idea as we talked about it at UDS.  I have a few
> > problems with the implementation:
>
> No worries, this again is exactly the kind of feedback I'm after.
>
> > 1. Adding anything to task_struct consumes memory given the number of
> >   these things floating around in a system, especially since this
> > overhead only applies to a small number of procs.
>
> Yeah, it ends up only really applying to the single init daemon
> process. :-/
>
> > 2. Signals are ugly at many levels.  The handler is heavy, asynchronous,
> >   and information content free.  Overlaying stuff on siginfo is limited.
>
> Agree, but I found that I had to do it as a signal to get the guaranteed
> behaviour mix with SIGCHLD (see my reply to Andy).  Doing it in a
> different band meant there were fundamental race conditions between
> delivery of the adoption notification, and delivery of the child death
> notification.
Add an event for the child death too: parent died -> child "adopted".
Even with out-of-order delivery (which I doubt here) you have ppid and
pid to sort it out.
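
Roughly what I mean by sorting it out, as a sketch only.  The event
layout, the names (init_event, handle_event, track, job_of) and the job
table are all made up for illustration; nothing here is in the patch.
It assumes the adoption event carries the *old* parent in ppid, and it
ignores pid reuse.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical event types and layout; not an existing kernel ABI. */
enum ev_type { EV_EXIT, EV_ADOPTED };

struct init_event {
    enum ev_type type;
    pid_t pid;   /* process the event is about */
    pid_t ppid;  /* its parent (for EV_ADOPTED: the parent that died) */
};

#define MAX_PROCS 64

struct tracked { pid_t pid; int job; int alive; };
struct table { struct tracked procs[MAX_PROCS]; size_t n; };

static struct tracked *find(struct table *t, pid_t pid)
{
    for (size_t i = 0; i < t->n; i++)
        if (t->procs[i].pid == pid)
            return &t->procs[i];
    return NULL;
}

/* Start tracking pid as part of job; returns 0, or -1 if the table is full. */
int track(struct table *t, pid_t pid, int job)
{
    assert(t != NULL);
    if (t->n == MAX_PROCS)
        return -1;
    t->procs[t->n++] = (struct tracked){ .pid = pid, .job = job, .alive = 1 };
    return 0;
}

/* Job a pid belongs to, or -1 if we never heard of it. */
int job_of(struct table *t, pid_t pid)
{
    struct tracked *p = find(t, pid);
    return p ? p->job : -1;
}

/* Returns the job the event was attributed to, or -1 if unknown. */
int handle_event(struct table *t, const struct init_event *ev)
{
    struct tracked *p;

    switch (ev->type) {
    case EV_EXIT:
        p = find(t, ev->pid);
        if (!p)
            return -1;
        /* Keep the entry as a tombstone so that an adoption event
         * arriving after the exit can still find the old parent. */
        p->alive = 0;
        return p->job;
    case EV_ADOPTED:
        p = find(t, ev->ppid);  /* look up the parent that died */
        if (!p || track(t, ev->pid, p->job) < 0)
            return -1;
        return job_of(t, ev->pid);
    }
    return -1;
}
```

Either order of the two events leaves the orphan attached to the same
job, because the dead parent's entry is only marked, not dropped.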
>
> > 3. Hijacking an RT signal interferes with glibc.  The pthreads code has
> >    already done so btw.
>
> glibc already takes care of incrementing SIGRTMIN appropriately; one of
> the main reasons that the prctl() takes a signal number, rather than
> hardcoding it in the kernel.
Yeah, but that happens when the app is compiled, not in the kernel.  This
is gcc+glibc+pthreads voodoo.
>
> > 4. You only get one event.
>
> How do you mean?
The patch slams the door behind it, preventing other, later events which
might be interesting given the goofiness of so many daemon services.
>
> > 5. You can't tell if the kernel has the patch or not.
>
> The prctl() will return EINVAL if not patched.
>
> > Use netlink to send upstart messages on any transition you want.
> > You can now stuff anything you want including the whole fork/exec/setsid
> > chain.  Netlink has the advantage of asynchronous notify but synchronous
> > reception.  You also don't lose events where you could with back-to-back
> > signals (unless you constructed some threadsafe queue).  Netlink may be
> > out of fashion with some folks but it would not be as bad as an RT signal
> > overlay and task_struct bloat.  You also have the advantage of either
> > restricting the netlink to only send to pid 1 or allow other procs to
> > listen in as well for debugging/auditing.
>
> The primary issue here is that the netlink socket would have to be used
> for SIGCHLD as well; which makes semantics mixed with the delivery of
> SIGCHLD (and behaviour when not delivered), and when exactly a process
> is reaped.
See above.  The netlink packet is not limited to a few fields like siginfo.
Both events occur when parent dies anyway.  In fact, you could dispense
with the SIGCHLD signal overhead by dumping its info into the netlink
payload as well (if you keep things like rusage bits).
>
> Netlink can also overflow, which means important events get lost.  At
> least signals can't do that.
The RT signal queue has limits too.  Netlink runs out of skbs, which is a
bad thing for everybody.  udev is already doing this OK, btw.  Also, you
can trim traffic by disabling it for things you aren't interested in, such
as gdm/kdm and children, one-shot services, etc.  You are still in upstart
code (pre-exec) when you decide this, so this is easy.
>
> Also adding yet another delivery mechanism for this kind of thing seemed
> like a very large kernel patch, and more difficult to get upstream?
I would guess cleaner, because the task_struct change doesn't trigger
the bloat police, and if the netlink code is in a callable function, the
patches into your notification points become small and digestible.
There are no ABI changes, unlike the new args for prctl (an ioctl for
procs...), to piss off the glibc people.  Yeah, the /proc attribute code is
on sacred ground, but it *is* for a proc entry, so that is easier to do as
well.
>
> > As for detecting whether the kernel is patched or not, simply open/read
> > /proc/self/init_watch on yourself.  If open returns ENOENT, the patch is
> > not in this kernel and you fall back to ptrace.  If it is there and != 0,
> > you are golden.  You could even have the default == 0 and have upstart
> > set it if it is new enough to know about netlink.  Otherwise, there is no
> > overhead at all, e.g. "if(unlikely(current->init_watch) ... "
>
> Actually, with Upstart I'd simply exit() with an error.  I decided that
> it was acceptable for Upstart to require the latest kernel, glibc, etc.
> on the basis that it's packaged by distributors who know what minimum
> versions they'll have.
>
> (And if you're playing with Upstart, you're inherently doing things with
> udev, devicekit, etc. that also require the latest versions anyway.)
>
> Scott
Yeah, but going splat up against the wall is bad form, especially for
/sbin/init.  Much nicer to syslog an "Upgrade your kernel and I'll be nicer
to your system" reminder while still running just fine.  You have the
ptrace hook now, which is backward compatible with old kernels and makes
upstream and other distros happier.
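
Something along these lines, as a rough sketch.  /proc/self/init_watch
is only the attribute proposed earlier in this thread, so the path is
passed in rather than assumed to exist, and the enum and function names
are invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <syslog.h>

/* Hypothetical names, for illustration only. */
enum watch_state { WATCH_ABSENT, WATCH_DISABLED, WATCH_ENABLED, WATCH_ERROR };
enum backend { BACKEND_PTRACE, BACKEND_NETLINK };

/* Probe the proposed attribute: ENOENT means the kernel is not patched,
 * otherwise the file is expected to hold a single integer. */
enum watch_state probe_init_watch(const char *path)
{
    char buf[32];
    char *end;
    long value;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "r");
    if (!f)
        return errno == ENOENT ? WATCH_ABSENT : WATCH_ERROR;
    if (!fgets(buf, sizeof buf, f)) {
        fclose(f);
        return WATCH_ERROR;
    }
    fclose(f);

    errno = 0;
    value = strtol(buf, &end, 10);
    if (end == buf || errno != 0)
        return WATCH_ERROR;
    return value != 0 ? WATCH_ENABLED : WATCH_DISABLED;
}

/* Pick a notification mechanism without ever refusing to run. */
enum backend pick_backend(const char *path)
{
    enum watch_state s = probe_init_watch(path);

    if (s == WATCH_ABSENT)
        syslog(LOG_WARNING,
               "Upgrade your kernel and I'll be nicer to your system; "
               "falling back to ptrace");
    return s == WATCH_ENABLED ? BACKEND_NETLINK : BACKEND_PTRACE;
}
```

init would call pick_backend("/proc/self/init_watch") once at startup.
On an unpatched kernel it logs the reminder and carries on with ptrace.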

-- 
Jim Lieb
Ubuntu Kernel Team
Canonical Ltd.