[RFC]: Enhanced User Sessions and session shutdown
Steve Langasek
steve.langasek at canonical.com
Sat Feb 2 07:53:16 UTC 2013
Hi James,
Picking up this topic from last week...
On Thu, Jan 24, 2013 at 02:18:14PM +0000, James Hunt wrote:
> > You write here:
> > With Upstart sessions, the Session Init will be terminated by:
> > * The desktop session sending an initctl command request to shutdown.
> > * The Session Init Instance will react to this, stop all the jobs and
> > then exit itself, closing the session in the process.
> > I don't think that's accurate. The decision about whether or not the system
> > can be shutdown/rebooted by the user needs to be made via system-level
> > policy; the shutdown request thus needs to be referred to a system-level
> > service before we shut down *any* jobs from the session init.
> I think there is some confusion here between session end and system
> shutdown. The 2 bullets you're referencing are under the heading 'Desktop
> Session Shutdown', not 'system shutdown'. To be clear the 'initctl
> shutdown' command is a request to shutdown the Session Init, but in
> certain circumstances (where --type is 'shutdown' or 'reboot'), that
> action may also trigger a system shutdown.
> I think the confusion has arisen as I didn't qualify that 'shutdown'
> refers to the *session* (not the system) in that first bullet. I've now
> clarified that in the spec.
That was clear to me. I'm saying that I don't think it's appropriate for
the shutdown or reboot request on the desktop to go by way of the session
init /at all/.
> > Only once the system init has been signalled to change runlevels via the
> > shutdown command, and the lightdm service has been asked to stop, should
> > the session init generate a session-end event (in response to
> > being signalled itself).
> Right, but this does not handle the case where the user simply "logs out".
Correct. For the logout case, it's reasonable to handle this via the
session init, because that's a user-level operation. For reboot and
shutdown, since these are system-level operations, they should not be
handled via the session init.
So I don't think "reboot/shutdown" and "logout" are at all parallel, and I
don't think there should be any 'initctl shutdown --type=[reboot|logout]'
command.
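As a rough illustration of that split, here is a minimal sketch. The function
name and the handler labels are hypothetical and are not part of Upstart,
lightdm, or ConsoleKit; the sketch only shows that the two kinds of request
never share a code path.

```python
# Hypothetical sketch: which component should receive each desktop request.
# None of these names are real Upstart, lightdm, or ConsoleKit APIs.

SESSION_LEVEL = {"logout"}              # user-level operation
SYSTEM_LEVEL = {"reboot", "shutdown"}   # system-level operations


def route_request(action: str) -> str:
    """Return the (illustrative) component that should handle the request."""
    if action in SESSION_LEVEL:
        # Logout only affects the user's own session.
        return "session-init"
    if action in SYSTEM_LEVEL:
        # Reboot/shutdown are subject to system-level policy, so they
        # bypass the session init entirely.
        return "system-service"
    raise ValueError(f"unknown action: {action!r}")
```

With this shape there is nothing for a '--type' argument to select between:
the session init only ever sees the logout case.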
> > So the first half of the architecture should remain entirely unchanged, we
> > only want to switch out some of the components:
> >
> > * The indicator, or the power button dialog calls 'initctl shutdown' with
> > the correct logout, reboot, or shutdown '--type' argument.
> > * The session init checks whether the shutdown has been blocked with the
> > org.gnome.SessionManager.Inhibit() D-Bus API call (which something
> > associated with the upstart session will need to implement).
> This would have to be handled by either a job or some sort of bridge to
> avoid making Upstart itself dependent on Gnome (that's why the current
> spec allows gnome-session to continue to handle the inhibiting bits
> itself).
That seems reasonable.
> > * Once the shutdown is not blocked, on shutdown/reboot:
> > * the session init, or a job acting on its behalf, calls the
> > corresponding method of ConsoleKit or systemd.
> Again, the spec allows a job to perform this action.
> > * ConsoleKit ultimately calls /usr/lib/ConsoleKit/scripts/ck-system-stop,
> > which in turn calls shutdown(8).
> > * shutdown(8) emits the "runlevel" Upstart event.
> > * This triggers stopping of the lightdm jobs (among others)
> > * The lightdm job sends SIGTERM to all clients (upstart session init
> > processes).
> > * The session init generates a session-end event, which is processed
> > normally.
> > * Either the user jobs correctly end in a timely manner in response to
> > this event; or they do not, in which case they are instead reaped by
> > /etc/init.d/sendsigs.
> I'm not clear what you mean here - are you saying that the Session Init
> *should not* send any signals to running jobs?
Yes. It should send some sort of 'logout' event, which most jobs should be
configured to stop on; but it should not set redundant timers for signalling
running jobs in the shutdown/reboot case, because the system init is already
responsible for that and having the session init also trying to signal job
processes adds unnecessary complexity.
> If so, this could only work for the shutdown/reboot scenario (for logout
> we still need to honour kill_timeout).
Yes, agreed.
> It also means that any jobs which do not specify 'stop on' will linger
> unnecessarily long.
First, any such job is buggy. Second, they will linger only as long as the
system is configured to let them linger. While in some cases it's important
to idiot-proof the system to avoid bugs caused by things like user jobs, in
this case I don't see the need.
> Note too that even in the shutdown/reboot scenario, if we just wait to
> die, we're still going to artificially slow down the shutdown since
> lightdm will wait 5 seconds for its clients to die.
That's only true if there are buggy jobs, which we should have none of in
the standard OS. If all the user jobs we ship are written properly, the
shutdown/reboot/logout event will cause a clean, *speedy* shutdown of the
user session, at which point the session init can exit because it knows the
system has quiesced.
The only case where we have to worry about timeouts at all is if a job is
misbehaving. In the 'logout' case, where we know the user has asked the
session to exit, yes we need to set a timeout and clean up anything that
takes too long to exit. But in the shutdown/reboot case, we should defer to
the system-level timeouts and handlers.
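A minimal sketch of that policy, assuming a hypothetical helper that the
session init would consult when a session ends. The name and signature are
invented for illustration; the only point is that a timer exists in the
logout case and not otherwise.

```python
from typing import Optional


def session_kill_timer(action: str, job_kill_timeout: float) -> Optional[float]:
    """Return the timer (in seconds) the session init should arm for a job.

    Hypothetical helper: None means "arm no timer and leave enforcement
    to the system-level timeouts and handlers".
    """
    if action == "logout":
        # The user asked the session to exit; the session init must
        # clean up anything that takes too long.
        return job_kill_timeout
    if action in ("shutdown", "reboot"):
        # The system init already owns the timeouts here.
        return None
    raise ValueError(f"unknown action: {action!r}")
```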
> > There should not be any need for complex
> > timeout handling in the session init itself.
> The reason for this is threefold:
> - it allows upstart to give each job its allocated kill timeout before
> sending SIGKILL.
/etc/init.d/sendsigs still controls the maximum timeout here for
shutdown/reboot, and must continue to do so - a user level job must *not* be
allowed to indefinitely delay the system shutdown. That means the session
init only gives each job its allocated time before SIGKILL if that time is
*less* than the global timeout; so this is still redundant.
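The arithmetic behind that claim, as a small sketch. The function names and
the example numbers are illustrative assumptions, not values taken from
sendsigs or from any job file.

```python
def effective_grace(job_kill_timeout: float, system_timeout: float) -> float:
    """Time a job really gets before SIGKILL during shutdown/reboot.

    Whichever limit is shorter wins, because the system-level reaper
    does not wait for the session init's per-job timer.
    """
    return min(job_kill_timeout, system_timeout)


def session_timer_has_effect(job_kill_timeout: float, system_timeout: float) -> bool:
    """The per-job timer only changes anything when it is the shorter limit."""
    return job_kill_timeout < system_timeout
```

For example, a job asking for 30 seconds under a 10 second system limit still
gets 10 seconds, so the session init's timer for it never fires.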
> - it allows for a faster shutdown.
No, having properly written user processes that shut down quickly at the
appropriate time does this; the kill timer is a fallback for when you
already aren't getting a fast shutdown.
> - it ensures that the Session Init doesn't wait forever to shutdown
> (except where explicitly requested by $WAIT).
The system already ensures this by reaping stalled processes after its own
timeout :-)
> > (Note however that /etc/init.d/sendsigs probably needs some changes
> > in order to not go killing user jobs - and session init processes
> > - with *no* delay, depriving them of the opportunity to shut down
> > gracefully.)
> > * If all session jobs exit and the session init is quiesced (i.e., no
> > remaining "blocked" events), the session init exits. Otherwise, it
> > will be killed externally (by lightdm or by sendsigs).
> Again, this only caters for the system shutdown scenario - we cannot wait
> forever for blocked events in the "logout" scenario as those events may
> never arrive (for example 'stop on session-end and never-emitted') thus
> causing the Session Init to never exit, hence the need for timeouts.
Yep, agreed - but /only/ in the logout case.
> > * lightdm exits.
> > * The rest of the shutdown sequence completes.
> Another source of confusion is that this doesn't seem to be how it
> currently happens in Ubuntu - I raised a bug on gnome-session [1] recently
> (which does suggest the system should work as you outline: desktop team
> investigating).
> [1] - https://bugs.launchpad.net/ubuntu/+source/gnome-session/+bug/1101154
Heh, oops :-)
> > This architecture has the following important properties:
> > - We don't assume the shutdown command has been accepted (and start
> > killing user jobs in response) until acknowledged as such by a root
> > daemon
> > - We preserve the existing interface for inhibiting shutdown that's
> > used by existing software
> > - We aren't relying on the user's session init to do the cleanup, beyond
> > making sure it tells the jobs to shutdown; instead the logic for timeout
> > handling is all at the system level, where it needs to be.
> > Does this make sense?
> So, what we're now saying is that for a logout we want to:
> (1) stop all running jobs.
> (2) exit as quickly as possible, but don't let anything block the exit
> (unless explicitly requested with $WAIT).
> (3) return control to lightdm.
> The current spec handles that I think.
Yes, agreed.
> Whereas for a shutdown/reboot we want to:
>
> (a) signal ConsoleKit via a job.
> (b) only start the shutdown sequence when signalled by lightdm, indicating
> that it is stopping.
> (c) exit as quickly as possible, ideally before being SIGKILL'ed by
> lightdm but only if $WAIT and the kill timeouts for running jobs permit.
Yes.
> The problems with the shutdown/reboot steps though are:
> (*) - $WAIT and kill timeout are ignored: if and when lightdm signals the
> Session Init, the window of opportunity to stop all jobs is 5 seconds
> since:
> - lightdm will SIGKILL the Session Init after this period.
> - even if lightdm doesn't, the kill timeout for the lightdm job is 5
> seconds too so lightdm will die within 5 seconds regardless.
This all sounds correct to me... it also sounds to me like what we want.
What do you see as a problem here?
Given that you earlier argued for the use of kill timeouts to get a "faster
shutdown", I don't see anything wrong with having a 5 second window to stop
all jobs. If it takes more than 5 seconds, I don't consider that a fast
shutdown at all.
And actually, thinking about it, the user jobs *don't* have a hard stop at 5
seconds, because the system init sends the SIGKILL to lightdm but not to its
children. So we can have this sequence:
- lightdm receives SIGTERM
- lightdm sends SIGTERM to session init
- session init emits 'session-end' event
- most user jobs shut down
- one user job {hangs,is misconfigured,takes a long time to shutdown} (pick
one)
- lightdm continues waiting for the session init to exit, and the session
init continues waiting for the wayward job
- after 5 seconds, lightdm sends SIGKILL to session init, which dies
immediately and any lingering job processes become sendsigs' problem.
- lightdm in turn finishes cleaning up and exits; barring that, system init
sends SIGKILL to lightdm, and it dies immediately
So if there are any jobs that didn't exit in a timely manner, they would
still be killed by sendsigs.
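The sequence above can be sketched as a toy model. Everything here is an
assumption made for illustration: the job names, the exit times, and the
simplification that any job still running when lightdm's 5 second window
closes is simply handed to sendsigs.

```python
from typing import Dict, List, Optional, Tuple

LIGHTDM_WINDOW = 5.0  # seconds lightdm waits before SIGKILLing the session init


def simulate_shutdown(
    job_exit_times: Dict[str, Optional[float]],
    window: float = LIGHTDM_WINDOW,
) -> Tuple[List[str], List[str], bool]:
    """Toy model of the shutdown sequence.

    job_exit_times maps a job name to the number of seconds it needs to
    exit after 'session-end', or None if it hangs.

    Returns (jobs that exited cleanly, jobs left for sendsigs,
    whether the session init had to be SIGKILLed by lightdm).
    """
    clean = sorted(
        name for name, t in job_exit_times.items()
        if t is not None and t <= window
    )
    leftover = sorted(name for name in job_exit_times if name not in clean)
    # The session init waits for wayward jobs, so any leftover job means
    # it is still running when the window closes and gets killed.
    session_init_killed = bool(leftover)
    return clean, leftover, session_init_killed


jobs = {"indicator": 0.2, "sync-client": 1.5, "wayward": None}
clean, leftover, killed = simulate_shutdown(jobs)
```

In this example 'indicator' and 'sync-client' exit cleanly, 'wayward' is left
for sendsigs, and the session init is killed when the window expires.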
Furthermore, currently sendsigs may start trying to kill these processes
before the session init has a chance to, since /etc/init/rc.conf will run
/etc/init.d/rc [06] in parallel with the shutdown of lightdm. If we
anticipate session init needing to do a controlled shutdown of jobs, this
may require some modification of sendsigs as well.
> (*) - how do we handle gnome-session's Inhibit? We can query if the session
> is inhibited, but we're not stopping the gnome-session job yet.
The shutdown/reboot/logout UI should check this before signalling either
consolekit or the session init. If the current session is inhibited, the
desktop should wait for this to clear in the normal way before taking the
requested action. If it's not, we can reasonably assume that this won't
change during the shutdown process.
Keep in mind that there may be multiple logged in users on the system, and
there may be apps running that want to inhibit the shutdown, but are
associated with a session *other* than the one from which a user requested
the shutdown. In this case, they don't get a vote at all; the system is
shutting down, and the user sessions need to make peace with their disk
state.
As the wiki page you linked to states:
> https://live.gnome.org/SessionManagement/GnomeSession#A5._QueryEndSession
"It should be stressed that, even when not forced, clients should not assume
that they will have the ability to block logout or shutdown."
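A sketch of the check the UI would make before signalling anything, under
the assumption that each inhibitor can be attributed to a session. The
'Inhibitor' type and the session identifiers are hypothetical; this is not
the org.gnome.SessionManager API, only the filtering rule described above.

```python
from typing import List, NamedTuple


class Inhibitor(NamedTuple):
    app: str          # application holding the inhibit
    session_id: str   # session the application belongs to (illustrative)


def blocking_inhibitors(
    inhibitors: List[Inhibitor], requesting_session: str
) -> List[Inhibitor]:
    """Return only the inhibitors that get a vote on this request.

    Inhibitors held by other users' sessions are ignored: the system is
    shutting down regardless of what they want.
    """
    return [i for i in inhibitors if i.session_id == requesting_session]


active = [
    Inhibitor("disc-burner", "session-alice"),
    Inhibitor("backup-tool", "session-bob"),
]
must_wait_for = blocking_inhibitors(active, "session-alice")
```

Here a request from 'session-alice' waits only for 'disc-burner'; the
inhibitor in 'session-bob' is not consulted.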
> All this aside, the shutdown/reboot scenario is simplified such that we
> don't need to re-enable jobs in the quiescent mode (ugly) since the
> 'shutdown' job can be run immediately and all this job would do is call
> the appropriate ConsoleKit D-Bus API. That job will finish almost
> immediately as there is no return value from the CK D-Bus methods to
> control shutdown.
There's no reason for this action to be mediated by the session init. It
shouldn't be a 'shutdown' job, it should be a consolekit call from the UI.
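For illustration, a sketch of what the UI-side call might look like,
assuming the ConsoleKit Manager D-Bus interface with its Stop and Restart
methods and the 'dbus-send' command-line tool. The helper name is
hypothetical, and the function only builds the command line; it does not
run it.

```python
from typing import List

CK_DEST = "org.freedesktop.ConsoleKit"
CK_PATH = "/org/freedesktop/ConsoleKit/Manager"
CK_IFACE = "org.freedesktop.ConsoleKit.Manager"

# Assumed mapping from UI action to ConsoleKit Manager method.
CK_METHODS = {"shutdown": "Stop", "reboot": "Restart"}


def consolekit_command(action: str) -> List[str]:
    """Build (but do not execute) a dbus-send invocation for the action."""
    try:
        method = CK_METHODS[action]
    except KeyError:
        # Logout is not a system-level operation and never reaches here.
        raise ValueError(f"not a system-level action: {action!r}")
    return [
        "dbus-send",
        "--system",
        "--print-reply",
        f"--dest={CK_DEST}",
        CK_PATH,
        f"{CK_IFACE}.{method}",
    ]
```

A real UI would more likely make the call through a D-Bus binding than
through a subprocess; the command form is used here only because it keeps
the sketch self-contained.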
--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://www.debian.org/
slangasek at ubuntu.com vorlon at debian.org