[Bug 687535] Re: upstart loses track of ssh daemon after reload ssh

Mon Sep 19 21:40:54 UTC 2011

** Tags added: testcase

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to openssh in Ubuntu.
https://bugs.launchpad.net/bugs/687535

Title:
  upstart loses track of ssh daemon after reload ssh

Status in Upstart:
  Invalid
Status in “openssh” package in Ubuntu:
  Fix Released
Status in “openssh” source package in Lucid:
  Fix Released
Status in “openssh” source package in Maverick:
  Fix Released

Bug description:
  When sshd gets a signal 1 for reload, it forks a new process and
  ditches the old. This causes upstart to believe that ssh has crashed,
  and loses track of it. A second reload (or any other initctl operation
  on ssh) will thus say:

  reload: Unknown instance:

  There would be 2 ways to fix this:
  1. Don't have ssh fork on reload, but keep the same pid
  2. Use a different mechanism in upstart to keep track of ssh. Maybe a pid file? Just tracking children of the exited ssh won't work, since it might accidentally track a particular session rather than the master if somebody just happens to log in close to reload time.

  openssh-server  1:5.3p1-3ubuntu4
  upstart         0.6.5-7

  ==== Info for Maverick, Lucid SRU ====
  IMPACT: if sshd gets a HUP signal, it forks a new process; upstart thinks the process died and loses track of it, so the user/admin loses the ability to stop/start/reload the daemon through upstart.
  The problem is fixed in Natty 5.6p1-2ubuntu3. See attached patches for Maverick and Lucid.

  TEST CASE:

  - install openssh-server
  - send a HUP signal to sshd
  - the daemon is restarted, but upstart thinks that it crashed (/var/log/daemon.log):

  Dec 28 20:59:57 utest-lls32 init: ssh main process ended, respawning
  Dec 28 20:59:57 utest-lls32 init: ssh main process (1451) terminated with status 255
  Dec 28 20:59:57 utest-lls32 init: ssh main process ended, respawning
  Dec 28 20:59:57 utest-lls32 init: ssh main process (1455) terminated with status 255
  Dec 28 20:59:57 utest-lls32 init: ssh respawning too fast, stopped

  - after this, upstart won't know about sshd, despite the daemon
  running just fine:

  root@utest-lls32:~# reload ssh
  reload: Unknown instance:
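
  The failure above can be detected from the log alone. The following is a minimal sketch, not part of the attached patches: the helper name upstart_gave_up is hypothetical, and it assumes the log format shown in the daemon.log excerpt above.

```shell
# Hypothetical helper: succeeds (exit 0) if the log read from stdin
# contains upstart's "respawning too fast, stopped" message for the job.
# Assumes the message format shown in the daemon.log excerpt.
upstart_gave_up() {
    job="$1"
    grep -qF -- "init: ${job} respawning too fast, stopped"
}
```

  Usage, assuming the log lives where the test case says it does: upstart_gave_up ssh < /var/log/daemon.log && echo "upstart lost track of ssh"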

  With the fix applied, the correct behavior is:

  - send a HUP signal to sshd
    ps ax |grep sshd
    kill -HUP <PID of the master sshd>
  - the daemon reloads (/var/log/auth.log):

  Dec 28 21:37:01 utest-lls32 sshd[742]: Received SIGHUP; restarting.
  Dec 28 21:37:01 utest-lls32 sshd[742]: Server listening on 0.0.0.0 port 22.
  Dec 28 21:37:01 utest-lls32 sshd[742]: Server listening on :: port 22.

  - reloading with upstart gives the same result, and NOT an error
  message.
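
  In the auth.log excerpt above the same PID (742) appears before and after the restart, which suggests a simple check: send HUP to the master process and confirm that the PID is still alive afterwards. A minimal sketch follows. The function name hup_and_check is hypothetical, and the default path /var/run/sshd.pid is an assumption (it is sshd's usual PidFile default, but sshd_config may override it).

```shell
# Hypothetical helper: send SIGHUP to the PID recorded in a pid file,
# wait briefly, then report whether that same PID still exists.
# Returns 0 if the PID survived, 1 if it is gone, 2 on setup errors.
hup_and_check() {
    pidfile="${1:-/var/run/sshd.pid}"   # assumed default location
    pid="$(cat "$pidfile")" || return 2
    case "$pid" in
        ''|*[!0-9]*) return 2 ;;        # refuse anything that is not a number
    esac
    kill -HUP "$pid" || return 2
    sleep 1                             # give the daemon time to re-exec
    if kill -0 "$pid" 2>/dev/null; then
        return 0
    fi
    return 1
}
```

  Run as root: hup_and_check && echo "sshd kept its PID". On an unfixed system the old PID is expected to disappear, since the daemon forks again after the reload.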

  REGRESSION POTENTIAL:

  There is a small race condition in sshd between when it forks and
  when it listens for incoming connections. The race window is
  lengthened by a very tiny amount by considering sshd started as soon
  as it has been executed, rather than when it forks. This will only
  affect jobs that use 'start on started ssh' and immediately connect to
  it. This is unlikely to cause problems in any real world scenario,
  given that most of these programs would also have to fork, exec, and
  open a socket, which is more work than what sshd will be doing in that
  time.
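
  A job that does connect immediately after 'started ssh' could guard against this window by retrying. This is only an illustrative sketch under that assumption, not something the SRU requires; the retry helper is hypothetical.

```shell
# Hypothetical helper: run a command up to N times, one second apart,
# stopping at the first success. Returns 1 if every attempt failed.
retry() {
    attempts="$1"
    shift
    i=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$i" -ge "$attempts" ]; then
            return 1
        fi
        i=$((i + 1))
        sleep 1
    done
}
```

  For example: retry 5 ssh -o BatchMode=yes localhost true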
