[Bug 2011458] Re: ssh fails to rebind when it is killed with -HUP

Steve Langasek 2011458 at bugs.launchpad.net
Fri Apr 28 17:40:01 UTC 2023


Nick, note that it's safe to stop the primary ssh service despite there
being an open connection, as open connections are left running.  So
maybe that's an easier approach.

-- 
You received this bug notification because you are a member of Ubuntu
Sponsors Team, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/2011458

Title:
  ssh fails to rebind when it is killed with -HUP

Status in openssh package in Ubuntu:
  In Progress
Status in openssh source package in Kinetic:
  In Progress

Bug description:
  [Impact]

  The sshd re-execution logic is generally broken with systemd socket
  activation, which means that (1) sshd fails when it is told to re-exec
  via SIGHUP (e.g. systemctl reload ssh), and (2) sshd fails when started
  in debug mode.

  [Test Case]

  (1) Test systemctl reload ssh:

  * On a machine with openssh-server installed, make a connection to
  localhost to activate ssh.service (the connection does not need to be
  complete, so you can just say "no" at the host key verification
  stage):

  $ ssh localhost
  [...]

  * Send SIGHUP to sshd by calling systemctl reload ssh:

  $ systemctl reload ssh

  * Check the service state:

  $ systemctl status ssh
  × ssh.service - OpenBSD Secure Shell server
       Loaded: loaded (/lib/systemd/system/ssh.service; disabled; preset: enabled)
      Drop-In: /etc/systemd/system/ssh.service.d
               └─00-socket.conf
       Active: failed (Result: exit-code) since Mon 2023-04-17 20:43:27 UTC; 4s ago
     Duration: 2min 44.132s
  TriggeredBy: ● ssh.socket
         Docs: man:sshd(8)
               man:sshd_config(5)
      Process: 1112 ExecStart=/usr/sbin/sshd -D $SSHD_OPTS (code=exited, status=255/EXCEPTION)
      Process: 1152 ExecReload=/usr/sbin/sshd -t (code=exited, status=0/SUCCESS)
      Process: 1153 ExecReload=/bin/kill -HUP $MAINPID (code=exited, status=0/SUCCESS)
     Main PID: 1112 (code=exited, status=255/EXCEPTION)
          CPU: 79ms

  Apr 17 20:40:43 lunar systemd[1]: Started ssh.service - OpenBSD Secure Shell server.
  Apr 17 20:41:06 lunar sshd[1113]: Connection closed by 127.0.0.1 port 54666 [preauth]
  Apr 17 20:43:27 lunar systemd[1]: Reloading ssh.service - OpenBSD Secure Shell server...
  Apr 17 20:43:27 lunar sshd[1112]: Received SIGHUP; restarting.
  Apr 17 20:43:27 lunar systemd[1]: Reloaded ssh.service - OpenBSD Secure Shell server.
  Apr 17 20:43:27 lunar sshd[1112]: error: Bind to port 22 on 0.0.0.0 failed: Address already in use.
  Apr 17 20:43:27 lunar sshd[1112]: error: Bind to port 22 on :: failed: Address already in use.
  Apr 17 20:43:27 lunar sshd[1112]: fatal: Cannot bind any address.
  Apr 17 20:43:27 lunar systemd[1]: ssh.service: Main process exited, code=exited, status=255/EXCEPTION
  Apr 17 20:43:27 lunar systemd[1]: ssh.service: Failed with result 'exit-code'.

  * On an affected machine, the service will fail as shown above.
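
  The "Address already in use" errors in the log above are consistent
  with the re-executed sshd binding port 22 afresh while the listening
  socket created for ssh.socket is still open. As a minimal illustration
  only (Python, not sshd code; the function name try_second_bind is made
  up for this sketch, and a kernel-chosen loopback port stands in for
  port 22), a second bind to an address that already has a listener
  fails with EADDRINUSE:

```python
import errno
import socket


def try_second_bind():
    """Return the errno raised by a second bind to an already-bound port."""
    # Stand-in for the listening socket that already owns the port
    # (assumption: in the bug this is the socket set up for ssh.socket).
    holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    holder.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    holder.listen(1)
    port = holder.getsockname()[1]

    # Stand-in for a daemon that binds afresh instead of reusing the
    # listener it was handed.
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        second.bind(("127.0.0.1", port))
    except OSError as exc:
        return exc.errno
    finally:
        second.close()
        holder.close()
    return None


if __name__ == "__main__":
    print(try_second_bind() == errno.EADDRINUSE)
```

  The sketch only reproduces the error class seen in the journal; it
  says nothing about which sshd code path performs the bind.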

  (2) Test debug mode:

  * On a machine with openssh-server installed, edit /etc/default/ssh to
  configure debug mode for sshd:

  $ cat /etc/default/ssh 
  # Default settings for openssh-server. This file is sourced by /bin/sh from
  # /etc/init.d/ssh.

  # Options to pass to sshd
  SSHD_OPTS=-ddd

  * Attempt to make a connection to localhost:

  $ ssh localhost
  kex_exchange_identification: read: Connection reset by peer
  Connection reset by 127.0.0.1 port 22

  * On an affected machine, the attempt will fail as shown above, and
  the service will be in a failed state:

  $ systemctl status ssh
  × ssh.service - OpenBSD Secure Shell server
       Loaded: loaded (/lib/systemd/system/ssh.service; disabled; preset: enabled)
      Drop-In: /etc/systemd/system/ssh.service.d
               └─00-socket.conf
       Active: failed (Result: exit-code) since Mon 2023-04-17 20:46:34 UTC; 2min 27s ago
     Duration: 5ms
  TriggeredBy: ● ssh.socket
         Docs: man:sshd(8)
               man:sshd_config(5)
      Process: 1166 ExecStartPre=/usr/sbin/sshd -t (code=exited, status=0/SUCCESS)
      Process: 1167 ExecStart=/usr/sbin/sshd -D $SSHD_OPTS (code=exited, status=255/EXCEPTION)
     Main PID: 1167 (code=exited, status=255/EXCEPTION)
          CPU: 40ms

  Apr 17 20:46:34 lunar sshd[1167]: Server listening on :: port 22.
  Apr 17 20:46:34 lunar sshd[1167]: debug3: fd 4 is not O_NONBLOCK
  Apr 17 20:46:34 lunar sshd[1167]: debug1: Server will not fork when running in debugging mode.
  Apr 17 20:46:34 lunar sshd[1167]: debug3: send_rexec_state: entering fd = 7 config len 3456
  Apr 17 20:46:34 lunar sshd[1167]: debug3: ssh_msg_send: type 0
  Apr 17 20:46:34 lunar sshd[1167]: debug3: send_rexec_state: done
  Apr 17 20:46:34 lunar sshd[1167]: debug1: rexec start in 4 out 4 newsock 4 pipe -1 sock 7
  Apr 17 20:46:34 lunar systemd[1]: Started ssh.service - OpenBSD Secure Shell server.
  Apr 17 20:46:34 lunar systemd[1]: ssh.service: Main process exited, code=exited, status=255/EXCEPTION
  Apr 17 20:46:34 lunar systemd[1]: ssh.service: Failed with result 'exit-code'.

  [Where problems could occur]

  The fix expands Ubuntu's patch for systemd socket activation to try
  and make sure that any fds passed from systemd are not closed across
  re-executions of sshd. If we saw a problem, it would most likely be an
  attempt to operate on a closed fd, or the wrong fd, as a result of an
  edge case in one of the re-execution paths.
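
  The fd-passing convention that such a fix has to respect can be
  sketched as follows. This illustrates the sd_listen_fds(3) protocol in
  Python and is not Ubuntu's actual patch to sshd: systemd passes
  sockets starting at fd 3 and describes them through the LISTEN_FDS and
  LISTEN_PID environment variables. The helper names passed_fds and
  keep_across_exec are hypothetical.

```python
import os

SD_LISTEN_FDS_START = 3  # first fd passed by systemd, per sd_listen_fds(3)


def passed_fds(environ, pid):
    """Return the fd numbers systemd passed to process `pid`, or []."""
    # LISTEN_PID must name this process, otherwise the fds are not ours.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    try:
        count = int(environ.get("LISTEN_FDS", "0"))
    except ValueError:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + max(count, 0)))


def keep_across_exec(fds):
    """Clear close-on-exec so the fds stay open in the re-executed image."""
    for fd in fds:
        os.set_inheritable(fd, True)


if __name__ == "__main__":
    fds = passed_fds(os.environ, os.getpid())
    keep_across_exec(fds)
    print(fds)
```

  Because exec keeps the same PID, LISTEN_PID still matches after a
  re-execution, so the new image can find the sockets again provided
  they were not closed and the environment was preserved.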

  [Original Description]

  In Kinetic and Lunar GCE images, sshd fails to rebind port 22 when it
  is killed with -HUP. This does not happen on earlier releases.

  It can be reproduced by running:

  # pkill -o -HUP sshd || true
  # journalctl -n 20
  Mar 13 14:58:52 mar131454-025105 sshd[1371]: Received SIGHUP; restarting.
  Mar 13 14:58:52 mar131454-025105 sshd[1371]: error: Bind to port 22 on 0.0.0.0 failed: Address already in use.
  Mar 13 14:58:52 mar131454-025105 sshd[1371]: error: Bind to port 22 on :: failed: Address already in use.
  Mar 13 14:58:52 mar131454-025105 sshd[1371]: fatal: Cannot bind any address.
  Mar 13 14:58:52 mar131454-025105 systemd[1]: ssh.service: Main process exited, code=exited, status=255/EXCEPTION
  Mar 13 14:58:52 mar131454-025105 systemd[1]: ssh.service: Failed with result 'exit-code'.
