[Bug 1591411] Re: systemd-logind must be restarted every ~1000 SSH logins to prevent a ~25 second delay

Nick Adams nick at ramnode.com
Wed Jul 17 18:45:50 UTC 2019


This still seems to be an issue. Running latest Bionic.

# dpkg -s dbus | grep Version
Version: 1.12.2-1ubuntu1.1

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to dbus in Ubuntu.
https://bugs.launchpad.net/bugs/1591411

Title:
  systemd-logind must be restarted every ~1000 SSH logins to prevent a
  ~25 second delay

Status in D-Bus:
  Fix Released
Status in systemd:
  Fix Released
Status in dbus package in Ubuntu:
  Fix Released
Status in systemd package in Ubuntu:
  Fix Released
Status in dbus source package in Xenial:
  Fix Released
Status in systemd source package in Xenial:
  Invalid
Status in dbus source package in Yakkety:
  Won't Fix
Status in systemd source package in Yakkety:
  Invalid

Bug description:
  [Impact]

  The bug affects multiple users and introduces a user-visible delay
  (~25 seconds) on SSH connections after a large number of sessions have
  been processed. This has a serious impact on big systems and servers
  running our software.

  The currently proposed fix is a safe workaround for the bug, as
  proposed by dbus upstream. The workaround makes uid 0 immune to the
  pending_fd_timeout limit, which otherwise kicks in and causes the
  original issue.

  [Test Case]

  lxc launch ubuntu:x test
  lxc exec test -- login -f ubuntu
  ssh-import-id <whatever>

  Then run a script as follows (passing in ubuntu@<container-ip>):

  # Loop forever, appending the wall-clock time of each SSH login to "log".
  while true; do
      (time ssh "$1" "echo OK > /dev/null") 2>&1 | grep ^real >> log
  done

  Then check the log file for any SSH sessions that take 25+ seconds
  to complete.

  Multiple instances of the same script can be used at the same time.
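
The log check can be scripted. The following is a minimal sketch, assuming
the log holds lines in bash's default `time` format (`real<TAB>0m25.232s`),
as written by the loop above. The function name `count_slow` and the
25-second default threshold are illustrative and not part of the original
report.

```shell
# count_slow FILE [THRESHOLD_SECONDS]
# Hypothetical helper: prints how many "real" entries in FILE lasted at
# least THRESHOLD_SECONDS (default: 25).
count_slow() {
    awk -v limit="${2:-25}" '
        $1 == "real" {
            # $2 looks like 0m25.232s: split into minutes and seconds.
            split($2, t, "m")
            sub(/s$/, "", t[2])
            if (t[1] * 60 + t[2] >= limit) n++
        }
        END { print n + 0 }
    ' "$1"
}
```

Running `count_slow log` after the test has been going for a while should
print 0 on a system that does not show the delay.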

  [Regression Potential]

  The fix has a rather low regression potential as the workaround is a
  very small change only affecting one particular case - handling of uid
  0. The fix has been tested by multiple users and has been around in
  zesty for a while, with multiple people involved in reviewing the
  change. It's also a change that has been proposed by upstream.

  [Original Description]

  I noticed on a system that accepts large numbers of SSH connections
  that after a while, SSH sessions were taking ~25 seconds to complete.

  Looking in /var/log/auth.log, systemd-logind starts failing with the
  following:

  Jun 10 23:55:28 test sshd[3666]: pam_unix(sshd:session): session opened for user ubuntu by (uid=0)
  Jun 10 23:55:28 test systemd-logind[105]: New session c1052 of user ubuntu.
  Jun 10 23:55:28 test systemd-logind[105]: Failed to abandon session scope: Transport endpoint is not connected
  Jun 10 23:55:28 test sshd[3666]: pam_systemd(sshd:session): Failed to create session: Message recipient disconnected from message bus without replying

  I reproduced this in an LXD container by doing something like:

  lxc launch ubuntu:x test
  lxc exec test -- login -f ubuntu
  ssh-import-id <whatever>

  Then ran a script as follows (passing in ubuntu@<container-ip>):

  while true; do
      (time ssh "$1" "echo OK > /dev/null") 2>&1 | grep ^real >> log
  done

  In my case, after 1052 logins, the 1053rd and thereafter were taking
  25+ seconds to complete. Here are some snippets from the log file:

  $ cat log | grep 0m0 | wc -l
  1052

  $ cat log | grep 0m25 | wc -l
  4

  $ tail -5 log
  real	0m0.222s
  real	0m25.232s
  real	0m25.235s
  real	0m25.236s
  real	0m25.239s
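
The point at which logins start to stall can also be read from the log
directly. This is a small sketch, assuming the same `real<TAB>0mS.SSSs`
line format as in the snippets above. The function name `first_slow` and
its default threshold are illustrative only.

```shell
# first_slow FILE [THRESHOLD_SECONDS]
# Hypothetical helper: prints the position (1-based, counting only "real"
# lines) of the first login that took at least THRESHOLD_SECONDS
# (default: 25). Prints nothing if no login was that slow.
first_slow() {
    awk -v limit="${2:-25}" '
        $1 == "real" {
            n++
            # $2 looks like 0m25.232s: split into minutes and seconds.
            split($2, t, "m")
            sub(/s$/, "", t[2])
            if (t[1] * 60 + t[2] >= limit) { print n; exit }
        }
    ' "$1"
}
```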

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: systemd 229-4ubuntu5
  ProcVersionSignature: Ubuntu 4.4.0-22.40-generic 4.4.8
  Uname: Linux 4.4.0-22-generic x86_64
  ApportVersion: 2.20.1-0ubuntu2
  Architecture: amd64
  Date: Sat Jun 11 00:09:34 2016
  MachineType: Notebook W230SS
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.4.0-22-generic root=/dev/mapper/ubuntu--vg-root ro quiet splash
  SourcePackage: systemd
  SystemdDelta:
   [EXTENDED]   /lib/systemd/system/rc-local.service → /lib/systemd/system/rc-local.service.d/debian.conf
   [EXTENDED]   /lib/systemd/system/systemd-timesyncd.service → /lib/systemd/system/systemd-timesyncd.service.d/disable-with-time-daemon.conf

   2 overridden configuration files found.
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 04/15/2014
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: 4.6.5
  dmi.board.asset.tag: Tag 12345
  dmi.board.name: W230SS
  dmi.board.vendor: Notebook
  dmi.board.version: Not Applicable
  dmi.chassis.asset.tag: No Asset Tag
  dmi.chassis.type: 9
  dmi.chassis.vendor: Notebook
  dmi.chassis.version: N/A
  dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr4.6.5:bd04/15/2014:svnNotebook:pnW230SS:pvrNotApplicable:rvnNotebook:rnW230SS:rvrNotApplicable:cvnNotebook:ct9:cvrN/A:
  dmi.product.name: W230SS
  dmi.product.version: Not Applicable
  dmi.sys.vendor: Notebook

To manage notifications about this bug go to:
https://bugs.launchpad.net/dbus/+bug/1591411/+subscriptions


