[Bug 1928526] [NEW] DHCP Cluster crashes randomly

Phil Fenstermacher 1928526 at bugs.launchpad.net
Sat May 15 04:23:35 UTC 2021


Public bug reported:

We're running into a case very similar to
https://bugs.launchpad.net/dhcp/+bug/1872118, where the DHCP server
periodically crashes. On relatively busy DHCP servers it can take days
for a crash to occur. There's no obvious cause, but the logs show the
following (hostnames and email addresses changed slightly to keep the
meaning without disclosing everything):

May 14 05:22:17 is-landlord-04.wm.edu sh[305109]: ../../../../lib/isc/unix/socket.c:4359: fatal error: select() failed: Bad file descriptor
May 14 05:22:27 hostname.wm.edu systemd[1]: isc-dhcp-server.service: Main process exited, code=killed, status=6/ABRT
May 14 05:22:32 is-landlord-04.wm.edu sSMTP[1645996]: Sent mail for noreply at wm.edu (221 2.0.0 Service closing transmission channel) uid=0 username=root ou>
May 14 05:22:32 is-landlord-04.wm.edu systemd[1]: isc-dhcp-server.service: Failed with result 'signal'.
May 14 05:22:37 is-landlord-04.wm.edu systemd[1]: isc-dhcp-server.service: Scheduled restart job, restart counter is at 25.
May 14 05:22:37 is-landlord-04.wm.edu systemd[1]: Stopped ISC DHCP IPv4 server.
May 14 05:22:37 is-landlord-04.wm.edu systemd[1]: Started ISC DHCP IPv4 server.
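
For reference, "Bad file descriptor" (EBADF) is what select() reports
when one of the descriptors it is asked to watch is no longer open. The
following is only a minimal Python sketch of that failure mode on a
POSIX system, not a reproduction of the dhcpd crash. The function name
is made up for illustration, and we have not confirmed which descriptor
dhcpd was watching.

```python
import errno
import os
import select


def select_on_closed_fd():
    """Call select() on a descriptor that has already been closed.

    Returns the errno reported by the failed call, or None if the call
    unexpectedly succeeds.
    """
    read_end, write_end = os.pipe()
    os.close(write_end)
    os.close(read_end)  # read_end is now a stale descriptor number
    try:
        # Zero timeout: poll once instead of blocking.
        select.select([read_end], [], [], 0)
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    err = select_on_closed_fd()
    print(err == errno.EBADF, os.strerror(err) if err else None)
```

In a long-running daemon the same error would appear if a descriptor
were closed while it is still registered in the set passed to select().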

$ sudo cat _usr_sbin_dhcpd.0.crash
ProblemType: Crash
Architecture: amd64
Date: Fri May 14 05:22:21 2021
DistroRelease: Ubuntu 20.04
ExecutablePath: /usr/sbin/dhcpd
ExecutableTimestamp: 1615303185
ProcCmdline: dhcpd -user dhcpd -group dhcpd -f -4 -pf /run/dhcp-server/dhcpd.pid -cf /etc/dhcp/dhcpd.conf
ProcEnviron: Error: [Errno 13] Permission denied: 'environ'
ProcMaps: Error: [Errno 13] Permission denied: 'maps'
ProcStatus:
 Name:	dhcpd
 Umask:	0022
 State:	D (disk sleep)
 Tgid:	305109
 Ngid:	0
 Pid:	305109
 PPid:	1
 TracerPid:	0
 Uid:	116	116	116	116
 Gid:	119	119	119	119
 FDSize:	256
 Groups:
 NStgid:	305109
 NSpid:	305109
 NSpgid:	305109
 NSsid:	305109
 VmPeak:	  422372 kB
 VmSize:	  356836 kB
 VmLck:	       0 kB
 VmPin:	       0 kB
 VmHWM:	  289404 kB
 VmRSS:	  258028 kB
 RssAnon:	  255560 kB
 RssFile:	    2468 kB
 RssShmem:	       0 kB
 VmData:	  282632 kB
 VmStk:	     132 kB
 VmExe:	     592 kB
 VmLib:	    5352 kB
 VmPTE:	     584 kB
 VmSwap:	    2496 kB
 HugetlbPages:	       0 kB
 CoreDumping:	1
 THP_enabled:	1
 Threads:	4
 SigQ:	0/3521
 SigPnd:	0000000000000000
 ShdPnd:	0000000000000000
 SigBlk:	0000000000000000
 SigIgn:	0000000000001000
 SigCgt:	0000000180000000
 CapInh:	0000000000000000
 CapPrm:	0000000000000000
 CapEff:	0000000000000000
 CapBnd:	0000003fffffffff
 CapAmb:	0000000000000000
 NoNewPrivs:	0
 Seccomp:	0
 Speculation_Store_Bypass:	thread vulnerable
 Cpus_allowed:	ffffffff,ffffffff
 Cpus_allowed_list:	0-63
 Mems_allowed:	00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
 Mems_allowed_list:	0
 voluntary_ctxt_switches:	94
 nonvoluntary_ctxt_switches:	831
Signal: 6
Uname: Linux 5.4.0-71-generic x86_64
UserGroups: N/A
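
The SigIgn and SigCgt values in ProcStatus are hexadecimal bitmaps in
which bit N-1 stands for signal N. As a hedged illustration (the helper
names are ours and not part of any existing tool), they can be decoded
as follows. With the values in this report, SigIgn decodes to signal 13
(SIGPIPE on Linux) and SigCgt to signals 32 and 33.

```python
import signal


def decode_sigmask(hex_mask):
    """Return the signal numbers whose bits are set in a /proc status mask."""
    mask = int(hex_mask, 16)
    # Bit 0 corresponds to signal 1, bit 1 to signal 2, and so on.
    return [bit + 1 for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def describe(signum):
    """Return the symbolic name if Python knows one, else a plain label."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        return f"signal {signum}"


if __name__ == "__main__":
    for field, value in [("SigIgn", "0000000000001000"),
                         ("SigCgt", "0000000180000000")]:
        names = [describe(s) for s in decode_sigmask(value)]
        print(field, decode_sigmask(value), names)
```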

Because of the similarities with #1872118, we started by checking the
versions of the related packages:

$ sudo dpkg -l | grep -i isc-
ii  isc-dhcp-client                      4.4.1-2.1ubuntu5.20.04.1          amd64        DHCP client for automatically obtaining an IP address
ii  isc-dhcp-common                      4.4.1-2.1ubuntu5.20.04.1          amd64        common manpages relevant to all of the isc-dhcp packages
ii  isc-dhcp-server                      4.4.1-2.1ubuntu5.20.04.1          amd64        ISC DHCP server for automatic IP address assignment
ii  libisc-export1105:amd64              1:9.11.16+dfsg-3~ubuntu1          amd64        Exported ISC Shared Library

$ sudo dpkg -l | grep -i bind9-libs
ii  bind9-libs:amd64                     1:9.16.1-0ubuntu2.8               amd64        Shared Libraries used by BIND 9

$ sudo dpkg -l | grep -i libisc-*
ii  libisc-export1105:amd64              1:9.11.16+dfsg-3~ubuntu1          amd64        Exported ISC Shared Library
ii  libisccfg-export163                  1:9.11.16+dfsg-3~ubuntu1          amd64        Exported ISC CFG Shared Library

$ sudo dpkg -l | grep -i libirs-*
ii  libirs-export161                     1:9.11.16+dfsg-3~ubuntu1          amd64        Exported IRS Shared Library

$ sudo dpkg -l | grep -i libdns-*
ii  libdns-export1109                    1:9.11.16+dfsg-3~ubuntu1          amd64        Exported DNS Shared Library

Finally, we're running Ubuntu 20.04.2:
$ lsb_release -rd
Description:	Ubuntu 20.04.2 LTS
Release:	20.04

Let us know what else can be provided to help confirm or troubleshoot
the issue.
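
Since the failure is a bad file descriptor, one thing that could be
captured before the next crash is the set of descriptors the dhcpd
process holds open. This is a rough sketch only, assuming the Linux
/proc layout and enough privileges to read the process's fd directory;
list_open_fds is a hypothetical helper, not an existing tool.

```python
import os


def list_open_fds(pid):
    """Map each open descriptor number of a process to its target.

    Reads /proc/<pid>/fd, which is Linux-specific.
    """
    fd_dir = f"/proc/{pid}/fd"
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets


if __name__ == "__main__":
    # Inspect this script's own process as a stand-in for the dhcpd PID.
    for fd, target in sorted(list_open_fds(os.getpid()).items()):
        print(fd, target)
```

Running it periodically against the dhcpd PID and comparing snapshots
might show whether descriptors are leaking or disappearing over time.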

** Affects: isc-dhcp (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to isc-dhcp in Ubuntu.
https://bugs.launchpad.net/bugs/1928526

Title:
  DHCP Cluster crashes randomly

Status in isc-dhcp package in Ubuntu:
  New
