[Bug 1872118] Re: DHCP Cluster crashes after a few hours
Jorge Niedbalski
1872118 at bugs.launchpad.net
Mon Aug 3 21:00:44 UTC 2020
Hello,
I am trying to set up a reproducer for the mentioned issue. I have 2
machines acting as peers with the following versions:
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.1 LTS
Release: 20.04
Codename: focal
# dpkg -l |grep -i isc-dh
ii isc-dhcp-client 4.4.1-2.1ubuntu5 amd64 DHCP client for automatically obtaining an IP address
ii isc-dhcp-common 4.4.1-2.1ubuntu5 amd64 common manpages relevant to all of the isc-dhcp packages
ii isc-dhcp-server 4.4.1-2.1ubuntu5 amd64 ISC DHCP server for automatic IP address assignment
=====
Primary configuration: https://pastebin.ubuntu.com/p/XYj648MghK/
Secondary configuration: https://pastebin.ubuntu.com/p/PYkcshZCWK/
Started with:
# dhcpd -f -d -4 -cf /etc/dhcp/dhcpd.conf --no-pid ens4
---> Sent some DHCP requests to these servers.
balanced pool 560b8c263f40 12 total 221 free 111 backup 110 lts 0 max-misbal 33
Sending updates to failover-partner.
failover peer failover-partner: peer moves from recover-done to normal
failover peer failover-partner: Both servers normal
DHCPDISCOVER from 52:54:00:2d:53:93 via ens4
DHCPOFFER on 10.19.101.120 to 52:54:00:2d:53:93 (glistening-elephant) via ens4
DHCPREQUEST for 10.19.101.120 (10.19.101.236) from 52:54:00:2d:53:93 (glistening-elephant) via ens4
DHCPACK on 10.19.101.120 to 52:54:00:2d:53:93 (glistening-elephant) via ens4
DHCPREQUEST for 10.19.101.120 from 52:54:00:2d:53:93 (glistening-elephant) via ens4
DHCPACK on 10.19.101.120 to 52:54:00:2d:53:93 (glistening-elephant) via ens4
DHCPREQUEST for 10.19.101.121 from 52:54:00:53:a3:d8 (valiant-motmot) via ens4
DHCPACK on 10.19.101.121 to 52:54:00:53:a3:d8 (valiant-motmot) via ens4
---
failover peer failover-partner: Both servers normal
balancing pool 5606b2c95f10 12 total 221 free 221 backup 0 lts -110 max-own (+/-)22
balanced pool 5606b2c95f10 12 total 221 free 221 backup 0 lts -110 max-misbal 33
balancing pool 5606b2c95f10 12 total 221 free 111 backup 110 lts 0 max-own (+/-)22
balanced pool 5606b2c95f10 12 total 221 free 111 backup 110 lts 0 max-misbal 33
DHCPDISCOVER from 52:54:00:2d:53:93 via ens4: load balance to peer failover-partner
DHCPREQUEST for 10.19.101.120 (10.19.101.236) from 52:54:00:2d:53:93 via ens4: lease owned by peer
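To compare pool balance between the two peers over a longer run, the "balancing pool" / "balanced pool" lines above can be turned into numbers. This is only a minimal sketch: the regular expression is derived from the lines pasted in this comment, not from the dhcpd source, and the names POOL_RE and parse_pool_line are made up for illustration.

```python
import re

# Pattern for the "balancing pool" / "balanced pool" lines shown above.
# The token right after the pool address ("12" in these logs) is kept as an
# opaque string; what it denotes is not assumed here.
POOL_RE = re.compile(
    r"^(?P<state>balancing|balanced) pool (?P<pool>[0-9a-f]+) (?P<label>\S+)\s+"
    r"total (?P<total>\d+)\s+free (?P<free>\d+)\s+backup (?P<backup>\d+)\s+"
    r"lts (?P<lts>-?\d+)\s+(?P<limit_name>max-own|max-misbal) "
    r"(?:\(\+/-\))?(?P<limit>\d+)$"
)

INT_FIELDS = ("total", "free", "backup", "lts", "limit")


def parse_pool_line(line):
    """Return a dict of counters for a pool balance line, or None if it does not match."""
    match = POOL_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    for name in INT_FIELDS:
        fields[name] = int(fields[name])
    return fields


if __name__ == "__main__":
    samples = [
        "balancing pool 5606b2c95f10 12 total 221 free 221 backup 0 lts -110 max-own (+/-)22",
        "balanced pool 5606b2c95f10 12 total 221 free 111 backup 110 lts 0 max-misbal 33",
    ]
    for sample in samples:
        print(parse_pool_line(sample))
```

Feeding both peers' logs through this makes it easy to see when free/backup counts change, which may help answer whether the crash correlates with rebalancing.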
So far (after 1.5h), neither server has crashed.
Questions:
1) Is anything missing from the provided configuration?
2) Is this load or concurrency related, meaning that a specific number of leases needs to be allocated before the crash happens?
I will take a look at an existing crash/coredump.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to isc-dhcp in Ubuntu.
https://bugs.launchpad.net/bugs/1872118
Title:
DHCP Cluster crashes after a few hours
Status in DHCP:
New
Status in isc-dhcp package in Ubuntu:
Confirmed
Bug description:
I have a pair of DHCP servers running in a cluster on Ubuntu 20.04. All worked perfectly until recently, when they started stopping with code=killed, status=6/ABRT.
This is being fixed by
https://bugs.launchpad.net/bugs/1870729
However, now one stops after a few hours with the following errors. One
can stay online but not both.
Syslog shows
Apr 10 17:20:15 dhcp-primary sh[6828]: ../../../../lib/isc/unix/socket.c:3361: INSIST(!sock->pending_send) failed, back trace
Apr 10 17:20:15 dhcp-primary sh[6828]: #0 0x7fbe78702a4a in ??
Apr 10 17:20:15 dhcp-primary sh[6828]: #1 0x7fbe78702980 in ??
Apr 10 17:20:15 dhcp-primary sh[6828]: #2 0x7fbe7873e7e1 in ??
Apr 10 17:20:15 dhcp-primary sh[6828]: #3 0x7fbe784e5609 in ??
Apr 10 17:20:15 dhcp-primary sh[6828]: #4 0x7fbe78621103 in ??
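For anyone collecting several of these crashes, the assertion site and the raw frame addresses can be pulled out of syslog with a short script. This is a rough sketch that assumes the line layout shown above; the names ASSERT_RE, FRAME_RE and parse_crash are hypothetical, and the frame addresses stay unresolved ("??") unless debug symbols are available.

```python
import re

# ISC-style assertion line, e.g.
#   ../../../../lib/isc/unix/socket.c:3361: INSIST(!sock->pending_send) failed, back trace
ASSERT_RE = re.compile(
    r"(?P<file>\S+):(?P<line>\d+): "
    r"(?P<kind>INSIST|REQUIRE|ENSURE|INVARIANT)\((?P<cond>.*)\) failed"
)

# Backtrace frame, e.g. "#0 0x7fbe78702a4a in ??"
FRAME_RE = re.compile(r"#(?P<index>\d+) (?P<addr>0x[0-9a-f]+) in (?P<symbol>\S+)")


def parse_crash(lines):
    """Return (assertion, frames) extracted from an iterable of syslog lines."""
    assertion = None
    frames = []
    for line in lines:
        found = ASSERT_RE.search(line)
        if found and assertion is None:
            assertion = {
                "file": found.group("file"),
                "line": int(found.group("line")),
                "kind": found.group("kind"),
                "cond": found.group("cond"),
            }
            continue
        frame = FRAME_RE.search(line)
        if frame:
            frames.append((int(frame.group("index")), frame.group("addr")))
    return assertion, frames


if __name__ == "__main__":
    log = [
        "Apr 10 17:20:15 dhcp-primary sh[6828]: ../../../../lib/isc/unix/socket.c:3361: "
        "INSIST(!sock->pending_send) failed, back trace",
        "Apr 10 17:20:15 dhcp-primary sh[6828]: #0 0x7fbe78702a4a in ??",
        "Apr 10 17:20:15 dhcp-primary sh[6828]: #1 0x7fbe78702980 in ??",
    ]
    print(parse_crash(log))
```

Grouping crashes by (file, line, cond) would show whether every abort hits the same INSIST in socket.c.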
nothing in kern.log
apport.log shows
ERROR: apport (pid 6850) Fri Apr 10 17:20:15 2020: called for pid 6828, signal 6, core limit 0, dump mode 2
ERROR: apport (pid 6850) Fri Apr 10 17:20:15 2020: not creating core for pid with dump mode of 2
ERROR: apport (pid 6850) Fri Apr 10 17:20:15 2020: executable: /usr/sbin/dhcpd (command line "dhcpd -user dhcpd -group dhcpd -f -4 -pf /run/dhcp-server/dhcpd.pid -cf /etc/dhcp/dhcpd.conf")
ERROR: apport (pid 6850) Fri Apr 10 17:20:15 2020: is_closing_session(): no DBUS_SESSION_BUS_ADDRESS in environment
ERROR: apport (pid 6850) Fri Apr 10 17:20:15 2020: wrote report /var/crash/_usr_sbin_dhcpd.0.crash
/var/crash/_usr_sbin_dhcpd.0.crash shows
ProblemType: Crash
Architecture: amd64
CrashCounter: 1
Date: Fri Apr 10 17:20:15 2020
DistroRelease: Ubuntu 20.04
ExecutablePath: /usr/sbin/dhcpd
ExecutableTimestamp: 1586210315
ProcCmdline: dhcpd -user dhcpd -group dhcpd -f -4 -pf /run/dhcp-server/dhcpd.pid -cf /etc/dhcp/dhcpd.conf
ProcEnviron: Error: [Errno 13] Permission denied: 'environ'
ProcMaps: Error: [Errno 13] Permission denied: 'maps'
ProcStatus:
Name: dhcpd
Umask: 0022
State: D (disk sleep)
Tgid: 6828
Ngid: 0
Pid: 6828
PPid: 1
TracerPid: 0
Uid: 113 113 113 113
Gid: 118 118 118 118
FDSize: 128
Groups:
NStgid: 6828
NSpid: 6828
NSpgid: 6828
NSsid: 6828
VmPeak: 236244 kB
VmSize: 170764 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 12064 kB
VmRSS: 12064 kB
RssAnon: 5940 kB
RssFile: 6124 kB
RssShmem: 0 kB
VmData: 30792 kB
VmStk: 132 kB
VmExe: 592 kB
VmLib: 5424 kB
VmPTE: 76 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
CoreDumping: 1
THP_enabled: 1
Threads: 4
SigQ: 0/7609
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000001000
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Speculation_Store_Bypass: thread vulnerable
Cpus_allowed: 3
Cpus_allowed_list: 0-1
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,0000
0000,00000000,00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 111
nonvoluntary_ctxt_switches: 144
Signal: 6
Uname: Linux 5.4.0-21-generic x86_64
UserGroups:
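The hex masks in ProcStatus above (SigIgn, SigCgt, Cpus_allowed, Mems_allowed) are bitmaps, and decoding them makes the report easier to read. The sketch below assumes the usual /proc/<pid>/status convention as described in proc(5): in the signal masks the least significant bit stands for signal 1, while in the CPU and memory-node masks it stands for index 0. The function names are invented for this example.

```python
from typing import List


def set_bits(hex_mask: str) -> List[int]:
    """Return the zero-based positions of the bits set in a hex mask.

    Comma-separated masks (as in Mems_allowed) are accepted; the commas
    only group the digits and are dropped before parsing.
    """
    value = int(hex_mask.replace(",", ""), 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def signals_in_mask(hex_mask: str) -> List[int]:
    """Return the signal numbers in a SigIgn/SigCgt/SigBlk style mask."""
    # Bit 0 corresponds to signal 1, so shift every position up by one.
    return [bit + 1 for bit in set_bits(hex_mask)]


if __name__ == "__main__":
    print("SigIgn:", signals_in_mask("0000000000001000"))    # [13]
    print("SigCgt:", signals_in_mask("0000000180000000"))    # [32, 33]
    print("Cpus_allowed:", set_bits("3"))                     # [0, 1]
```

With the values from this report, SigIgn decodes to signal 13 (SIGPIPE on Linux) and Cpus_allowed to CPUs 0 and 1, which agrees with the Cpus_allowed_list line.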
To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions