[Bug 1872118] Re: [SRU] DHCP Cluster crashes after a few hours

Timo Aaltonen 1872118 at bugs.launchpad.net
Fri Aug 14 09:57:39 UTC 2020


Hello Andrew, or anyone else affected,

Accepted bind9-libs into focal-proposed. The package will build now and
be available at
https://launchpad.net/ubuntu/+source/bind9-libs/1:9.11.16+dfsg-3~ubuntu1
in a few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, what testing has been
performed on the package and change the tag from verification-needed-
focal to verification-done-focal. If it does not fix the bug for you,
please add a comment stating that, and change the tag to verification-
failed-focal. In either case, without details of your testing we will
not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

** Changed in: bind9-libs (Ubuntu Focal)
       Status: In Progress => Fix Committed

** Tags added: verification-needed verification-needed-focal

-- 
You received this bug notification because you are a member of Ubuntu
Sponsors Team, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1872118

Title:
  [SRU] DHCP Cluster crashes after a few hours

Status in DHCP:
  New
Status in bind9-libs package in Ubuntu:
  Fix Released
Status in isc-dhcp package in Ubuntu:
  Invalid
Status in bind9-libs source package in Focal:
  Fix Committed
Status in isc-dhcp source package in Focal:
  Invalid
Status in bind9-libs source package in Groovy:
  Fix Released
Status in isc-dhcp source package in Groovy:
  Invalid

Bug description:
  [Description]

  isc-dhcp-server uses libisc-export (coming from bind9-libs package) for handling the socket event(s) when configured in peer mode (master/secondary). It's possible that a sequence of messages dispatched by the master that requires acknowledgment from its peers holds a socket
  in a pending to send state, a timer or a subsequent write request can be scheduled into this socket and the !sock->pending_send assertion
  will be raised when trying to write again at the time data hasn't been flushed entirely and the pending_send flag hasn't been reset to 0 state.

  If this race condition happens, the following stacktrace will be
  hit:

  (gdb) info threads
    Id Target Id Frame
  * 1 Thread 0x7fb4ddecb700 (LWP 3170) __GI_raise (sig=sig at entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
    2 Thread 0x7fb4dd6ca700 (LWP 3171) __lll_lock_wait (futex=futex at entry=0x7fb4de6d2028, private=0) at lowlevellock.c:52
    3 Thread 0x7fb4de6cc700 (LWP 3169) futex_wake (private=<optimized out>, processes_to_wake=1, futex_word=<optimized out>) at ../sysdeps/nptl/futex-internal.h:364
    4 Thread 0x7fb4de74f740 (LWP 3148) futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fb4de6cd0d0) at ../sysdeps/nptl/futex-internal.h:183

  (gdb) frame 2
  #2 0x00007fb4dec85985 in isc_assertion_failed (file=file at entry=0x7fb4decd8878 "../../../../lib/isc/unix/socket.c", line=line at entry=3361, type=type at entry=isc_assertiontype_insist,
      cond=cond at entry=0x7fb4decda033 "!sock->pending_send") at ../../../lib/isc/assertions.c:52
  (gdb) bt
  #1 0x00007fb4deaa7859 in __GI_abort () at abort.c:79
  #2 0x00007fb4dec85985 in isc_assertion_failed (file=file at entry=0x7fb4decd8878 "../../../../lib/isc/unix/socket.c", line=line at entry=3361, type=type at entry=isc_assertiontype_insist,
      cond=cond at entry=0x7fb4decda033 "!sock->pending_send") at ../../../lib/isc/assertions.c:52
  #3 0x00007fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at ../../../../lib/isc/unix/socket.c:4041
  #4 process_fd (writeable=<optimized out>, readable=<optimized out>, fd=11, manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4054
  #5 process_fds (writefds=<optimized out>, readfds=0x7fb4de6d1090, maxfd=13, manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4211
  #6 watcher (uap=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4397
  #7 0x00007fb4dea68609 in start_thread (arg=<optimized out>) at pthread_create.c:477
  #8 0x00007fb4deba4103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

  (gdb) frame 3
  #3 0x00007fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at ../../../../lib/isc/unix/socket.c:4041
  4041 in ../../../../lib/isc/unix/socket.c
  (gdb) p sock->pending_send
  $2 = 1

  [TEST CASE]

  1) Install isc-dhcp-server in 2 focal machine(s).
  2) Configure peer/cluster mode as follows:
     Primary configuration: https://pastebin.ubuntu.com/p/XYj648MghK/
     Secondary configuration: https://pastebin.ubuntu.com/p/PYkcshZCWK/
  2) Run dhcpd as follows in both machine(s)

  # dhcpd -f -d -4 -cf /etc/dhcp/dhcpd.conf --no-pid ens4

  3) Leave the cluster running for a long (2h) period until the
  crash/race condition is reproduced.

  [REGRESSION POTENTIAL]

  * The fix will prevent the assertion to happen in the dispatch_send
  path, later versions of isch-dhcp upstream lack this logic and entirely removed the existence of this flag. Therefore, removing the need for this
  assertion at process_fd shouldn't be problematic.

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions



More information about the Ubuntu-sponsors mailing list