[Bug 1752411] Re: bind9-host, avahi-daemon-check-dns.sh hang forever causes network connections to get stuck

Trent Lloyd trent.lloyd at canonical.com
Fri Aug 24 05:13:25 UTC 2018

I agree with the sentiment that 5 seconds feels too long; however, as a
workaround I decided to just copy the existing timeout. I certainly
would not want to make it longer since this is in the critical boot

I agree that in general a DNS request should fail faster; however, there
are some cases where it won't, e.g. spanning tree bring-up on ports can
take 2 seconds.

My hope is to correctly fix host after getting this in, since the impact
is very high for affected users.

This check may actually be able to go away. I believe both
systemd-resolved and libnss-mdns (the latest version, which I think is
not in bionic) implement the .local label checking to do this at runtime
instead of this old hack. So for cosmic+ we can probably get rid of this
logic, which always sucked anyway. We only ever really needed to disable
nss-mdns, not avahi entirely: apps should normally resolve the IPs using
avahi's API anyway, so the impact on actual avahi usage is low.

Since the impact is high but only on a smaller subset of users, I think
we should go with matching the current timeout for now and worry about
further improvements later.

I've verified the cosmic upload is working as expected on a non-affected

You received this bug notification because you are a member of Ubuntu
Sponsors Team, which is subscribed to the bug report.

  bind9-host, avahi-daemon-check-dns.sh hang forever causes network
  connections to get stuck

Status in avahi package in Ubuntu:
  In Progress
Status in bind9 package in Ubuntu:
Status in openconnect package in Ubuntu:
Status in strongswan package in Ubuntu:
Status in avahi source package in Bionic:
Status in bind9 source package in Bionic:
Status in avahi source package in Cosmic:
  In Progress
Status in bind9 source package in Cosmic:
Status in avahi package in Debian:

Bug description:

   * Network connections for some users fail (in some cases on a direct
  interface, in others when connecting a VPN) because the 'host' command
  that /usr/lib/avahi/avahi-daemon-check-dns.sh calls to check for
  .local in DNS never times out as it should, leaving the script hanging
  indefinitely and blocking interface up and start-up. This appears to
  be a bug in 'host' that is triggered only in some circumstances.
  Because the issue with 'host' has not been easily identified, we
  implement a workaround that calls it under 'timeout', which in any
  case also acts as a fall-back.
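
  A minimal sketch of the workaround described above, not the actual
  patch. The names LOOKUP_TIMEOUT and lookup_with_timeout are
  hypothetical, and the exact 'host' arguments shown in the comment are
  an assumption about how the script calls it. The 5 second value is the
  one mentioned in the discussion of this bug.

```shell
#!/bin/sh
# Sketch only: LOOKUP_TIMEOUT and lookup_with_timeout are hypothetical names,
# not taken from avahi-daemon-check-dns.sh.
LOOKUP_TIMEOUT=5   # seconds

# Run the given command under timeout(1) from coreutils.
# GNU timeout exits with status 124 if it had to kill the command.
lookup_with_timeout() {
    timeout "$LOOKUP_TIMEOUT" "$@"
}

# Assumed usage inside the check script (the exact 'host' arguments are an
# assumption, not copied from the script):
#   OUT=$(LC_ALL=C lookup_with_timeout host -t soa local. 2>&1)
```

  With this shape a hung lookup returns control to the caller after the
  limit instead of blocking interface bring-up, and the caller can treat
  a non-zero status the same way as any other failed lookup.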

  [Test Case]

   * Multiple people have been unable to create a reproducer on a
  generic machine (e.g. it does not occur in a VM). I have a specific
  machine I can reproduce it on (a Skull Canyon NUC with Intel I219-LM)
  by simply running "ifdown br0; ifup br0", and there are clearly tens
  of other users affected in varying circumstances that all involve the
  same symptoms, but no clear test case exists. The best I can suggest
  is that I test the patch on my system to ensure it works as expected;
  the change is only 1 line, which is fairly easily auditable and
  [Regression Potential]

   * The change is a single-line change to the shell script to call
  host with "timeout". When tested on working and non-working systems
  this appears to function as expected. I believe the regression
  potential for this is consequently low.
   * In an attempt to anticipate possible issues, I checked that the
  timeout command is in the same path (/usr/bin) as the host command
  that is already called without a path, and that the coreutils package
  (which contains timeout) is an Essential package. I also checked that
  timeout is not a built-in in bash, for those who have changed /bin/sh
  to bash (just in case).
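
  The checks above could be repeated with something like the following
  sketch. The helper name is_external_command is hypothetical; it relies
  on 'command -v' printing an absolute path for an external program and
  only the bare name for a shell built-in.

```shell
#!/bin/sh
# Sketch only: is_external_command is a hypothetical helper name.
# 'command -v' prints an absolute path for an external program and just the
# bare name for a shell built-in, so a leading '/' means "not a built-in".
is_external_command() {
    case "$(command -v "$1" 2>/dev/null)" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Report where 'host' and 'timeout' resolve from, if anywhere.
for cmd in host timeout; do
    if is_external_command "$cmd"; then
        echo "$cmd: $(command -v "$cmd")"
    else
        echo "$cmd: not found as an external command"
    fi
done
```

  Running the same script under bash as well as the default /bin/sh
  covers the case of systems where /bin/sh has been changed. On a
  Debian-based system, "dpkg-query -W -f='${Essential}\n' coreutils"
  should print "yes" for an Essential package.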

  [Other Info]
   * N/A

  [Original Bug Description]

  On 18.04 Openconnect connects successfully to any of multiple VPN
  concentrators but network traffic does not flow across the VPN tunnel
  connection. When testing on 16.04 this works flawlessly. This also
  worked on this system when it was on 17.10.

  I have tried reducing the mtu of the tun0 network device but this has
  not resulted in me being able to successfully ping the IP address.

  Example showing ping attempt to the IP of DNS server:

  ~$ cat /etc/resolv.conf
  # Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
  # is the systemd-resolved stub resolver.
  # run "systemd-resolve --status" to see details about the actual nameservers.


  liam@liam-lat:~$ netstat -nr
  Kernel IP routing table
  Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
                                                  UG        0 0          0 wlp2s0
                                                  UGH       0 0          0 wlp2s0
                                                  U         0 0          0 docker0
                                                  U         0 0          0 docker0
                                                  U         0 0          0 tun0
                                                  UH        0 0          0 tun0
                                                  U         0 0          0 wlp2s0
  liam@liam-lat:~$ ping
  PING ( 56(84) bytes of data.
  --- ping statistics ---
  4 packets transmitted, 0 received, 100% packet loss, time 3054ms

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: openconnect 7.08-3
  ProcVersionSignature: Ubuntu 4.15.0-10.11-generic 4.15.3
  Uname: Linux 4.15.0-10-generic x86_64
  ApportVersion: 2.20.8-0ubuntu10
  Architecture: amd64
  CurrentDesktop: ubuntu:GNOME
  Date: Wed Feb 28 22:11:33 2018
  InstallationDate: Installed on 2017-06-15 (258 days ago)
  InstallationMedia: Ubuntu 16.04.1 LTS "Xenial Xerus" - Release amd64 (20160719)
  SourcePackage: openconnect
  UpgradeStatus: Upgraded to bionic on 2018-02-22 (6 days ago)
