[Bug 1699850] Re: Reliable network connectivity for apt-daily

Julian Andres Klode julian.klode at gmail.com
Wed Jul 12 21:15:55 UTC 2017


Also note that getaddrinfo() is blocking, so that's not really going to
work.
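
To illustrate the concern: a single slow lookup can stall the loop far beyond the 100 ms poll interval. One possible way to bound the wait, sketched here in Python rather than apt's C++, is to run the lookup in a worker thread and stop waiting after a deadline. The helper name resolve_with_timeout and its resolver parameter are hypothetical, not part of apt or of the proposal.

```python
import socket
import threading


def resolve_with_timeout(host, port, timeout, resolver=socket.getaddrinfo):
    """Return resolver(host, port), or None on failure or timeout.

    Hypothetical helper: `resolver` is injectable only so the behaviour
    can be exercised without a real network.
    """
    result = []

    def worker():
        try:
            result.append(resolver(host, port))
        except OSError:
            pass  # resolution failed, e.g. the network is not up yet

    # Daemon thread: a lookup that never returns cannot block process exit.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    # Stop waiting after `timeout` seconds; the lookup itself is not
    # cancelled, the caller merely regains control.
    thread.join(timeout)
    return result[0] if result else None
```

The lookup still runs to completion in the background, so this only limits how long the caller waits, not the work done by the resolver.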

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to apt in Ubuntu.
https://bugs.launchpad.net/bugs/1699850

Title:
  Reliable network connectivity for apt-daily

Status in systemd:
  Unknown
Status in apt package in Ubuntu:
  Triaged

Bug description:
  [Impact]

  apt-daily.service is launched by a timer that depends on
  network-online.target (after the fixes for bug 1686470 are in everywhere).

  At boot that is mostly sufficient for the network to be online, but
  it does not seem to work all the time, and we might disagree with
  network-manager and friends about what "online" means.

  At resume time, network-online.target is still active, so the service
  is started as soon as possible when it tries to catch up. Depending on
  the timing, the network connectivity might not be there yet, and it
  will fail and only retry 12 hours later.

  [Proposed solution]
  Introduce a new apt-helper wait-online that tries to connect() to remote hosts specified in sources.list until one connection works or a TIMEOUT is reached. The proposed algorithm looks something like this:

  while (time elapsed < TIMEOUT):
    for each entry:
      host = getaddrinfo()
      if host failed:
        continue
      fd = connect to it
      if fd is invalid:
        continue

      all fds += fd

      if poll(all fds, 100 ms timeout) finds a connected one:
        exit(0)

  exit(42) # timeout

  There are two things to consider:
  * getaddrinfo() and connect() may fail if the network is not up yet, so we need to retry (we might need to sleep somewhere)
  * If poll() finds no connected fd, we have likely slept long enough (up to 100 ms), so no extra sleep is needed.

  I believe the timeout should be something like 30s.
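
The algorithm above could be sketched as follows. This is an illustration in Python under stated assumptions, not the apt-helper implementation: the function name wait_online, the (host, port) input list, and the poll_ms parameter are hypothetical, only the first resolved address per host is tried, and select.poll() requires a Unix-like system. To avoid accumulating sockets, the sketch keeps at most one pending connection per entry, and getaddrinfo() remains a blocking call.

```python
import errno
import select
import socket
import time

EXIT_ONLINE = 0
EXIT_TIMEOUT = 42  # mirrors RestartForceExitStatus=42 in the proposal


def wait_online(hosts, timeout=30.0, poll_ms=100):
    """Return 0 once any (host, port) accepts a TCP connection, else 42."""
    deadline = time.monotonic() + timeout
    poller = select.poll()
    pending = {}  # fd -> (socket, (host, port)) for connects in progress
    try:
        while time.monotonic() < deadline:
            in_flight = {key for _, key in pending.values()}
            for host, port in hosts:
                if (host, port) in in_flight:
                    continue  # do not pile up sockets for the same entry
                try:
                    # Blocking call; may fail while the network is down.
                    infos = socket.getaddrinfo(host, port,
                                               type=socket.SOCK_STREAM)
                    family, socktype, proto, _, addr = infos[0]
                    sock = socket.socket(family, socktype, proto)
                except OSError:
                    continue
                sock.setblocking(False)
                err = sock.connect_ex(addr)
                if err not in (0, errno.EINPROGRESS):
                    sock.close()
                    continue
                pending[sock.fileno()] = (sock, (host, port))
                poller.register(sock, select.POLLOUT)

            retry_later = False
            # With nothing registered, poll() simply sleeps for poll_ms.
            for fd, _events in poller.poll(poll_ms):
                sock, _key = pending.pop(fd)
                poller.unregister(fd)
                err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
                sock.close()
                if err == 0:
                    return EXIT_ONLINE
                retry_later = True
            if retry_later:
                time.sleep(poll_ms / 1000)  # avoid spinning on fast failures
        return EXIT_TIMEOUT
    finally:
        for sock, _key in pending.values():
            sock.close()
```

A caller would pass the hosts parsed from sources.list, for example wait_online([("archive.ubuntu.com", 80)]), and use the return value as the process exit status.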

  On the systemd service side, we add:
    ExecStartPre=/usr/lib/apt/apt-helper wait-online
    RestartForceExitStatus=42
    RestartSec=15m

  The intent is to retry the service after 15 minutes.

  [Test case]
  * Start apt-daily.service after turning off network -> It should wait (in ExecStartPre)
  * Turn on network -> apt-daily.service should start

  [Regression potential]
  There might be increased I/O activity after resume on systems where apt-daily previously failed at that point.
