Packaging policy discussion: After=network-online.target

Dimitri John Ledkov dimitri.ledkov at canonical.com
Thu May 13 16:34:58 UTC 2021


On Thu, May 13, 2021 at 4:12 PM Steve Langasek
<steve.langasek at ubuntu.com> wrote:
>
> Hi there,
>
> On Wed, May 12, 2021 at 05:52:07PM +1000, Christopher James Halse Rogers wrote:
> > There's an nfs-utils SRU¹ hanging around waiting for a policy decision on
> > use of the After=network-online.target systemd unit dependency. I'm not an
> > expert here, but it looks like part of my SRU rotation today is starting the
> > discussion on this so we can resolve it one way or another!
>
> > I am not an expert in this area, but as I understand it, the tradeoff here
> > is:
> > 1. Without a dependency on After=network-online.target there is no guarantee
> > that the network interface(s) will be usable at the time the nfs-utils unit
> > triggers, and nfs-utils will fail if the relevant network interface is not
> > usable, or
> > 2. With a dependency on After=network-online.target nfs-utils will reliably
> > start, but if there are any interfaces which are configured but do not come
> > up this will result in the boot hanging until the timeout is hit.
>
> > In mitigation of (2), there are apparently a number of default packages
> > which already have a dependency on After=network-online.target, so boot
> > hanging if interfaces are down is the status quo?
>
> From one of the comments in the bug report, I gathered that systemd upstream
> (who, specifically?) was taking a position that distributions should not use
> After=network-online.target.  I think this is entirely unhelpful; the target
> exists for this purpose, it is not required for systemd internally to get
> the system up but exists only for other services to depend on.
>
> There are risks of services not starting on boot because the network-online
> target is not reached.  However, that is not the same thing as a "hung
> boot", because other services will still start on their own, and things like
> gdm and tty don't depend on network-online.target, *unless* you're in a
> situation where you've introduced a dependency between the filesystem and
> network-online.  This is possible when we're talking about nfs, because the
> same system might both export nfs filesystems and mount them from localhost.
> But I'm not sure it should block this specific change.
>
> > The obvious thing to do here would be to follow Debian, but as far as I can
> > tell there is not currently a Debian policy about this - the best I can find
> > is an ancient draft of a best-practises-guide² suggesting that packages
> > SHOULD handle networking dynamically, but if they do not MUST have a
> > dependency on After=network-online.target
>
> > As far as I understand it, handling networking dynamically requires upstream
> > code changes (although maybe fairly simple code changes?).
>
> It does require upstream code changes; not always simple.  And it's not
> always *correct* to make upstream code changes instead of simply starting
> the service when the system is "online"; you can find a number of examples
> in Ubuntu of services that it only makes sense to start once your network is
> "up" - e.g. apt-daily.service, update-notifier, whoopsie, ...
>
>
> There are issues with the network-online target, to be sure.  There is not a
> clear definition of the target, and there have definitely been
> implementation bugs in what does/does not block the target.  I've had
> discussions with the Foundations Team in the past about this but it has yet
> to result in a specification.
>
> My working definition of what network-online.target SHOULD mean is:
>
>  - at least one interface is up, with routes
>  - all interfaces which are 'optional: no' (netplan sense) are up
>    - including completion of ipv6 RA and ipv4 link-local if enabled on the
>      interface
>  - there is a default route for at least one configured address family
>  - attempts to discover default routes for other configured address families
>    have completed (success or fail)
>  - DNS is configured
>
> Things that must not block the network-online target:
>  - interfaces that are marked 'optional: yes'
>  - address sources that are listed in 'optional-addresses' for an interface
>  - default route for an address family for which no interfaces have
>    addresses
>
> At least historically, neither networkd nor NetworkManager has fulfilled
> this definition.  It would be nice to get there, but the first step is
> having some agreed definition such as the above so that we can treat
> deviations as bugs.
>

It would be nice if netplan.io could implement that, either
synthetically (by generating a service unit on the fly that calls
systemd-networkd-wait-online with extra arguments specifying all the
non-optional interfaces), or by creating a new binary,
"netplan-wait-online", which would be wanted by network-online.target
and perform all of the above checks.
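
As a rough sketch of the synthetic option, the generated unit could
take the form of a drop-in like the one below. The drop-in path and the
interface names eth0 and eth1 are placeholders chosen for illustration;
--interface and --timeout are existing systemd-networkd-wait-online
options.

```ini
# Illustrative path only:
# /run/systemd/system/systemd-networkd-wait-online.service.d/netplan-wait.conf
[Service]
# An empty ExecStart= clears the command from the packaged unit.
ExecStart=
# Wait only for the interfaces that are not marked optional
# (eth0 and eth1 are hypothetical names).
ExecStart=/lib/systemd/systemd-networkd-wait-online --interface=eth0 --interface=eth1 --timeout=30
```

This only covers the "non-optional interfaces are up" part of the
definition above; the default route and DNS conditions would still need
separate checks.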

> > It seems unlikely that, whatever we decide, we'll immediately do a full
> > sweep of the archive and fix everything, so it looks like our choice is
> > between:
>
> > 1. The long-term goal is to have no After=network-online.target dependencies
> > in default boot (stretch goal: in main). Whenever we run into a
> > package-fails-if-network-is-not-yet-up bug, we patch the code and submit
> > upstream. Over time we audit existing users of After=network-online.target
> > and patch them for dynamic networking, as time permits.
>
> > 2. We don't expect to be able to reach no After=network-online.target
> > dependencies in the default boot, so it's not a priority to avoid them.
> > Whenever we run into a package-fails-if-network-is-not-yet-up bug, we add an
> > After=network-online.target dependency.
>
> 3.  We expect to reach network-online.target in the common case, but accept
> that there are systems for which it will ordinarily not be reached on boot
> (i.e. offline systems).  Services which depend on network-online.target
> should be those which it is reasonable to not start if the system is not
> connected to the Internet.  This includes systems that are connected to a
> local network, but have no default route.
>

So from my point of view, a short-term fix such as adding
After=network-online.target, or even

[Unit]
After=systemd-resolved.service
[Service]
ExecStartPre=-/lib/systemd/systemd-networkd-wait-online --any --timeout 30

is fine to be SRUed.

However, I still have the same question: what if network connectivity
drops and is re-established? Should we bounce (i.e. restart)
network-online.target? We can declare that units are to be restarted
when network-online.target is restarted, if they are otherwise unable
to detect network loss and resumption dynamically.
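
A minimal sketch of such a declaration, assuming a hypothetical
example.service and assuming that something actually restarts
network-online.target when connectivity returns (nothing does so
by default, as far as I know):

```ini
# Illustrative path only:
# /etc/systemd/system/example.service.d/follow-network-online.conf
[Unit]
# Pull in the target and order this unit after it.
Wants=network-online.target
After=network-online.target
# PartOf= propagates a stop or restart of the target to this unit.
PartOf=network-online.target
```

With this in place, "systemctl restart network-online.target" would
also restart example.service.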

>
> If we use this as the standard, it's easy to see that *in principle*
> nfs-utils shouldn't depend on there being a route to the global Internet.
> It does, however, at least give us a framework for understanding the
> behavior, and for users to modify the behavior if they have different
> requirements.
>
>
> None of this makes it any safer for an SRU, since at the end of the day if
> users have such a config that is impacted if you set
> After=network-online.target for nfs-utils, it would still be a regression.
>
> --
> Steve Langasek                   Give me a lever long enough and a Free OS
> Debian Developer                   to set it on, and I can move the world.
> Ubuntu Developer                                   https://www.debian.org/
> slangasek at ubuntu.com                                     vorlon at debian.org



-- 
Regards,

Dimitri.


