Packaging policy discussion: After=network-online.target

Christopher James Halse Rogers raof at ubuntu.com
Fri May 14 00:07:39 UTC 2021



On Thu, May 13 2021 at 17:34:58 +0100, Dimitri John Ledkov 
<dimitri.ledkov at canonical.com> wrote:
> On Thu, May 13, 2021 at 4:12 PM Steve Langasek
> <steve.langasek at ubuntu.com> wrote:
>> 
>>  Hi there,
>> 
>>  On Wed, May 12, 2021 at 05:52:07PM +1000, Christopher James Halse 
>> Rogers wrote:
>>  > There's an nfs-utils SRU¹ hanging around waiting for a policy 
>> decision on
>>  > use of the After=network-online.target systemd unit dependency. 
>> I'm not an
>>  > expert here, but it looks like part of my SRU rotation today is 
>> starting the
>>  > discussion on this so we can resolve it one way or another!
>> 
>>  > I am not an expert in this area, but as I understand it, the 
>> tradeoff here
>>  > is:
>>  > 1. Without a dependency on After=network-online.target there is 
>> no guarantee
>>  > that the network interface(s) will be usable at the time the 
>> nfs-utils unit
>>  > triggers, and nfs-utils will fail if the relevant network 
>> interface is not
>>  > usable, or
>>  > 2. With a dependency on After=network-online.target nfs-utils 
>> will reliably
>>  > start, but if there are any interfaces which are configured but 
>> do not come
>>  > up this will result in the boot hanging until the timeout is hit.
>> 
>>  > In mitigation of (2), there are apparently a number of default 
>> packages
>>  > which already have a dependency on After=network-online.target, 
>> so boot
>>  > hanging if interfaces are down is the status quo?
>> 
>>  From one of the comments in the bug report, I gathered that systemd 
>> upstream
>>  (who, specifically?) was taking a position that distributions 
>> should not use
>>  After=network-online.target.  I think this is entirely unhelpful; 
>> the target
>>  exists for this purpose, it is not required for systemd internally 
>> to get
>>  the system up but exists only for other services to depend on.
>> 
>>  There are risks of services not starting on boot because the 
>> network-online
>>  target is not reached.  However, that is not the same thing as a 
>> "hung
>>  boot", because other services will still start on their own, and 
>> things like
>>  gdm and tty don't depend on network-online.target, *unless* you're 
>> in a
>>  situation where you've introduced a dependency between the 
>> filesystem and
>>  network-online.  This is possible when we're talking about nfs, 
>> because the
>>  same system might both export nfs filesystems and mount them from 
>> localhost.
>>  But I'm not sure it should block this specific change.
>> 
>>  > The obvious thing to do here would be to follow Debian, but as 
>> far as I can
>>  > tell there is not currently a Debian policy about this - the best 
>> I can find
>>  > is an ancient draft of a best-practises-guide² suggesting that 
>> packages
>>  > SHOULD handle networking dynamically, but if they do not MUST 
>> have a
>>  > dependency on After=network-online.target
>> 
>>  > As far as I understand it, handling networking dynamically requires 
>> upstream
>>  > code changes (although maybe fairly simple code changes?).
>> 
>>  It does require upstream code changes; not always simple.  And it's 
>> not
>>  always *correct* to make upstream code changes instead of simply 
>> starting
>>  the service when the system is "online"; you can find a number of 
>> examples
>>  in Ubuntu of services that it only makes sense to start once your 
>> network is
>>  "up" - e.g. apt-daily.service, update-notifier, whoopsie, ...
>> 
>> 
>>  There are issues with the network-online target, to be sure.  There 
>> is not a
>>  clear definition of the target, and there have definitely been
>>  implementation bugs in what does/does not block the target.  I've 
>> had
>>  discussions with the Foundations Team in the past about this but it 
>> has yet
>>  to result in a specification.
>> 
>>  My working definition of what network-online.target SHOULD mean is:
>> 
>>   - at least one interface is up, with routes
>>   - all interfaces which are 'optional: no' (netplan sense) are up
>>     - including completion of ipv6 RA and ipv4 link-local if enabled 
>> on the
>>       interface
>>   - there is a default route for at least one configured address 
>> family
>>   - attempts to discover default routes for other configured address 
>> families
>>     have completed (success or fail)
>>   - DNS is configured
>> 
>>  Things that must not block the network-online target:
>>   - interfaces that are marked 'optional: yes'
>>   - address sources that are listed in 'optional-addresses' for an 
>> interface
>>   - default route for an address family for which no interfaces have
>>     addresses
>> 
>>  At least historically, neither networkd nor NetworkManager has 
>> fulfilled
>>  this definition.  It would be nice to get there, but the first step 
>> is
>>  having some agreed definition such as the above so that we can treat
>>  deviations as bugs.
>> 
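
For what it's worth, here is a rough sketch of how that working definition could be written down as a single predicate, so that deviations are easy to point at. Everything in it (the Interface and NetworkState classes, their field names, is_online) is made up for illustration; it is not an existing netplan, networkd or NetworkManager interface. It also assumes that address sources listed in 'optional-addresses' have already been left out of the addressing_done flag.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Interface:
    """Hypothetical snapshot of one configured interface."""
    name: str
    optional: bool = False         # netplan 'optional: yes'
    up: bool = False               # link is up and has routes
    addressing_done: bool = True   # IPv6 RA / IPv4 link-local finished, if enabled


@dataclass
class NetworkState:
    """Hypothetical snapshot of the whole system's network state."""
    interfaces: List[Interface] = field(default_factory=list)
    configured_families: Set[str] = field(default_factory=set)      # e.g. {"ipv4", "ipv6"}
    default_route_families: Set[str] = field(default_factory=set)   # families with a default route
    route_discovery_pending: Set[str] = field(default_factory=set)  # families still being probed
    dns_configured: bool = False


def is_online(state: NetworkState) -> bool:
    """Return True if the state satisfies the working definition quoted above."""
    # At least one interface is up, with routes.
    if not any(iface.up for iface in state.interfaces):
        return False
    # Every non-optional interface is up and has finished addressing;
    # optional interfaces never block.
    for iface in state.interfaces:
        if iface.optional:
            continue
        if not (iface.up and iface.addressing_done):
            return False
    # A default route exists for at least one configured address family.
    if not (state.default_route_families & state.configured_families):
        return False
    # Route discovery has finished (success or fail) for configured families;
    # families with no configured addresses are ignored.
    if state.route_discovery_pending & state.configured_families:
        return False
    # DNS is configured.
    return state.dns_configured
```

With something like this agreed on, a report of "network-online was reached too early" or "too late" could be checked against the individual conditions rather than argued case by case.
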
> 
> If netplan.io could implement that, it would be nice: either
> synthetically (i.e. by generating a service unit on the fly that calls
> systemd-networkd-wait-online with extra arguments specifying all the
> non-optional interfaces), or by creating a new binary,
> "netplan-wait-online", which would be wanted by network-online.target
> and perform all of the above.
> 
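
To make the first option concrete, here is a minimal sketch of deriving the systemd-networkd-wait-online arguments from a netplan-style mapping. The wait_online_argv helper and the shape of the input dict (interface name mapped to its parsed settings) are assumptions of mine, not netplan code; --interface and --timeout are existing options of systemd-networkd-wait-online.

```python
from typing import Dict, List

# Path as it appears in the unit snippet elsewhere in this thread.
WAIT_ONLINE = "/lib/systemd/systemd-networkd-wait-online"


def wait_online_argv(devices: Dict[str, dict], timeout: int = 30) -> List[str]:
    """Build a wait-online command line covering only non-optional interfaces.

    `devices` is assumed to look like a parsed netplan device mapping,
    e.g. {"eth0": {"dhcp4": True}, "wlan0": {"optional": True}}.
    """
    # An interface is required unless it is explicitly marked optional.
    required = sorted(
        name for name, cfg in devices.items() if not cfg.get("optional", False)
    )
    argv = [WAIT_ONLINE, f"--timeout={timeout}"]
    # One --interface option per required interface.
    argv += [f"--interface={name}" for name in required]
    return argv
```

A generator could write the resulting list into the ExecStart= line of the synthesized unit. If no interface is required, the list carries no --interface option at all, and as far as I know wait-online then falls back to its default of waiting on every link networkd manages, so a real implementation would want to treat that case separately.
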
>>  > It seems unlikely that, whatever we decide, we'll immediately do 
>> a full
>>  > sweep of the archive and fix everything, so it looks like our 
>> choice is
>>  > between:
>> 
>>  > 1. The long-term goal is to have no After=network-online.target 
>> dependencies
>>  > in default boot (stretch goal: in main). Whenever we run into a
>>  > package-fails-if-network-is-not-yet-up bug, we patch the code and 
>> submit
>>  > upstream. Over time we audit existing users of 
>> After=network-online.target
>>  > and patch them for dynamic networking, as time permits.
>> 
>>  > 2. We don't expect to be able to reach no 
>> After=network-online.target
>>  > dependencies in the default boot, so it's not a priority to avoid 
>> them.
>>  > Whenever we run into a package-fails-if-network-is-not-yet-up 
>> bug, we add an
>>  > After=network-online.target dependency.
>> 
>>  3.  We expect to reach network-online.target in the common case, 
>> but accept
>>  that there are systems for which it will ordinarily not be reached 
>> on boot
>>  (i.e. offline systems).  Services which depend on 
>> network-online.target
>>  should be those which it is reasonable to not start if the system 
>> is not
>>  connected to the Internet.  This includes systems that are 
>> connected to a
>>  local network, but have no default route.
>> 
> 
> So from my point of view, a short-term fix such as adding
> After=network-online.target, or even
> 
> [Unit]
> After=systemd-resolved.service
> [Service]
> ExecStartPre=-/lib/systemd/systemd-networkd-wait-online --any --timeout 30
> 
> is fine to be SRUed.
> 
> However, I still have the same question: what if network connectivity
> drops and gets re-established? Should we bounce network-online.target
> (i.e. restart it)? We can declare that units should be restarted when
> network-online.target is restarted, if they are otherwise incapable of
> dynamically detecting network loss and resumption.

Hah! I've actually received an off-list reply relevant to this. They 
found network-online.target to be unreliable for nfs & xdmcp. 
Apparently, because of the spanning tree search done by their network 
switches, the interface would be briefly available, activating the 
relevant systemd unit dependencies, and then not work for about 25 seconds.
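
One way a service (or a pre-start helper) could cope with that kind of flapping is to probe the thing it actually needs and only proceed after several consecutive successes. This is just a sketch under my own assumptions: tcp_probe and wait_until_stable are hypothetical helper names, and the thresholds are arbitrary rather than tuned for any real switch.

```python
import socket
import time
from typing import Callable


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until_stable(
    probe: Callable[[], bool],
    needed: int = 5,
    interval: float = 1.0,
    deadline: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Wait until `probe` succeeds `needed` times in a row.

    Any failure resets the streak, so a link that works briefly and then
    goes quiet does not count as online. Returns False if `deadline`
    seconds pass first.
    """
    streak = 0
    end = time.monotonic() + deadline
    while time.monotonic() < end:
        if probe():
            streak += 1
            if streak >= needed:
                return True
        else:
            streak = 0
        sleep(interval)
    return False
```

For an NFS client this might be called as wait_until_stable(lambda: tcp_probe("nfs.example.com", 2049)), where the hostname is a placeholder and 2049 is the standard NFS port. It does not answer the policy question, but it shows why "the interface is up" and "the server is reachable" can be different conditions.
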

> 
>> 
>>  If we use this as the standard, it's easy to see that *in principle*
>>  nfs-utils shouldn't depend on there being a route to the global 
>> Internet.
>>  It does, however, at least give us a framework for understanding the
>>  behavior, and for users to modify the behavior if they have 
>> different
>>  requirements.
>> 
>> 
>>  None of this makes it any safer for an SRU, since at the end of the 
>> day if
>>  users have such a config that is impacted if you set
>>  After=network-online.target for nfs-utils, it would still be a 
>> regression.
>> 
>>  --
>>  Steve Langasek                   Give me a lever long enough and a 
>> Free OS
>>  Debian Developer                   to set it on, and I can move the 
>> world.
>>  Ubuntu Developer                                   
>> https://www.debian.org/
>>  slangasek at ubuntu.com                                     
>> vorlon at debian.org
>>  --
>>  ubuntu-devel mailing list
>>  ubuntu-devel at lists.ubuntu.com
>>  Modify settings or unsubscribe at: 
>> https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel
> 
> 
> 
> --
> Regards,
> 
> Dimitri.
> 
