Packaging policy discussion: After=network-online.target

Thu May 13 12:37:14 UTC 2021

On Thu, May 13, 2021 at 12:59:39PM +0100, Robie Basak wrote:
> On Thu, May 13, 2021 at 08:53:28AM -0300, Thadeu Lima de Souza Cascardo wrote:
> > Take kdump-tools, for example. After a system crash, it will reboot into a mode
> > that collects a memory dump and reboots. Sometimes, it is configured to send
> > the dump through the network. If the network does not come up after some
> > timeout, it should just reboot. And systemd machinery is leveraged to
> > accomplish that. As long as that is what is happening and it's documented, I
> > don't see a reason to not use After=network-online.target.
> 
> What does "if the network does not come up" mean in your case? For
> example, what if multiple networks are defined, or, in the case of a
> laptop, no network is defined? And what if a network comes up, but there
> is nothing on the other end to receive the dump?
> 
> Surely the key thing here is that the dump is transmitted to something
> that can receive it, and if the dump isn't transmitted after a timeout,
> it should reboot? That's not necessarily the same as the network being
> "up" (whatever that might mean), or network-online.target having been
> reached. We wouldn't want to block on the "wrong" network being up,
> either, if the "right" network is already up and ready to receive the
> dump.

Good points. That leads back to the program at hand, kdump-tools, implementing
network monitoring on its own, which is not as simple as querying for the right
interface, because we do not care about interfaces.

kdump-tools already does some retries for the case where the other end is not
reachable for some reason.
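
To make the idea concrete, here is a rough sketch of a bounded retry loop
that waits for the receiving end itself rather than for
network-online.target. This is not kdump-tools' actual code: the function
name wait_for_peer, its parameters and their defaults are all made up for
illustration, and a plain TCP connect is only an assumption about what
"reachable" would mean for a given dump target.

```python
import socket
import time


def wait_for_peer(host, port, attempts=5, delay=2.0, connect_timeout=3.0,
                  connect=socket.create_connection, sleep=time.sleep):
    """Return True once a TCP connection to (host, port) succeeds.

    Tries up to `attempts` times, sleeping `delay` seconds between tries.
    `connect` and `sleep` are injectable so the loop can be tested offline.
    """
    for attempt in range(1, attempts + 1):
        try:
            conn = connect((host, port), timeout=connect_timeout)
        except OSError:
            # OSError covers refused connections, unreachable routes,
            # name resolution failures and connect timeouts.
            if attempt < attempts:
                sleep(delay)
            continue
        conn.close()
        return True
    # Caller gives up here (for kdump, that would mean: just reboot).
    return False
```

The worst-case wait is roughly attempts * connect_timeout +
(attempts - 1) * delay, so raising either the attempt count or the delay
directly raises the total timeout.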

Maybe what needs to be provided here to avoid reimplementations is a better
method to let services be more specific about what networks/addresses they care
about. That is, if we care about not having to reimplement this logic
everywhere.

Or I may be overthinking this, and we should just slightly increase the number
of retry attempts and the total timeout.

Cascardo.