[Bug 1670811] Re: Multipath services fails to start on Ubuntu 17.04 on boot and kdump (initramfs)

Mauricio Faria de Oliveira mauricfo at linux.vnet.ibm.com
Thu Mar 30 16:25:37 UTC 2017


> 4. a few mods to your patch (see below)

Thanks for the careful review :)

> In the patch I think it overall makes sense to harden/improve the
> shutdown - even not being able to reproduce yet, here some patch
> feedback:

> +pid="$(pidof multipathd)"

> I think this should get some safety if this does not return a pid.
> There might be reasons to do so, and these should skip the whole shutdown part.

If I understood it correctly, it already does.

The whole shutdown part is skipped if 'multipathd -kshutdown' fails
and the PID is empty (as kill_stage then does not progress into 1, 2, 3):

+if [ "$out" = 'ok' ] \
+|| ( [ -n "$pid" ] && /bin/kill -SIGINT  $pid ) \
+|| ( [ -n "$pid" ] && /bin/kill -SIGKILL $pid ); then
+	kill_stage=1
+fi

Maybe I'm missing something in your point?
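
To illustrate the short-circuit, here is a minimal sketch of the same condition that can be run without a real multipathd. The names try_kill and decide_kill_stage are hypothetical, and try_kill is only a stand-in for /bin/kill that always succeeds; it is not part of the patch.

```shell
#!/bin/sh
# Hypothetical stand-in for /bin/kill: prints what it would do and succeeds.
try_kill() { echo "kill $1 $2"; return 0; }

# Mirrors the structure of the condition in the patch excerpt above.
# $1: output of the shutdown command, $2: PID (may be empty).
decide_kill_stage() {
    out="$1"; pid="$2"; kill_stage=0
    if [ "$out" = 'ok' ] \
    || ( [ -n "$pid" ] && try_kill -INT  "$pid" ) \
    || ( [ -n "$pid" ] && try_kill -KILL "$pid" ); then
        kill_stage=1
    fi
    echo "kill_stage=$kill_stage"
}

decide_kill_stage ''   ''     # shutdown failed, no PID: prints kill_stage=0
decide_kill_stage 'ok' ''     # shutdown accepted:       prints kill_stage=1
decide_kill_stage ''   '259'  # falls back to a signal:  prints kill -INT 259, then kill_stage=1
```

With an empty PID, both signal branches fail on the `[ -n "$pid" ]` test before any kill is attempted, so the stage is only entered when the shutdown command itself reported 'ok'.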

> Also there might be some theoretical cases where the
> +out="$(/sbin/multipathd -k'shutdown')"
> As a side effect could make the pid change, so please re-arrange the pidof and the shutdown.

I'm afraid I don't see that theoretical case happening.
Can you please elaborate a bit more on that, for my own education?

As far as I can see, the -k switch just connects to the multipathd socket
(which should be served by the daemonized multipathd instance),
sends the argument/command to it, and exits.

multipathd/main.c::main()

                case 'k':
                        conf = load_config(DEFAULT_CONFIGFILE);
                        if (!conf)
                                exit(1);
                        if (verbosity)
                                conf->verbosity = verbosity;
                        uxclnt(optarg, uxsock_timeout + 100);
                        exit(0);


And the uxclnt() call tries to connect to the socket (i.e., it implies
multipathd is already running), and if that fails (e.g., multipathd is not
running), it just exits; it doesn't try to restart it or change the PID.

So I'm not sure how that could happen.


> BTW - Is the shutdown synchronous?

No, it's async. Any command from 'multipathd -k' (cli instance) 
is sent to multipathd (daemonized instance) via unix socket, 
and the cli instance immediately exits. 
Then the daemonized instance handles it there.

The shutdown, for example, gets 'running_state' set to 
DAEMON_SHUTDOWN, and that is broadcast to other pthreads.

The one which is running child() will take it and proceed
into free/unload/shutdown.
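
Because the request is asynchronous, a caller that needs the daemon to be gone has to wait for it separately. A minimal sketch of such a wait, polling the PID once per second, might look like the following. The function name wait_for_exit and its arguments are hypothetical and not taken from the patch.

```shell
#!/bin/sh
# wait_for_exit PID TRIES
# Polls once per second, up to TRIES times.
# Returns 0 as soon as PID no longer exists, 1 if it is still there afterwards.
# Note: 'kill -0' also fails for processes owned by another user, so this
# sketch assumes it runs as root (as an initramfs script would).
wait_for_exit() {
    pid="$1"; tries="$2"
    while [ "$tries" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}

# Example use (assumes $pid was captured before requesting the shutdown):
#   wait_for_exit "$pid" 10 || echo "multipathd still running"
```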

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1670811

Title:
  Multipath services fails to start on Ubuntu 17.04 on boot and kdump
  (initramfs)

Status in multipath-tools package in Ubuntu:
  Fix Released
Status in multipath-tools source package in Zesty:
  Fix Released

Bug description:
  ---Problem Description---
  Multipath services fails to start on Ubuntu 17.04 with SAN multipath devices.
                                                                                     
  root@ltciofvtr-s824-lp8:~# service multipath-tools status
  * multipathd.service - Device-Mapper Multipath Device Controller
     Loaded: loaded (/lib/systemd/system/multipathd.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Tue 2017-03-07 07:00:43 CST; 5min ago
   Main PID: 690 (code=exited, status=1/FAILURE)

  Mar 07 07:00:43 ltciofvtr-s824-lp8 systemd[1]: Starting Device-Mapper Multipath Device Controller...
  Mar 07 07:00:43 ltciofvtr-s824-lp8 multipathd[690]: process is already running
  Mar 07 07:00:43 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Main process exited, code=exited, status=1/FAILURE
  Mar 07 07:00:43 ltciofvtr-s824-lp8 systemd[1]: Failed to start Device-Mapper Multipath Device Controller.
  Mar 07 07:00:43 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Unit entered failed state.
  Mar 07 07:00:43 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Failed with result 'exit-code'.
   
  ---uname output---
  Linux ltciofvtr-s824-lp8 4.10.0-8-generic #10-Ubuntu SMP Mon Feb 13 14:00:06 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
   
  Machine Type = IBM,8286-42A LPAR 
    
  ---Steps to Reproduce---
   # service multipath-tools status
   
  There are failures to start the multipathd.socket unit, which reports the socket address as busy [1].
  There are also failures to start the multipathd.service unit, which reports that the process is already running [2].

  Those correlate to a failure to stop multipathd in the initramfs-tools local-bottom hook,
  because the pid file location changed from /var/run to /run. This is due to a build-time evaluation in multipath-tools' Makefile.inc
  that apparently evaluates differently on 17.04 or has changed with the recent merge with Debian. [3, 4]

  So, let's make the stop/shutdown/kill more independent of the initramfs filesystem structure.
  Since we can assume multipathd is running, use its shutdown command.
  And if that fails, try SIGINT.
  And if that fails, use SIGKILL.

  A fallout of this is that multipathd sometimes takes a while to handle the non-KILL methods,
  and when multipathd.socket was started very quickly afterward, it failed because the
  unix socket of the initramfs multipathd was still open.

  So, wait a little for it to close. It almost always happens immediately, but keep a 10-sec retry/timeout handler there
  to make the shutdown a bit more certain. The impact of not shutting down correctly is that multipathd is not started
  by the rootfs/systemd units. That matters because /etc/fstab might point to (and wait for) mpath devices, and if those
  are not available, the local-fs unit fails and puts the system into the rescue/emergency shell.
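
The retry/timeout idea above can be sketched roughly as follows. This is only an illustration under assumptions: the function name wait_socket_closed is hypothetical, and detecting the abstract socket by looking for its name in /proc/net/unix is an assumed method, not necessarily what the attached patch does. The table file is a parameter so the sketch can be tried against a sample file.

```shell
#!/bin/sh
# wait_socket_closed NAME [TRIES] [TABLE]
# Polls once per second, up to TRIES times (default 10), until NAME no longer
# appears in TABLE (default /proc/net/unix, an assumption of this sketch).
# Returns 0 when the name is gone, 1 on timeout.
# The match is a plain substring match, so NAME should be specific.
wait_socket_closed() {
    name="$1"; tries="${2:-10}"; table="${3:-/proc/net/unix}"
    while [ "$tries" -gt 0 ]; do
        grep -q -F -- "$name" "$table" 2>/dev/null || return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}

# Example use, with the socket name shown in the 'Listen:' line of the unit status:
#   wait_socket_closed '@/org/kernel/linux/storage/multipathd' 10 \
#       || echo "multipathd socket still open"
```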

  
  1) multipathd.socket

  [   36.465332] systemd[1]: Failed to listen on multipathd control socket.
  [FAILED] Failed to listen on multipathd control socket.
  See 'systemctl status multipathd.socket' for details.

  # systemctl status multipathd.socket --no-pager -l
  * multipathd.socket - multipathd control socket
     Loaded: loaded (/lib/systemd/system/multipathd.socket; static; vendor preset: enabled)
     Active: failed (Result: resources)
     Listen: @/org/kernel/linux/storage/multipathd (Stream)

  # journalctl -b -x
  ...
  Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: multipathd.socket: Failed to listen on sockets: Address already in use
  Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: Failed to listen on multipathd control socket.
  Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: multipathd.socket: Unit entered failed state.
  ...

  2) multipathd.service

  [FAILED] Failed to start Device-Mapper Multipath Device Controller.
  See 'systemctl status multipathd.service' for details.

  Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: Starting Device-Mapper Multipath Device Controller...
  Mar 07 11:02:04 ltciofvtr-s824-lp8 multipathd[861]: process is already running
  Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Main process exited, code=exited, status=1/FAILURE
  Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: Failed to start Device-Mapper Multipath Device Controller.
  Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Unit entered failed state.
  Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Failed with result 'exit-code'.

  [3] at the local-bottom hook, break=post-multipath:

  (initramfs) multipathd -k'show daemon'
  pid 259 idle

  (initramfs) ls /var/run/multipathd.pid
  ls: /var/run/multipathd.pid: No such file or directory

  (initramfs) ls /run/multipathd.pid
  /run/multipathd.pid

  (initramfs) exit
  cat: can't open '/var/run/multipathd.pid': No such file or directory

  [4] build-time evaluation in Makefile.inc:

  ifndef RUN
          ifeq ($(shell test -L /var/run -o ! -d /var/run && echo 1),1)
                  RUN=run
          else
                  RUN=var/run
          endif
  endif
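
For reference, the same test can be tried from a shell to see which value a given system would produce. This is a minimal sketch: the function name pick_run_dir is hypothetical, and the path is a parameter only so that it can be checked against something other than /var/run.

```shell
#!/bin/sh
# pick_run_dir [PATH]
# Mirrors the Makefile.inc test: if PATH (default /var/run) is a symlink
# or is not a directory, the result is "run"; otherwise it is "var/run".
pick_run_dir() {
    varrun="${1:-/var/run}"
    if [ -L "$varrun" ] || [ ! -d "$varrun" ]; then
        echo run
    else
        echo var/run
    fi
}

pick_run_dir
```

Since the Makefile evaluates this on the build machine, the resulting pid file path is fixed at build time, regardless of how /var/run looks inside the initramfs.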

  
  Attaching the patch for multipath-tools on 17.04 that resolves this problem.

  With it applied, the multipathd.socket and .service units started
  successfully on both normal boot and kdump boot scenarios.

  Can you please consider it for Zesty?  Thanks

  # systemctl status multipathd.socket | head -n4
  * multipathd.socket - multipathd control socket
     Loaded: loaded (/lib/systemd/system/multipathd.socket; static; vendor preset: enabled)
     Active: active (running) since Tue 2017-03-07 12:37:54 CST; 30s ago
     Listen: @/org/kernel/linux/storage/multipathd (Stream)

  # systemctl status multipathd.service | head -n3
  * multipathd.service - Device-Mapper Multipath Device Controller
     Loaded: loaded (/lib/systemd/system/multipathd.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2017-03-07 12:37:55 CST; 42s ago

  @taco-screen-team

  Could you please assign this bug to @cyphermox or @paelzer?

  Thanks!



