[Bug 1535898] Re: Trusty & Vivid multipath-tools (multipathd) seg-fault core dump

Nish Aravamudan nish.aravamudan at canonical.com
Wed Jun 7 20:28:06 UTC 2017


Hello, Precise is EOL and we are no longer providing bug fixes for it. It
would appear this particular issue is fixed in Trusty (the only current
release in which it is present). In Bug 1629644, it was determined that
this version did not regress Trusty (a different upload did), and that
bug has since expired due to inactivity, unfortunately. I am
unsubscribing the server team and marking the Precise task as "Won't
Fix". Thank you for your contributions to Ubuntu!

** Changed in: multipath-tools (Ubuntu Precise)
       Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1535898

Title:
  Trusty & Vivid multipath-tools (multipathd) seg-fault core dump

Status in multipath-tools package in Ubuntu:
  Incomplete
Status in multipath-tools source package in Precise:
  Won't Fix
Status in multipath-tools source package in Trusty:
  Fix Released

Bug description:
  [SRU justification]
  Without this patch, multipathd may crash with SIGSEGV when trying to add a map that already exists.

  [Impact]
  multipathd crashes with SIGSEGV
  A typical sign of this situation is a message similar to this one in /var/log/syslog:

  multipathd: 360060160164034004cd59cfdb22ce611: failed in domap for
  addition of new path sdr
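
  When scanning /var/log/syslog for this symptom, a small matcher can pull
  out the map WWID and the path name. The following is only a sketch:
  parse_domap_failure() is a hypothetical helper, not part of
  multipath-tools, and it assumes the message is on a single line in the
  form quoted above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of multipath-tools.
 * Returns 1 and fills wwid/path if the line carries the
 * "failed in domap" symptom, 0 otherwise. */
int parse_domap_failure(const char *line, char *wwid, size_t wwid_len,
                        char *path, size_t path_len)
{
    static const char msg[] = ": failed in domap for addition of new path ";
    const char *mid, *start, *p;
    size_t n, m;

    if (line == NULL || strstr(line, "multipathd") == NULL)
        return 0;
    mid = strstr(line, msg);
    if (mid == NULL)
        return 0;

    /* The WWID is the token directly before the message text. */
    start = mid;
    while (start > line && start[-1] != ' ')
        start--;
    n = (size_t)(mid - start);
    if (n == 0 || n >= wwid_len)
        return 0;
    memcpy(wwid, start, n);
    wwid[n] = '\0';

    /* The path name is the token directly after the message text. */
    p = mid + strlen(msg);
    m = strcspn(p, " \t\r\n");
    if (m == 0 || m >= path_len)
        return 0;
    memcpy(path, p, m);
    path[m] = '\0';
    return 1;
}
```

  Feeding it each syslog line and counting the matches per WWID gives a
  rough picture of which maps are affected.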

  [Fix]
  Check if the map already exists and do a RELOAD in domap() instead of failing.
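
  The idea of the fix can be sketched as follows. This is not the actual
  patch: map_table, map_present() and choose_action() are hypothetical
  stand-ins used only to show the "reload instead of fail" decision, and
  the table of known maps is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical, not multipath-tools symbols. */
enum map_action { MAP_CREATE, MAP_RELOAD };

struct map_table {
    const char **names;  /* names of maps that already exist */
    size_t count;
};

/* Returns 1 if a map with this name already exists, 0 otherwise. */
int map_present(const struct map_table *t, const char *name)
{
    size_t i;

    assert(t != NULL && name != NULL);
    for (i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return 1;
    return 0;
}

/* Pick a reload for an existing map instead of failing on create. */
enum map_action choose_action(const struct map_table *t, const char *name)
{
    return map_present(t, name) ? MAP_RELOAD : MAP_CREATE;
}
```

  With this shape, adding a path to a map that is already present takes
  the reload branch rather than the error path.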

  [Test Case]
  The problem was encountered in a complex OpenStack test environment, where a test tool does the following:
  - First, it boots a number of virtual machines.
  - Then it creates a number of threads, and in each thread it creates volumes, takes snapshots of the volumes, and attaches the volumes to the initially booted virtual machines. After a short while the volumes are detached, and the snapshots and volumes are deleted.

  Running this tool overnight normally results in the multipathd SEGV
  situation.

  [Regression]
  This is a straight backport of the code being used in 0.5.0. No regression is to be expected.

  It is important to note that the reproducer in the original
  description did not lead to such a problem.

  [Original description of the problem]

  We have a problem with multipath-tools.

  Usually after a path removal and a re-scan, the multipathd process
  dies.

  I created 2 hosts:

  iscsi-server
  iscsi-client

  There are 4 NICs between them and a simple multibus multipath
  configuration. With that setup I was able to confirm that there is a
  regression in multipath-tools.

  It looks like it was introduced by patches brought from upstream:

  0017-multipath-get-right-sysfs-value-for-checker_timeout.patch
  0018-multipath-handle-offlined-paths.patch
  #
  # from here
  #
  0019-multipath-fix-scsi-timeout-code.patch
  0020-multipath-make-tgt_node_name-work-for-iscsi-devices.patch
  0021-multipath-cleanup-dev_loss_tmo-issues.patch
  0022-Fix-for-setting-0-to-fast_io_fail.patch
  0023-Fix-fast_io_fail-capping.patch
  0024-multipath-enable-getting-uevents-through-libudev.patch
  0025-Use-devpath-as-argument-for-sysfs-functions.patch
  0026-multipathd-remove-references-to-sysfs_device.patch
  0027-multipathd-use-struct-path-as-argument-for-event-pro.patch
  0028-Add-global-udev-reference-pointer-to-config.patch
  0029-Use-udev-enumeration-during-discovery.patch
  0030-use-struct-udev_device-during-discovery.patch
  0031-More-debugging-output-when-synchronizing-path-states.patch
  0032-Use-struct-udev_device-instead-of-sysdev.patch
  0033-discovery-Fixup-cciss-discovery.patch
  0035-Use-udev-devices-during-discovery.patch
  0036-Remove-all-references-to-hand-craftes-sysfs-code.patch
  #
  # to here
  #
  # 0037-multipath-libudev-cleanup-and-bugfixes.patch
  # 0038-multipath-check-if-a-device-belongs-to-multipath.patch
  # 0039-multipath-and-wwids_file-multipath.conf-option.patch
  # 0040-multipath-Check-blacklists-as-soon-as-possible.patch
  # 0041-add-wwids-file-cleanup-options.patch
  # 0042-add-find_multipaths-option.patch
  # 0043-alloc-keywords.patch
  # lp1503305_libmultipath_info_on_1st_path_down_dbd131e.patch

  The patches in the range 19-36 (between the markers above) caused a
  regression.

  Whenever I build the package (for Trusty) with those patches included,
  I am able to produce a core dump indicating a possible double free or
  NULL dereference related to a path removal (that is why I can
  reproduce it with the test case). Unfortunately, it usually crashes
  inside malloc() or somewhere else in glibc.

  Using valgrind I was able to verify some free() errors:

  ==30415== Invalid free() / delete / delete[] / realloc()
  ==30415==    at 0x4C2BDEC: free (vg_replace_malloc.c:473)
  ==30415==    by 0x54E243C: vector_del_slot (vector.c:95)
  ==30415==    by 0x550A516: _remove_map (structs_vec.c:139)
  ==30415==    by 0x550A5C3: _remove_maps (structs_vec.c:170)
  ==30415==    by 0x550A64B: remove_maps (structs_vec.c:181)
  ==30415==    by 0x40713F: configure (main.c:1153)
  ==30415==    by 0x407A74: child (main.c:1419)
  ==30415==    by 0x40837D: main (main.c:1618)

  These match exactly a core dump (multipathd) I got from another user,
  where the invalid free also came from _remove_map.
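
  For readers unfamiliar with this class of error, here is a generic
  illustration of how a slot-deleting vector can avoid freeing its array
  twice. It is not libmultipath's vector.c and makes no claim about the
  actual root cause. ptr_vec, ptr_vec_push() and ptr_vec_del_slot() are
  hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Generic pointer vector for illustration only. */
struct ptr_vec {
    void **slot;
    int allocated;
};

/* Append an item. Returns 0 on success, -1 on allocation failure. */
int ptr_vec_push(struct ptr_vec *v, void *item)
{
    void **grown = realloc(v->slot,
                           sizeof(void *) * (size_t)(v->allocated + 1));
    if (grown == NULL)
        return -1;
    v->slot = grown;
    v->slot[v->allocated++] = item;
    return 0;
}

/* Remove slot i. Returns 0 on success, -1 on a bad index or empty vector.
 * When the last slot is removed, the array is freed and the pointer is
 * reset to NULL, so a later call cannot free it a second time. */
int ptr_vec_del_slot(struct ptr_vec *v, int i)
{
    if (v == NULL || v->slot == NULL || i < 0 || i >= v->allocated)
        return -1;
    memmove(&v->slot[i], &v->slot[i + 1],
            sizeof(void *) * (size_t)(v->allocated - i - 1));
    v->allocated--;
    if (v->allocated == 0) {
        free(v->slot);
        v->slot = NULL;
    }
    return 0;
}
```

  Without the NULL reset and the guard at the top, a second removal on an
  emptied vector would hand an already freed pointer to free(), which is
  the kind of error valgrind reports as an invalid free.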

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1535898/+subscriptions


