[Bug 1535898] Re: Trusty & Vivid multipath-tools (multipathd) seg-fault core dump
Mathieu Trudel-Lapierre
mathieu.tl at gmail.com
Thu Jan 21 13:33:27 UTC 2016
After some more testing with Louis, we were able to reproduce a crash
in which freeing mpp->alias fails. The backtrace isn't quite the same
as the one I pasted earlier, but it looks similar to the issue reported
by valgrind (how closely they match depends largely on the presence of
the debug symbols).
Further testing is still pending, but I've prepared the attached
debdiff, which should address the state of mpp->alias.
** Patch added: "deal with mpp->alias being allocated wrong"
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1535898/+attachment/4554077/+files/multipath-tools_alias_free.debdiff
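For illustration only (this is not the content of the attached debdiff):
a minimal sketch of the ownership rule that keeps free() on an alias
valid, assuming the alias field must always be either NULL or a private
heap copy. The names demo_map and demo_map_set_alias are hypothetical
and do not exist in multipath-tools.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a map; NOT the real struct multipath. */
struct demo_map {
    char *alias; /* invariant: NULL or a pointer returned by malloc() */
};

/*
 * Replace the alias with a private heap copy of name (or NULL).
 * Because the map never stores a borrowed or static pointer, a later
 * free(m->alias) is always valid.
 * Returns 0 on success, -1 on allocation failure (alias unchanged).
 */
static int demo_map_set_alias(struct demo_map *m, const char *name)
{
    char *copy = NULL;

    assert(m != NULL);
    if (name != NULL) {
        size_t len = strlen(name) + 1; /* include the terminating NUL */

        copy = malloc(len);
        if (copy == NULL)
            return -1;
        memcpy(copy, name, len);
    }
    free(m->alias); /* free(NULL) is a no-op */
    m->alias = copy;
    return 0;
}
```

The point of the sketch is that every assignment goes through one
function, so no code path can leave a non-heap pointer in the field.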
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1535898
Title:
Trusty & Vivid multipath-tools (multipathd) seg-fault core dump
Status in multipath-tools package in Ubuntu:
Incomplete
Bug description:
We have a problem with multipath-tools: the multipathd process
usually dies after a path removal and a re-scan.
I created 2 hosts:
iscsi-server
iscsi-client
They are connected through 4 NICs and use a simple multibus multipath
configuration. With that setup I was able to confirm that there is a
regression in multipath-tools.
It looks like it comes from some of the patches brought from upstream:
0017-multipath-get-right-sysfs-value-for-checker_timeout.patch
0018-multipath-handle-offlined-paths.patch
#
# from here
#
0019-multipath-fix-scsi-timeout-code.patch
0020-multipath-make-tgt_node_name-work-for-iscsi-devices.patch
0021-multipath-cleanup-dev_loss_tmo-issues.patch
0022-Fix-for-setting-0-to-fast_io_fail.patch
0023-Fix-fast_io_fail-capping.patch
0024-multipath-enable-getting-uevents-through-libudev.patch
0025-Use-devpath-as-argument-for-sysfs-functions.patch
0026-multipathd-remove-references-to-sysfs_device.patch
0027-multipathd-use-struct-path-as-argument-for-event-pro.patch
0028-Add-global-udev-reference-pointer-to-config.patch
0029-Use-udev-enumeration-during-discovery.patch
0030-use-struct-udev_device-during-discovery.patch
0031-More-debugging-output-when-synchronizing-path-states.patch
0032-Use-struct-udev_device-instead-of-sysdev.patch
0033-discovery-Fixup-cciss-discovery.patch
0035-Use-udev-devices-during-discovery.patch
0036-Remove-all-references-to-hand-craftes-sysfs-code.patch
#
# to here
#
# 0037-multipath-libudev-cleanup-and-bugfixes.patch
# 0038-multipath-check-if-a-device-belongs-to-multipath.patch
# 0039-multipath-and-wwids_file-multipath.conf-option.patch
# 0040-multipath-Check-blacklists-as-soon-as-possible.patch
# 0041-add-wwids-file-cleanup-options.patch
# 0042-add-find_multipaths-option.patch
# 0043-alloc-keywords.patch
# lp1503305_libmultipath_info_on_1st_path_down_dbd131e.patch
Something in the range 19-36 caused the regression.
Whenever I build the package (for trusty) with those patches included,
I can produce a core dump that indicates a possible double free or
NULL dereference related to a path removal (which is why I can
reproduce it with the test case). Unfortunately, the crash usually
happens inside malloc() or somewhere else in glibc.
Using valgrind I was able to verify some free() errors:
==30415== Invalid free() / delete / delete[] / realloc()
==30415== at 0x4C2BDEC: free (vg_replace_malloc.c:473)
==30415== by 0x54E243C: vector_del_slot (vector.c:95)
==30415== by 0x550A516: _remove_map (structs_vec.c:139)
==30415== by 0x550A5C3: _remove_maps (structs_vec.c:170)
==30415== by 0x550A64B: remove_maps (structs_vec.c:181)
==30415== by 0x40713F: configure (main.c:1153)
==30415== by 0x407A74: child (main.c:1419)
==30415== by 0x40837D: main (main.c:1618)
These match exactly a multipathd core dump I got from another user
(there, too, the wrong free was coming from _remove_map).
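As a generic illustration of how a repeated release can be made
harmless (not the actual multipath-tools code, and not a claim about
what _remove_map does): a minimal sketch that frees a pointer and
resets it in one step. DEMO_FREE, demo_entry and demo_remove_entry are
hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Free a pointer and reset it, so a repeated release is free(NULL). */
#define DEMO_FREE(p) do { free(p); (p) = NULL; } while (0)

/* Hypothetical entry with one owned buffer; not a multipath-tools type. */
struct demo_entry {
    char *buf;
};

/* Release the entry's memory. Calling this twice on one entry is safe. */
static void demo_remove_entry(struct demo_entry *e)
{
    assert(e != NULL);
    DEMO_FREE(e->buf);
}
```

This only protects against freeing the same field twice through the
same struct; it does not help if two structs hold the same pointer.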
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1535898/+subscriptions