[Bug 1354114] Re: multipath segmentation Fault (libmultipath: update waiter handling)
Rafael David Tinoco
rafael.tinoco at canonical.com
Thu Aug 7 20:06:47 UTC 2014
It looks like the fix above introduces regressions (actually other new
bugs):
commit 96f81469ff993b6063bb8829d9b336590510466d
Author: Hannes Reinecke <hare at suse.de>
Date: Mon May 4 16:46:58 2009 +0200
libmultipath: update waiter handling
The current 'waiter' structure accesses fields which belong
to the main 'mpp' structure, which has a totally different
lifetime. With this patch most of these dependencies are
removed and the 'waiter' structure can run independently
of the main 'mpp' structure, reducing the risk of
use-after-free faults.
Signed-off-by: Hannes Reinecke <hare at suse.de>
It introduces the problem described in this later commit:
commit c301a3f09203edf91df5a9adf4e32ea2a7238cda
Author: Hannes Reinecke <hare at suse.de>
Date: Wed May 25 14:40:19 2011 +0200
Race condition when calling stop_waiter_thread()
We cannot access the waiter structure from other threads as
the lifetime is totally different and it might be deleted
at any time.
So we better store the pthread id in the calling thread and
just send a signal to the thread.
References: bnc#642846
Signed-off-by: Hannes Reinecke <hare at suse.de>
** Patch removed: "utopic_multipath-tools_0.4.9-3ubuntu9.debdiff"
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172185/+files/utopic_multipath-tools_0.4.9-3ubuntu9.debdiff
** Patch removed: "trusty_multipath-tools_0.4.9-3ubuntu8.debdiff"
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172184/+files/trusty_multipath-tools_0.4.9-3ubuntu8.debdiff
** Patch removed: "precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff"
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172183/+files/precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1354114
Title:
multipath segmentation Fault (libmultipath: update waiter handling)
Status in “multipath-tools” package in Ubuntu:
Confirmed
Bug description:
[Impact]
* Multipath can hit a segmentation fault due to incorrect code, which
can cause the user to lose access to multipath devices.
[Test Case]
* Working on it.
[Regression Potential]
* The fix is based on upstream code (96f8146), which already works in
upstream tag 0.5.0.
* It introduces a mutex, logic to deal with an already-dead pthread, and
a different way to access the same data (instead of accessing a
structure that has a different lifetime).
[Other Info]
* Original bug description:
----------------
The following situation with multipathd was brought to my (~inaddy)
attention:
#####
Program terminated with signal 6, Aborted.
#0 0x00007fbc6ae09445 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007fbc6ae09445 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fbc6ae0cbab in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fbc6ae0210e in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007fbc6ae021b2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00007fbc6b849efb in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#5 0x00007fbc6b1cba5f in free_waiter (data=0x1691de0) at waiter.c:44
#6 0x00007fbc6b1cc25a in waitevent (et=0x1691de0) at waiter.c:204
#7 0x00007fbc6b847e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#8 0x00007fbc6aec54bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#9 0x0000000000000000 in ?? ()
--------------------------------------------------------------------------------------------
#5 0x00007fbc6b1cba5f in free_waiter (data=0x1691de0) at waiter.c:44
44 lock(wp->vecs->lock);
(gdb) print wp->vecs->lock
$1 = {mutex = 0x168c280, depth = 1}
In pthread_mutex_lock.c:62 there's an assert that fails:
#4 0x00007fbc6b849efb in __pthread_mutex_lock (mutex=0xfefefefefefefeff) at pthread_mutex_lock.c:62
62 assert (mutex->__data.__owner == 0);
In this run:
(gdb) p *wp->vecs->lock->mutex
$3 = {__data = {__lock = 1, __count = 0, __owner = 49, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0xffffffff}},
__size = "\001\000\000\000\000\000\000\000\061", '\000' <repeats 23 times>"\377, \377\377\377\000\000\000", __align = 1}
so __owner is 49 and not 0.
Note that 49 is somewhat strange; it's expected to be a pid_t obtained via
pid_t id = THREAD_GETMEM (THREAD_SELF, tid);
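To relate the raw __size bytes in the dump to the __owner value, here is a small sketch that reads a 32-bit little-endian field out of a byte dump. It assumes the usual x86_64 glibc layout, in which __lock sits at byte offset 0 and __owner at byte offset 8; the helper name demo_le32_at is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit integer stored at byte offset 'off'
 * of a raw memory dump. The caller must ensure off + 3 is in range. */
int32_t demo_le32_at(const unsigned char *raw, size_t off)
{
    uint32_t v;

    assert(raw != NULL);
    v = (uint32_t)raw[off]
      | ((uint32_t)raw[off + 1] << 8)
      | ((uint32_t)raw[off + 2] << 16)
      | ((uint32_t)raw[off + 3] << 24);
    return (int32_t)v;
}
```

With the bytes shown in the dump (\001 at offset 0 and \061 at offset 8, the rest of the first 12 bytes zero), this yields 1 for __lock and 49 for __owner. Octal \061 is also the ASCII code of the character '1', which would be consistent with the memory having been reused for other data, although the dump alone does not prove that.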
According to https://bugzilla.redhat.com/show_bug.cgi?id=570278 , this
assert failure could be expected behaviour if, for some reason, the
multipath code was trying to use a mutex that has already been
freed.
The multipath-tools package is up to date (0.4.9-3ubuntu5)
I do not find anything obviously related in http://git.opensvc.com/gitweb.cgi?p=multipath-tools%2F.git except maybe
http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=commitdiff;h=5ee9f716549d913aeb9800041f78ee9c6a50d860
#####
In between Precise's version and Upstream there are the following
patches touching waiter.c:
d887f4a = signal waiter thread to stop waiting on dm events
5ee9f71 = simplify multipath signal handlers
af4fd6d = Fix race condition in stop_waiter_thread()
e1fcc59 = multipath: clean up code for stopping the waiter threads
03ec4ef = multipath: fix shutdown crashes
4dfdaf2 = multipath: Update multipath device on show topology
c301a3f = Race condition when calling stop_waiter_thread()
96f8146 = libmultipath: update waiter handling
This specific one: 96f8146 (libmultipath: update waiter handling)
"""
The current 'waiter' structure accesses fields which belong
to the main 'mpp' structure, which has a totally different
lifetime.
"""
This shows that, because the structures have different lifetimes,
use-after-free segfaults are possible (which is what seems to be
happening here).
waiter.c:44 = lock(wp->vecs->lock);
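One way to remove this kind of lifetime dependency is for the waiter to keep a private copy of the data it needs instead of a pointer into the longer-lived structure. The following is a minimal sketch under that assumption, not the upstream patch: demo_map, demo_event_waiter and demo_waiter_new are hypothetical names, and the sketch does not cover the wp->vecs pointer itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_NAME_LEN 64

/* Hypothetical stand-in for the main 'mpp' structure. */
struct demo_map {
    char alias[DEMO_NAME_LEN];
};

/* Hypothetical waiter: holds its own copy of the map name, so it does
 * not need the map to stay alive. */
struct demo_event_waiter {
    char mapname[DEMO_NAME_LEN];
};

struct demo_event_waiter *demo_waiter_new(const struct demo_map *map)
{
    struct demo_event_waiter *w;

    assert(map != NULL);
    w = calloc(1, sizeof(*w));
    if (!w)
        return NULL;
    /* calloc() zeroed the buffer, so copying at most LEN - 1 bytes
     * always leaves a terminating NUL. */
    strncpy(w->mapname, map->alias, DEMO_NAME_LEN - 1);
    return w;
}
```

After demo_waiter_new() returns, the map can be freed at any time and w->mapname remains valid until the waiter itself is freed.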
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+subscriptions