[Bug 1354114] Re: multipath segmentation Fault (libmultipath: update waiter handling)
Rafael David Tinoco
rafael.tinoco at canonical.com
Thu Aug 7 20:07:57 UTC 2014
So probably to last commits touching libmultipath/waiter.c (and fixing
this issue) would be:
1)
commit e1fcc5933ac44683cdee1a02304e1115abec3ff5
Author: Benjamin Marzinski <bmarzins at redhat.com>
Date: Sat May 19 01:37:03 2012 -0500
multipath: clean up code for stopping the waiter threads
The way multipathd currently stops the waiter threads needs some work.
Right now they are stopped by being sent the SIGUSR1 signal. However their
cleanup code assumes that they are being cancelled, just like all the other
threads are. There's no reason for them to be so unnecessarily
complicated and different from the other threads
This patch does a couple of things. First, it removes the mutex from
the event_thread. This wasn't doing anything. It was designed to protect
the wp->mapname variable, which the waiter threads were checking to see
if they should quit. However, the mutex was only ever being used by the
thread itself, and it clearly didn't need to serialize with itself. Also,
the function to clear the mapname, signal_waiter(), was set with
pthread_cleanup_push(), which never got called early, since the threads
weren't being cancelled. Thus, the mapname never got cleared
until the pthreads were about to shut down.
The patch also rips out all the signal stopping code, and just uses
pthread_cancel. There already are cancellation points in the waiter
thread code. Between the cancellation points, both explicit and implicit,
and the fact that the waiter threads will never be killed except when the
killer is holding the vecs lock, there shouldn't be any place where the
waiter thread can access freed data.
To make sure the waiter thread cleans itself up properly, the dmt
has been moved into the event_thread structure, and is destroyed in
free_waiter() if necessary.
Signed-off-by: Benjamin Marzinski <bmarzins at redhat.com>
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1354114
Title:
multipath segmentation Fault (libmultipath: update waiter handling)
Status in “multipath-tools” package in Ubuntu:
Confirmed
Bug description:
[Impact]
* Multipath can cause segmentation fault due to wrong code and can
possibly cause user to loose access to multipath devices.
[Test Case]
* Working on it.
[Regression Potential]
* Fix based on upstream code (96f8146) Tag 0.5.0 already functioning.
* Introducing mutex, logic to deal with already dead pthread and other
way to access same data (instead of accessing other time lived
structure).
[Other Info]
* Original bug description:
----------------
It was brought to me (~inaddy) the following situation with
multipathd:
#####
Program terminated with signal 6, Aborted.
#0 0x00007fbc6ae09445 in raise () from /lib/x86_64linuxgnu/
libc.so.6
(gdb) bt
#0 0x00007fbc6ae09445 in raise () from /lib/x86_64linuxgnu/
libc.so.6
#1 0x00007fbc6ae0cbab in abort () from /lib/x86_64linuxgnu/
libc.so.6
#2 0x00007fbc6ae0210e in ?? () from /lib/x86_64linuxgnu/
libc.so.6
#3 0x00007fbc6ae021b2 in __assert_fail () from /lib/x86_64linuxgnu/
libc.so.6
#4 0x00007fbc6b849efb in pthread_mutex_lock () from /lib/x86_64linuxgnu/
libpthread.so.0
#5 0x00007fbc6b1cba5f in free_waiter (data=0x1691de0) at waiter.c:44
#6 0x00007fbc6b1cc25a in waitevent (et=0x1691de0) at waiter.c:204
#7 0x00007fbc6b847e9a in start_thread () from /lib/x86_64linuxgnu/
libpthread.so.0
#8 0x00007fbc6aec54bd in clone () from /lib/x86_64linuxgnu/
libc.so.6
#9 0x0000000000000000 in ?? ()
--------------------------------------------------------------------------------------------
#5 0x00007fbc6b1cba5f in free_waiter (data=0x1691de0) at waiter.c:44
44 lock(wp>
vecs>
lock);
(gdb) print wp>
vecs>
lock
$1 = {mutex = 0x168c280, depth = 1}
In pthread_mutex_lock.c:62 there's an assert that fails:
#4 0x00007fbc6b849efb in __pthread_mutex_lock (mutex=0xfefefefefefefeff) at pthread_mutex_lock.c:62
62 assert (mutex>_
data._owner == 0);
In this run:
(gdb) p *wp>
vecs>
lock>
mutex
$3 = {_data = {lock = 1, __count = 0, __owner = 49, __nusers = 0, __kind = 0, __spins = 0, __list = {_prev = 0x0, __next = 0xffffffff}},
__size = "\001\000\000\000\000\000\000\000\061", '\000' <repeats 23 times>"\377, \377\377\377\000\000\000", __align = 1}
so __owner is 49 and not 0.
Note that 49 is somewhat strange; it's expected to be a pid_t obtained via
pid_t id = THREAD_GETMEM (THREAD_SELF, tid);
According to https://bugzilla.redhat.com/show_bug.cgi?id=570278 , this
assert failure could be an expected behaviour if, for some reason the
multipath code was trying to release a mutex that has already been
freed.
The multipath-tools package is up to date (0.4.9-3ubuntu5)
I do not find obvious thing related in http://git.opensvc.com/gitweb.cgi?p=multipath-tools%2F.git except may be
http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=commitdiff;h=5ee9f716549d913aeb9800041f78ee9c6a50d860
#####
In between Precise's version and Upstream there are the following
patches touching waiter.c:
d887f4a = signal waiter thread to stop waiting on dm events
5ee9f71 = simplify multipath signal handlers
af4fd6d = Fix race condition in stop_waiter_thread()
e1fcc59 = multipath: clean up code for stopping the waiter threads
03ec4ef = multipath: fix shutdown crashes
4dfdaf2 = multipath: Update multipath device on show topology
c301a3f = Race condition when calling stop_waiter_thread()
96f8146 = libmultipath: update waiter handling
This specific one: 96f8146 (libmultipath: update waiter handling)
"""
The current 'waiter' structure accesses fields which belong
to the main 'mpp' structure, which has a totally different
lifetime.
"""
Shows that due to different lifetime between different structures,
there can be use-after-free segfaults (what seems to be happening).
waiter.c:44 = lock(wp->vecs->lock);
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+subscriptions
More information about the foundations-bugs
mailing list