[Bug 1354114] Re: multipath segmentation Fault (libmultipath: update waiter handling)

Rafael David Tinoco rafael.tinoco at canonical.com
Fri Aug 8 20:31:19 UTC 2014


** Patch removed: "utopic_multipath-tools_0.4.9-3ubuntu9.debdiff"
   https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172259/+files/utopic_multipath-tools_0.4.9-3ubuntu9.debdiff

** Patch removed: "precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff"
   https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172257/+files/precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff

** Patch removed: "trusty_multipath-tools_0.4.9-3ubuntu8.debdiff"
   https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172258/+files/trusty_multipath-tools_0.4.9-3ubuntu8.debdiff

** Patch added: "precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff"
   https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4173051/+files/precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff

-- 
You received this bug notification because you are a member of Ubuntu
Sponsors Team, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1354114

Title:
  multipath segmentation Fault (libmultipath: update waiter handling)

Status in “multipath-tools” package in Ubuntu:
  Confirmed
Status in “multipath-tools” source package in Precise:
  New
Status in “multipath-tools” source package in Trusty:
  New

Bug description:
  [Impact]

   * Multipath can hit a segmentation fault due to incorrect waiter-handling
     code, possibly causing the user to lose access to multipath devices.

  [Test Case]

   * Use multipath and wait for the problem to occur eventually (inevitable).

  [Regression Potential]

   * Patch 1/4 attempts to fix the issue; patch 2/4 fixes a problem in 1/4.
   * Patch 3/4 reworks the fix after 1/4 turned out to be insufficient;
     patch 4/4 fixes 3/4.

   * The fix is based on upstream commit 96f8146 plus subsequent patches.
   * I followed that code's upstream development until the issue was addressed.

  [Other Info]

   * Original bug description:

  ----------------

  The following situation with multipathd was brought to my (~inaddy)
  attention:

  #####

  Program terminated with signal 6, Aborted.
  #0 0x00007fbc6ae09445 in raise () from /lib/x86_64-linux-gnu/libc.so.6
  (gdb) bt
  #0 0x00007fbc6ae09445 in raise () from /lib/x86_64-linux-gnu/libc.so.6
  #1 0x00007fbc6ae0cbab in abort () from /lib/x86_64-linux-gnu/libc.so.6
  #2 0x00007fbc6ae0210e in ?? () from /lib/x86_64-linux-gnu/libc.so.6
  #3 0x00007fbc6ae021b2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
  #4 0x00007fbc6b849efb in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
  #5 0x00007fbc6b1cba5f in free_waiter (data=0x1691de0) at waiter.c:44
  #6 0x00007fbc6b1cc25a in waitevent (et=0x1691de0) at waiter.c:204
  #7 0x00007fbc6b847e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
  #8 0x00007fbc6aec54bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
  #9 0x0000000000000000 in ?? ()

  --------------------------------------------------------------------------------------------
  #5 0x00007fbc6b1cba5f in free_waiter (data=0x1691de0) at waiter.c:44
  44 lock(wp->vecs->lock);
  (gdb) print wp->vecs->lock
  $1 = {mutex = 0x168c280, depth = 1}
  In pthread_mutex_lock.c:62 there's an assert that fails:
  #4 0x00007fbc6b849efb in __pthread_mutex_lock (mutex=0xfefefefefefefeff) at pthread_mutex_lock.c:62
  62 assert (mutex->__data.__owner == 0);
  In this run:
  (gdb) p *wp->vecs->lock->mutex
  $3 = {__data = {__lock = 1, __count = 0, __owner = 49, __nusers = 0, __kind = 0, __spins = 0,
        __list = {__prev = 0x0, __next = 0xffffffff}},
        __size = "\001\000\000\000\000\000\000\000\061", '\000' <repeats 23 times>, "\377\377\377\377\000\000\000", __align = 1}
  so __owner is 49 and not 0.
  Note that 49 is somewhat strange; it's expected to be a pid_t obtained via
  pid_t id = THREAD_GETMEM (THREAD_SELF, tid);

  According to https://bugzilla.redhat.com/show_bug.cgi?id=570278, this
  assert failure can be expected behaviour if, for some reason, the
  multipath code was trying to release a mutex that has already been
  freed.
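
  Purely for illustration (none of the names below come from multipath-tools;
  this is a made-up sketch, built with "gcc -pthread"), the following minimal
  C program shows the pattern such a failure suggests: one thread keeps a raw
  pointer to a mutex, the owning side destroys and frees the memory behind it,
  and a later pthread_mutex_lock() on the stale pointer reads whatever the
  allocator left there. Depending on that leftover content it may trip a glibc
  internal assertion like the one in frame #4, deadlock, or appear to work; it
  is undefined behaviour either way. The 0xfe fill below only mimics the
  0xfefefefefefefeff pointer visible in frame #4, which looks like poisoned or
  reused freed memory.

  /* use_after_free_mutex.c -- illustration only; not multipathd code. */
  #include <pthread.h>
  #include <stdlib.h>
  #include <string.h>
  #include <unistd.h>

  struct owner {
          pthread_mutex_t lock;
  };

  static pthread_mutex_t *stale;   /* dangling pointer kept by the "waiter" */

  static void *waiter_thread(void *arg)
  {
          (void)arg;
          sleep(1);                /* lose the race on purpose */
          pthread_mutex_lock(stale);   /* use-after-free: undefined behaviour */
          pthread_mutex_unlock(stale);
          return NULL;
  }

  int main(void)
  {
          struct owner *o = malloc(sizeof(*o));
          pthread_mutex_init(&o->lock, NULL);
          stale = &o->lock;

          pthread_t tid;
          pthread_create(&tid, NULL, waiter_thread, NULL);

          /* The owning side tears everything down while the waiter still
           * holds a pointer into it -- the lifetime mismatch described here. */
          pthread_mutex_destroy(&o->lock);
          memset(o, 0xfe, sizeof(*o));   /* simulate the allocator reusing the memory */
          free(o);

          pthread_join(tid, NULL);
          return 0;
  }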

  The multipath-tools package is up to date (0.4.9-3ubuntu5)

  I did not find anything obviously related in http://git.opensvc.com/gitweb.cgi?p=multipath-tools%2F.git except maybe
  http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=commitdiff;h=5ee9f716549d913aeb9800041f78ee9c6a50d860

  #####

  Between Precise's version and upstream, the following patches touch
  waiter.c:

  d887f4a = signal waiter thread to stop waiting on dm events
  5ee9f71 = simplify multipath signal handlers
  af4fd6d = Fix race condition in stop_waiter_thread()
  e1fcc59 = multipath: clean up code for stopping the waiter threads
  03ec4ef = multipath: fix shutdown crashes
  4dfdaf2 = multipath: Update multipath device on show topology
  c301a3f = Race condition when calling stop_waiter_thread()
  96f8146 = libmultipath: update waiter handling

  This specific one, 96f8146 (libmultipath: update waiter handling), says:

  """
  The current 'waiter' structure accesses fields which belong
  to the main 'mpp' structure, which has a totally different
  lifetime.
  """

  In other words, because these structures have different lifetimes, there
  can be use-after-free segfaults (which appears to be what is happening
  here).

  waiter.c:44 = lock(wp->vecs->lock);
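
  As a rough sketch of that lifetime problem (the types and field names below
  are simplified stand-ins, not the real multipath-tools definitions), the
  waiter only borrows a pointer into daemon-owned state, so nothing prevents
  that state from being freed before free_waiter() runs:

  #include <pthread.h>
  #include <stdlib.h>

  /* Simplified stand-ins -- not the actual multipath-tools structures. */
  struct vectors {
          pthread_mutex_t *mutex;   /* owned and eventually freed by the daemon */
  };

  struct event_thread {
          struct vectors *vecs;     /* borrowed pointer; nothing pins its lifetime */
  };

  void free_waiter(void *data)
  {
          struct event_thread *wp = data;

          /* Equivalent of waiter.c:44, lock(wp->vecs->lock): if the daemon has
           * already torn down vecs (or the mpp it belongs to), this dereferences
           * freed memory and pthread_mutex_lock() operates on garbage. */
          pthread_mutex_lock(wp->vecs->mutex);
          /* ... drop the waiter's references ... */
          pthread_mutex_unlock(wp->vecs->mutex);
          free(wp);
  }

  Judging by the quoted commit message, the upstream series (96f8146 plus the
  follow-up patches listed above) reworks exactly that relationship so the
  waiter no longer depends on state whose lifetime it does not control.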

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+subscriptions


