[Bug 1617919] Re: mdadm segfault error 4 in libc-2.23.so
Dan Streetman
dan.streetman at canonical.com
Fri Nov 3 16:27:08 UTC 2017
** Changed in: mdadm (Ubuntu Trusty)
Status: Triaged => In Progress
** Changed in: mdadm (Ubuntu Xenial)
Status: Triaged => In Progress
** Tags removed: sts-sponsor-ddstreet
** Tags added: sts-sponsor-ddstreet-done
--
You received this bug notification because you are a member of Ubuntu
Sponsors Team, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1617919
Title:
mdadm segfault error 4 in libc-2.23.so
Status in mdadm package in Ubuntu:
Fix Released
Status in mdadm source package in Trusty:
In Progress
Status in mdadm source package in Xenial:
In Progress
Bug description:
[impact]
The mdadm cron jobs invoke mdadm periodically to scan RAID arrays, but
when run inside an unprivileged container mdadm does not have access to
the arrays and segfaults when invoked. The segfault is logged in the host
system's logs and, while harmless, causes confusion about mdadm
segfaults in the host logs.
[test case]
Install an Ubuntu system and create one or more RAID/mdadm arrays.
Create a container, with either trusty or xenial inside the container.
In the container, install mdadm. Run:
$ mdadm --monitor --scan --oneshot
That is the command run by mdadm's cron job (though other variations on
the command will also segfault). With the current mdadm code, mdadm
will segfault. With the patched code, mdadm exits normally. A rough
reproduction sketch follows below.
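As a concrete illustration only, here is one way to run the test case
using LXD; the container name, image alias, and the choice of LXD over
another container runtime are assumptions, and the md member devices
are placeholders:
  # on the host: create an md array (member devices are placeholders)
  $ sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
  # launch an unprivileged Xenial container (name "mdtest" is arbitrary)
  $ lxc launch ubuntu:16.04 mdtest
  $ lxc exec mdtest -- apt-get update
  $ lxc exec mdtest -- apt-get install -y mdadm
  # run the same command the daily cron job runs; unpatched mdadm
  # segfaults here, patched mdadm exits cleanly
  $ lxc exec mdtest -- mdadm --monitor --scan --oneshot
  $ echo $?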
[regression potential]
This patch changes mdadm's code that processes each array's name; a
bug in this area could cause mdadm to fail when performing any operation
on arrays, but the failure would occur before mdadm opened the array,
not during the operation itself.
[other info]
The commit fixing this is already upstream and included in zesty and
later; the backport is required only for trusty and xenial.
[original description]
On Ubuntu 16.04.1 LTS Xenial, mdadm segfaults every day on two machines. Everything works as normal though, and the RAID arrays are not degraded.
[3712474.763430] mdadm[17665]: segfault at 0 ip 00007fd0369bed16 sp 00007fff8c5c9478 error 4 in libc-2.23.so[7fd036934000+1c0000]
[3712474.949633] mdadm[17727]: segfault at 0 ip 00007f2814111d16 sp 00007ffca92fe168 error 4 in libc-2.23.so[7f2814087000+1c0000]
[3798863.008741] mdadm[25359]: segfault at 0 ip 00007fa6af198d16 sp 00007ffc1b253e48 error 4 in libc-2.23.so[7fa6af10e000+1c0000]
[3798863.190382] mdadm[25393]: segfault at 0 ip 00007f72218a0d16 sp 00007ffef918f118 error 4 in libc-2.23.so[7f7221816000+1c0000]
[3885251.386711] mdadm[32081]: segfault at 0 ip 00007f3d99ca2d16 sp 00007ffe5e69a7a8 error 4 in libc-2.23.so[7f3d99c18000+1c0000]
[3885251.402337] mdadm[32083]: segfault at 0 ip 00007f770ccc1d16 sp 00007ffe16074378 error 4 in libc-2.23.so[7f770cc37000+1c0000]
[3971638.258574] mdadm[7936]: segfault at 0 ip 00007fcacddb3d16 sp 00007ffc062faff8 error 4 in libc-2.23.so[7fcacdd29000+1c0000]
[3971638.410750] mdadm[8053]: segfault at 0 ip 00007ff573757d16 sp 00007fffd3cca398 error 4 in libc-2.23.so[7ff5736cd000+1c0000]
The segfault message always appears twice in quick succession.
It seems to be triggered by /etc/cron.daily/mdadm, which essentially runs
mdadm --monitor --scan --oneshot
As such, the frequency is around every 85000 seconds, or roughly 24
hours, depending on when the cron job was executed.
It does not happen when running the command manually.
There is one similar bug, #1576055, concerning libc, and a few cases
elsewhere, but further digging has yet to reveal anything
conclusive.
Note that these machines have almost exactly the same hardware (Xeon
D-1518, 16GB ECC), so hardware design flaws cannot be ruled out.
However, memory testing has not turned up any faults. That said, I
know some segfaults can be difficult to find even when they are
hardware issues.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/1617919/+subscriptions