[Bug 1020210]

Neleai 1020210 at bugs.launchpad.net
Sat Dec 21 00:39:43 UTC 2013


Carlos, this is faster to debug on paper than by trying to debug an optimized
program.

As a minimal example of what is wrong, I could trigger the assertion with an
unoptimized build of malloc. With an optimized build you need to go to the
assembly to see where gcc scheduled the loads.

The idea is simple: while we are freeing one chunk, the chunk on top of the
fastbin can be allocated by another thread, resized, and then returned to the
top of the fastbin. This triggers the assertion, or a segfault when trim
unmaps the corresponding page.

The program is as follows:

#include <stdlib.h>
#include <malloc.h>   /* malloc_trim */
#include <pthread.h>

/* Thread body: frees the chunk allocated in main.  */
void * freea (void *p)
{
  free (p); // 1
  return NULL;
}

int main ()
{
  pthread_t x;
  char *u, *v;
  u = malloc (16);
  pthread_create (&x, NULL, freea, u);

  v = malloc (16);
  free (v); // 2

  malloc_trim (0);
  v = malloc (512); // 3
  free (v);

  malloc_trim (0);
  v = malloc (16);
  free (v); // 4
}

First, step into free 1 until you reach the fragment below.

At this point run free 2, so that v becomes the top of the fastbin.

    unsigned int idx = fastbin_index(size); // 32 >> 4 = 2
    fb = &fastbin (av, idx);

    mchunkptr fd;
    mchunkptr old = *fb; // v
    unsigned int old_idx = ~0u;
    do
      {
        /* Another simple check: make sure the top of the bin is not the
           record we are going to add (i.e., double free).  */
        if (__builtin_expect (old == p, 0))
          {
            errstr = "double free or corruption (fasttop)";
            goto errout;
          }

Now, at this point, run step 3, after which v is a chunk of size 528.


        if (old != NULL)
          old_idx = fastbin_index(chunksize(old)); // 528 >> 4 = 33
        p->fd = fd = old;

Then continue with step 4, which returns v to the top of the fastbin. This is
the same state as at step 2, so the compare-and-exchange below succeeds.

      }
    while ((old = catomic_compare_and_exchange_val_rel (fb, p, fd)) != fd);

And since 33 != 2, we hit the error.
    
    if (fd != NULL && __builtin_expect (old_idx != idx, 0))
      {
        errstr = "invalid fastbin entry (free)";
        goto errout;
      }

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to eglibc in Ubuntu.
https://bugs.launchpad.net/bugs/1020210

Title:
  Race condition using ATOMIC_FASTBINS in _int_free causes crash or heap
  corruption

Status in Embedded GLIBC:
  Fix Released
Status in “eglibc” package in Ubuntu:
  Confirmed

Bug description:
  We have an application which makes heavy allocation and de-allocation
  demands from multiple threads.  We run this application continuously
  on many servers, and once every several CPU months or years, we were
  getting a crash in _int_free that did not look like vanilla heap
  corruption.  I believe I have narrowed it down to a race condition in
  _int_free due to the ATOMIC_FASTBINS feature.  Basically, in the
  lockless FASTBIN _int_free path, a chunk is pulled into a local
  variable with the intent to add it to the fastbins list.  However, the
  heap consolidation/trim code can race with this, and can coalesce the
  entire block and/or give it back to the OS before _int_free has a
  chance to try and store it into the fastbins list.

  The problem is very challenging to reproduce in situ, but using gdb I
  have a recipe which demonstrates the crash 100% of the time on my
  12.04 x64 system running eglibc 2.15.  It relies on malloc_trim,
  although in our in situ data, the consolidation is triggered as a
  result of a normal free.  malloc_trim is just easier to control.

  While I am not a glibc developer, I could not see any easy way to fix
  the situation short of disabling ATOMIC_FASTBINS.

  I am attaching the reproduction source.  Other pertinent information
  follows:

  > jpieper@calculon:~/downloads$ lsb_release -rd
  > Description:	Ubuntu 12.04 LTS
  > Release:	12.04

  > jpieper@calculon:~/downloads$ apt-cache policy libc6
  > libc6:
  >   Installed: 2.15-0ubuntu10
  >   Candidate: 2.15-0ubuntu10
  >   Version table:
  >  *** 2.15-0ubuntu10 0
  >        500 http://us.archive.ubuntu.com/ubuntu/ precise/main amd64 Packages
  >        100 /var/lib/dpkg/status

  What I expect: I expect the attached application, when run using the gdb script in the comments, to complete with no failures.
  What happened: A SIGSEGV after the final continue.
