[Bug 1899800] Re: Runtime deadlock: pthread_cond_signal failed to wake up pthread_cond_wait due to a bug in undoing stealing
Launchpad Bug Tracker
1899800 at bugs.launchpad.net
Tue Dec 1 10:01:23 UTC 2020
This bug was fixed in the package glibc - 2.32-0ubuntu5
---------------
glibc (2.32-0ubuntu5) hirsute; urgency=medium
* debian/gbp.conf: Set debian-tag and debian-tag-msg to follow Ubuntu format
* Don't build libc6-prof in stage1 and stage2
* Ship libc6-prof on riscv64, too.
This fixes FTBFS on riscv64 due to the flavour being built but not
shipped in a package.
* Detect debconf consistently in libc6.preinst and do not crash if it is not used
(LP: #1902955)
* Prevent rare deadlock in pthread_cond_signal (LP: #1899800)
* debian/patches/git-updates.diff: update from upstream stable branch
glibc (2.32-0ubuntu4) hirsute; urgency=medium
* tests: XFAIL time/tst-cpuclock1 on armel, too. (LP: #1895687)
The armhf build builds for armel, too, thus this fixes the armhf
autopkgtest.
* debian/control: Only recommend libnss-nis and libnss-nisplus.
They pull in a sizable amount of extra dependencies while they are rarely
needed.
* Make libc6 provide libc6-lse on arm64.
Libc6 is now compiled with -moutline-atomics thus the separate binary
package is dropped.
* Ship libc variant compiled for profiling in libc6-prof
* debian/patches/git-updates.diff: update from upstream stable branch
* Drop obsoleted local-cudacc-float128.diff which breaks new icc
(LP: #1895358)
* XFAIL tst-sysvshm-linux on i386 and x32
* Merge 2.31-4 from Debian unstable
-- Balint Reczey <rbalint at ubuntu.com> Fri, 13 Nov 2020 18:54:38 +0100
** Changed in: glibc (Ubuntu)
Status: New => Fix Released
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to glibc in Ubuntu.
https://bugs.launchpad.net/bugs/1899800
Title:
Runtime deadlock: pthread_cond_signal failed to wake up
pthread_cond_wait due to a bug in undoing stealing
Status in glibc package in Ubuntu:
Fix Released
Status in glibc source package in Bionic:
New
Status in glibc source package in Focal:
New
Status in glibc source package in Groovy:
New
Bug description:
This bug was submitted by Qin Li to glibc bugzilla earlier this year,
with a one-line patch, though it hasn't been merged into glibc yet:
https://sourceware.org/bugzilla/show_bug.cgi?id=25847
This bug in pthread condition variables can deadlock the OCaml
runtime, as well as Python's runtime and .NET.
The bug was introduced in glibc 2.27, so affects Ubuntu 18.04 onwards.
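A quick way to see whether a machine runs a glibc at or after the version where the bugzilla report says the bug was introduced is to ask the C library for its version. The sketch below is illustrative only: the helper names `parse_glibc_version`, `in_reported_range` and `running_glibc_version` are made up here, and a version at or above 2.27 does not prove the bug is present, since a distribution package may already carry the fix.

```python
import os

# Version in which the bugzilla report says the bug was introduced.
INTRODUCED_IN = (2, 27)


def parse_glibc_version(text):
    """Turn a string such as 'glibc 2.31' into a tuple like (2, 31).

    Returns None when the string does not look like a glibc version.
    """
    parts = text.split()
    if len(parts) != 2 or parts[0] != "glibc":
        return None
    try:
        return tuple(int(p) for p in parts[1].split(".")[:2])
    except ValueError:
        return None


def in_reported_range(version):
    """True if the version is at or after the reported introduction point."""
    return version is not None and version >= INTRODUCED_IN


def running_glibc_version():
    """Ask the C library for its version; None on non-glibc systems."""
    try:
        text = os.confstr("CS_GNU_LIBC_VERSION")
    except (ValueError, OSError, AttributeError):
        return None
    return parse_glibc_version(text) if text else None


if __name__ == "__main__":
    version = running_glibc_version()
    print("glibc version:", version)
    print("at or after 2.27:", in_reported_range(version))
```

The check only compares upstream version numbers; whether a given Ubuntu or Debian package includes the fix has to be read from its changelog.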
I confirm my OCaml app, as well as the repro from the bugzilla,
deadlocks on Ubuntu 20.04 and Ubuntu 18.04. To further strengthen the
case that this is because of a bug in glibc, my app and the repro do
not deadlock on Ubuntu 16.04.
To rule out kernel issues, I further confirm that no deadlock happens
when I copy Ubuntu 16.04's libc to 18.04 and redirect the dynamic
linker so my app loads the earlier libc.
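One way to run a program against a different libc without replacing the system one is to invoke that libc's own dynamic loader directly and point it at a directory holding the older libraries. This is not necessarily how the test above was done. The paths below are hypothetical placeholders and `loader_command` is an illustrative helper; the loader and the libc it loads must come from the same glibc build.

```python
import os
import subprocess


def loader_command(loader, library_dir, program, args=()):
    """Build an argv that runs `program` through an explicit dynamic loader.

    glibc's ld.so accepts --library-path, which supplies the library
    search path in place of the LD_LIBRARY_PATH environment variable.
    """
    return [loader, "--library-path", library_dir, program, *args]


if __name__ == "__main__":
    # Hypothetical locations of a loader and libc copied from an older release.
    cmd = loader_command(
        "/opt/old-libc/ld-linux-x86-64.so.2",
        "/opt/old-libc",
        "./my_app",
    )
    print(" ".join(cmd))
    # Only attempt to run when the placeholder loader path really exists.
    if os.path.exists(cmd[0]):
        subprocess.run(cmd, check=True)
```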
I confirm that the one-line patch (available at the above bugzilla)
applies cleanly on top of:
* glibc-2.31-0ubuntu9.1 (Ubuntu 20.04 latest)
* glibc-2.28-10 (Debian Buster/10 latest)
* glibc-2.27-3ubuntu1.2 (Ubuntu 18.04 latest)
I confirm that the one-line patch to glibc cures the deadlock issue in
my OCaml apps.
With the patch applied, results so far are as follows.
On Ubuntu 20.04, I have not been able to get the repro to deadlock in
5 days, and my OCaml apps have not deadlocked in 5 days.
On Debian Buster/10, the repro has not deadlocked in about 5 days.
This is my desktop box, and normal applications such as the GNOME
environment continue to work as usual.
On Ubuntu 18.04, the repro still deadlocks, but only after about
24-48 hours; prior to patching glibc, it would take only a few hours.
However, I have not seen my OCaml apps deadlock since applying this
patch.
On Ubuntu 16.04, I have never been able to get the repro to deadlock,
and my OCaml apps never deadlocked on this platform. This is
expected, since this platform runs glibc 2.23, which predates the bug
(the bugzilla report says it was introduced in 2.27).
As for why 18.04 still deadlocks, I suspect that another, unrelated
pthread bug was introduced in glibc 2.27 and fixed by 2.28. When
applied to glibc 2.27, the one-line patch appears to reduce the
frequency of deadlocks by about an order of magnitude.
Please kindly consider merging the patch into Ubuntu glibc.
More background about this bug, for the sake of future internet searchers:
* https://discuss.ocaml.org/t/is-there-a-known-recent-linux-locking-bug-that-affects-the-ocaml-runtime
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1899800/+subscriptions