[Bug 1822776] Re: Apply Bash 4.4.20 to fix cpu spinning on built-in wait
Bryce Harrington
1822776 at bugs.launchpad.net
Wed May 29 17:12:35 UTC 2019
Hi halfgaar,
Robie asked me to help you with this bug, thanks for reporting it. I
may not be able to get full attention on this until next week due to a
project deadline, but I've had a quick look at the patch and your
problem description, and it looks pretty straightforward. Thanks also
for the test case, I'll run it and see if I can repro the bug myself.
It looks like both bionic and cosmic are running 4.4.18-x, so I'm
gathering cosmic will need the fix as well. disco and eoan have moved
to bash 5.0, and I've verified the upstream source code includes the
fix, so no changes are needed for those distro releases.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to bash in Ubuntu.
https://bugs.launchpad.net/bugs/1822776
Title:
Apply Bash 4.4.20 to fix cpu spinning on built-in wait
Status in bash package in Ubuntu:
New
Status in bash source package in Bionic:
New
Status in bash source package in Cosmic:
New
Bug description:
[Impact]
Long-running bash loops that create and reap processes will eventually
hang, spinning at 100% CPU.
[Test Case]
Run this loop for a few days/weeks:
#!/bin/bash
while true; do
  sleep 0.5 &
  wait
done
It will eventually cause the 'wait' statement to hang, consuming 100%
CPU after some indeterminate amount of time, dependent on how fast PIDs
are cycled on the machine.
The Bash bug report mentions longer running loops, but it seems hash
collisions are the cause, meaning it's just a matter of chance,
influenced by how fast PIDs are cycled on the machine.
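
Because the hang can take days or weeks to show up, it may help to have
the loop leave a heartbeat so a stall can be spotted without watching
top. The following is a minimal sketch and not part of the original
report: the function name heartbeat_stale and the file path in the
comments are hypothetical, and it assumes GNU stat and date as shipped
on Ubuntu.

```shell
#!/bin/bash
# Hypothetical helper, not from the bug report.
# heartbeat_stale FILE MAX_AGE
# Returns 0 if FILE was last modified more than MAX_AGE seconds ago,
# 1 if it is fresh, and 2 if FILE cannot be read.
heartbeat_stale() {
    local file=$1 max_age=$2 now mtime
    mtime=$(stat -c %Y "$file" 2>/dev/null) || return 2
    now=$(date +%s)
    (( now - mtime > max_age ))
}

# Intended use (assumed file name):
#   in the test loop, add   touch /tmp/wait-heartbeat   after 'wait'
#   from another shell run  heartbeat_stale /tmp/wait-heartbeat 60 \
#                               && echo "wait appears to be stuck"
```

A stale heartbeat only says the loop stopped making progress; confirming
the 100% CPU spin still needs a look at the process itself.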
[Regression Potential]
The fix has been reviewed and accepted upstream. The patch adds a
check, at the time of pid determination, for whether the pid is already
in use; if so, it skips that pid and picks a different one. This does
change behavior slightly in that different pid numbers will be generated
in rare cases, but nothing should depend on how pids are generated, as
the behavior is not specified to be anything but random.
The patch adds a new warning message, "bgp_delete: LOOP: psi (%d) ==
storage[psi].bucket_next", but this only shows when the original bug
would have been triggered.
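
Read literally, the condition in that warning, psi ==
storage[psi].bucket_next, describes a chain entry whose next link points
back at itself, so a walk along the chain never reaches the end. The toy
model below illustrates why such an entry spins forever and how a check
can catch it. It is a simplified illustration under that reading, not
bash's actual data structure or the patch code; the array bucket_next
and the function chain_has_self_loop are hypothetical names.

```shell
#!/bin/bash
# Toy model only, not bash source code.
# bucket_next[i] holds the index of the next entry in a chain,
# or -1 at the end of the chain.
#
# chain_has_self_loop START
# Walks the chain from START. Returns 0 if an entry points to itself,
# 1 if the chain ends normally. Only the self-loop case named in the
# warning is checked; longer cycles are out of scope for this sketch.
chain_has_self_loop() {
    local psi=$1
    while (( psi != -1 )); do
        if (( psi == bucket_next[psi] )); then
            return 0    # following this link would never terminate
        fi
        psi=${bucket_next[psi]}
    done
    return 1
}
```

For example, with bucket_next=(1 1 -1) a walk from index 0 reaches
entry 1, which points to itself, and the function reports the loop
instead of spinning.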
Using 'apt-get source bash' to get the original source version, I
created a deb that includes the 4.4.20 patch and have been running it
since April 2nd. The 100% CPU spinning is solved, and no other
regressions have been observed.
Ubuntu 18.04 is already at 4.4.19, which is one patch level behind, so
this is a linear step to the next patch level, with no patches
skipped.
[Fix]
Official patch to fix, and to bump to 4.4.20:
http://ftp.gnu.org/gnu/bash/bash-4.4-patches/bash44-020
The newest Ubuntu tar.xz with patches I could find at:
http://archive.ubuntu.com/ubuntu/pool/main/b/bash/
also didn't have the 4.4.20 patch, so it seems no Ubuntu release has
the fix yet.
Although I am not completely sure, this problem seems to have been
introduced in the 4.4 version of Bash, so in terms of LTS versions,
18.04 and up are affected.
[Original Report]
Bash pre-4.4.20 has a bug in its PID hash table that causes spin-loops when spawning subprocesses and waiting for them. There is a fix:
https://ftp.gnu.org/gnu/bash/bash-4.4-patches/bash44-020
Our application has been affected by this (locking up) since we
migrated from Ubuntu 14.04 to 18.04. Ubuntu 14.04 has bash 4.3.11(1);
Ubuntu 18.04 has bash 4.4.19 (that is what 'bash --version' reports;
because upstream patch levels are shipped as patches on top of the
base tarball, apt shows it as 4.4.18-2ubuntu1).
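
The two version strings can be compared directly on a given machine. A
minimal sketch, assuming the affected range is 4.4 with a patch level
below 20 as described in this report (the lower bound is the reporter's
best guess, not confirmed); the function name bash_wait_bug_affected is
hypothetical. BASH_VERSINFO is bash's built-in array holding the major,
minor, and patch-level numbers.

```shell
#!/bin/bash
# bash_wait_bug_affected MAJOR MINOR PATCH
# Returns 0 if the given version falls in the range this report
# describes as affected (4.4 with patch level below 20), 1 otherwise.
bash_wait_bug_affected() {
    local major=$1 minor=$2 patch=$3
    (( major == 4 && minor == 4 && patch < 20 ))
}

# Version as bash itself reports it, including the patch level.
echo "running bash: ${BASH_VERSINFO[0]}.${BASH_VERSINFO[1]}.${BASH_VERSINFO[2]}"

# Version as the package manager reports it (Debian/Ubuntu only).
if command -v dpkg-query >/dev/null 2>&1; then
    dpkg-query -W -f='package version: ${Version}\n' bash || true
fi

if bash_wait_bug_affected "${BASH_VERSINFO[@]:0:3}"; then
    echo "this bash is in the affected range"
fi
```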
The 4.4-020 patch needs to be included. I think it's actually quite
critical.
A justification for including the fix would be that a standard
language feature of a scripting language is broken, and that it's
indeterminate when it breaks. Considering the widespread use of bash,
I'm surprised that more people have not reported issues. A client and I
started having issues independently of each other very soon after
upgrading to an affected version.