[Bug 1836329] Re: Regression running ssllabs.com/ssltest causes 2 apache process to eat up 100% cpu, easy DoS
Andreas Hasenack
andreas at canonical.com
Tue Jul 16 19:59:51 UTC 2019
** Description changed:
[Impact]
With the latest apache 2.4.29-1ubuntu4.7 published to 18.04 LTS bionic, running ssllabs.com/ssltest against the server to verify its configuration leaves 2 apache processes using 100% CPU indefinitely.
Downgrading to 2.4.29-1ubuntu4.6 makes the problem no longer reproducible.
[Test Case]
We didn't find a reproducer that didn't involve https://ssllabs.com/ssltest, so the test case needs a publicly facing server with a DNS record.
On a test system that has a public IP and is reachable via https on a hostname (not just IP):
sudo apt update
sudo apt install apache2
sudo a2enmod ssl
sudo a2ensite default-ssl.conf
sudo service apache2 restart
In a terminal, monitor the CPU usage of the apache2 processes with top.
Go to https://www.ssllabs.com/ssltest/ and enter the URL of your test
server, using https. After a few seconds, the site will ask whether it
should ignore the certificate error; confirm, and let it continue the
test.
After a few minutes, the test will finish and you will get a report. Go
back to the terminal where top is running: the apache2 processes
will be spinning and using CPU, even though there is no more traffic.
With the fixed packages, the apache processes will remain idle.
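For a non-interactive check, the top step above could be approximated with a small filter over ps output. This is only a sketch: busy_pids is a hypothetical helper name, the 90% threshold is an arbitrary assumption, and ps reports %CPU averaged over the lifetime of the process, so a process that has only just started spinning may take a while to cross the threshold.

```shell
#!/bin/sh
# Hypothetical helper (not part of the SRU): print the PIDs whose %CPU
# is at or above a threshold. It reads "PID %CPU" pairs on stdin, so it
# can be exercised without a running apache2.
busy_pids() {
    threshold="${1:-90}"   # assumed default threshold, in percent
    awk -v t="$threshold" '($2 + 0) >= (t + 0) { print $1 }'
}

# On the test system (procps ps), something like:
#   ps -C apache2 -o pid=,pcpu= | busy_pids 90
# No output is the expected result with the fixed packages.
```

With the affected 2.4.29-1ubuntu4.7 package, the two spinning apache2 PIDs would be expected to show up in the output some time after the ssltest run finishes.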
[Regression Potential]
-
- * discussion of how regressions are most likely to manifest as a result
- of this change.
-
- * It is assumed that any SRU candidate patch is well-tested before
- upload and has a low overall risk of regression, but it's important
- to make the effort to think about what ''could'' happen in the
- event of a regression.
-
- * This both shows the SRU team that the risks have been considered,
- and provides guidance to testers in regression-testing the SRU.
+ This upload fixes a regression that was itself introduced by the fix for a previous regression (#1833039), which shows that the situation is tricky. The fix here (clear-retry-flags-before-abort.patch) at least does not change anything in the previous patch from bug #1833039, so that fix was correct.
+ The second patch, for http/2 errors with openssl 1.1.1, unfortunately has no test case; it deals with error status and is specific to openssl 1.1.1. It has been applied upstream (and backported to the 2.4.x branch) for many months now. The trunk commit at http://svn.apache.org/viewvc?view=revision&revision=1843954 has a more elaborate explanation of the behavior changes this patch does, and doesn't, introduce.
[Other Info]
While investigating this issue, another fix for an openssl 1.1.1 issue was found in the apache upstream git repo which involves http2 and how the code handles SSL_read() return values: https://github.com/apache/httpd/commit/644cff9977efa322fe6c0ae3357a5b8cb1eeec11
No upstream bug was found, nor could I come up with a reproducer case, but it seemed sensible to include that patch in this SRU, which was, after all, triggered by the openssl 1.1.1 upgrade in bionic.
[Original Description]
With latest apache 2.4.29-1ubuntu4.7 published to 18.04 LTS bionic, when
running ssllabs.com/ssltest against it to verify the configuration it
leaves 2 apache processes using 100% CPU indefinitely.
Downgrading to 2.4.29-1ubuntu4.6 makes it no longer reproducible.
So far I do not know whether this case is easy/likely to hit in normal
https usage or is only triggered by that testing site.
But given that this is backported to LTS and allows an easy DoS, maybe
4.7 should be backed out?
This is likely a regression in the update to 4.7, which contains only a single fix:
https://bugs.launchpad.net/ubuntu/+source/apache2/+bug/1833039
Extra info observed when that ssltest is over but processes are still
there using up cpu:
/server-status shows both processes 25234,25235 here in 'Reading' state:
Srv PID Acc M CPU SS Req Conn Child Slot Client Protocol VHost Request
0-0 25234 0/0/0 W 0.00 0 0 0.0 0.00 0.00 127.0.0.1 http/1.1 ip-172-30-1-107.eu-west-1.compu GET /server-status HTTP/1.1
0-0 25234 0/0/0 R 0.00 641 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.00 505 2 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.00 501 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.00 500 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.00 494 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.00 604 0 0.0 0.00 0.00 64.41.200.106 http/1.1
1-0 25235 0/1/1 _ 0.00 604 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 16.93 596 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.01 595 1 0.0 0.00 0.00 64.41.200.106 http/1.1
1-0 25235 0/0/0 R 0.00 679 0 0.0 0.00 0.00 64.41.200.106 http/1.1
netstat on system:
tcp6 1 0 172.30.1.57:443 64.41.200.106:58658 CLOSE_WAIT
tcp6 1 0 172.30.1.57:443 64.41.200.107:60842 CLOSE_WAIT
with no other connections to port 443.
** Description changed:
[Impact]
With the latest apache 2.4.29-1ubuntu4.7 published to 18.04 LTS bionic, running ssllabs.com/ssltest against the server to verify its configuration leaves 2 apache processes using 100% CPU indefinitely.
Downgrading to 2.4.29-1ubuntu4.6 makes the problem no longer reproducible.
[Test Case]
We didn't find a reproducer that didn't involve https://ssllabs.com/ssltest, so the test case needs a publicly facing server with a DNS record.
On a test system that has a public IP and is reachable via https on a hostname (not just IP):
sudo apt update
sudo apt install apache2
sudo a2enmod ssl
sudo a2ensite default-ssl.conf
sudo service apache2 restart
In a terminal, monitor the CPU usage of the apache2 processes with top.
Go to https://www.ssllabs.com/ssltest/ and enter the URL of your test
server, using https. After a few seconds, the site will ask whether it
should ignore the certificate error; confirm, and let it continue the
test.
After a few minutes, the test will finish and you will get a report. Go
back to the terminal where top is running: the apache2 processes
will be spinning and using CPU, even though there is no more traffic.
With the fixed packages, the apache processes will remain idle.
[Regression Potential]
This upload fixes a regression that was itself introduced by the fix for a previous regression (#1833039), which shows that the situation is tricky. The fix here (clear-retry-flags-before-abort.patch) at least does not change anything in the previous patch from bug #1833039, so that fix was correct.
The second patch, for http/2 errors with openssl 1.1.1, unfortunately has no test case; it deals with error status and is specific to openssl 1.1.1. It has been applied upstream (and backported to the 2.4.x branch) for many months now. The trunk commit at http://svn.apache.org/viewvc?view=revision&revision=1843954 has a more elaborate explanation of the behavior changes this patch does, and doesn't, introduce.
+ We do have a DEP8 test that covers HTTP/2 SSL downloads, and it passes. But it also passed before this patch.
[Other Info]
While investigating this issue, another fix for an openssl 1.1.1 issue was found in the apache upstream git repo which involves http2 and how the code handles SSL_read() return values: https://github.com/apache/httpd/commit/644cff9977efa322fe6c0ae3357a5b8cb1eeec11
No upstream bug was found, nor could I come up with a reproducer case, but it seemed sensible to include that patch in this SRU, which was, after all, triggered by the openssl 1.1.1 upgrade in bionic.
[Original Description]
With latest apache 2.4.29-1ubuntu4.7 published to 18.04 LTS bionic, when
running ssllabs.com/ssltest against it to verify the configuration it
leaves 2 apache processes using 100% CPU indefinitely.
Downgrading to 2.4.29-1ubuntu4.6 makes it no longer reproducible.
So far I do not know whether this case is easy/likely to hit in normal
https usage or is only triggered by that testing site.
But given that this is backported to LTS and allows an easy DoS, maybe
4.7 should be backed out?
This is likely a regression in the update to 4.7, which contains only a single fix:
https://bugs.launchpad.net/ubuntu/+source/apache2/+bug/1833039
Extra info observed when that ssltest is over but processes are still
there using up cpu:
/server-status shows both processes 25234,25235 here in 'Reading' state:
Srv PID Acc M CPU SS Req Conn Child Slot Client Protocol VHost Request
0-0 25234 0/0/0 W 0.00 0 0 0.0 0.00 0.00 127.0.0.1 http/1.1 ip-172-30-1-107.eu-west-1.compu GET /server-status HTTP/1.1
0-0 25234 0/0/0 R 0.00 641 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.00 505 2 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.00 501 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.00 500 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.00 494 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.00 604 0 0.0 0.00 0.00 64.41.200.106 http/1.1
1-0 25235 0/1/1 _ 0.00 604 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 16.93 596 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.01 595 1 0.0 0.00 0.00 64.41.200.106 http/1.1
1-0 25235 0/0/0 R 0.00 679 0 0.0 0.00 0.00 64.41.200.106 http/1.1
netstat on system:
tcp6 1 0 172.30.1.57:443 64.41.200.106:58658 CLOSE_WAIT
tcp6 1 0 172.30.1.57:443 64.41.200.107:60842 CLOSE_WAIT
with no other connections to port 443.
** Description changed:
[Impact]
With the latest apache 2.4.29-1ubuntu4.7 published to 18.04 LTS bionic, running ssllabs.com/ssltest against the server to verify its configuration leaves 2 apache processes using 100% CPU indefinitely.
Downgrading to 2.4.29-1ubuntu4.6 makes the problem no longer reproducible.
[Test Case]
We didn't find a reproducer that didn't involve https://ssllabs.com/ssltest, so the test case needs a publicly facing server with a DNS record.
On a test system that has a public IP and is reachable via https on a hostname (not just IP):
sudo apt update
sudo apt install apache2
sudo a2enmod ssl
sudo a2ensite default-ssl.conf
sudo service apache2 restart
In a terminal, monitor the CPU usage of the apache2 processes with top.
Go to https://www.ssllabs.com/ssltest/ and enter the URL of your test
server, using https. After a few seconds, the site will ask whether it
should ignore the certificate error; confirm, and let it continue the
test.
After a few minutes, the test will finish and you will get a report. Go
back to the terminal where top is running: the apache2 processes
will be spinning and using CPU, even though there is no more traffic.
With the fixed packages, the apache processes will remain idle.
[Regression Potential]
This upload fixes a regression that was itself introduced by the fix for a previous regression (#1833039), which shows that the situation is tricky. The fix here (clear-retry-flags-before-abort.patch) at least does not change anything in the previous patch from bug #1833039, so that fix was correct.
The second patch, for http/2 errors with openssl 1.1.1, unfortunately has no test case; it deals with error status and is specific to openssl 1.1.1. It has been applied upstream (and backported to the 2.4.x branch) for many months now. The trunk commit at http://svn.apache.org/viewvc?view=revision&revision=1843954 has a more elaborate explanation of the behavior changes this patch does, and doesn't, introduce.
- We do have a DEP8 test that covers HTTP/2 SSL downloads, and it passes. But it also passed before this patch.
+ We do have a DEP8 test that covers HTTP/2 SSL downloads, and it passes. But it also passed before this patch. I also manually tried such downloads of varying sizes (up to 10 MB) with no failures.
[Other Info]
While investigating this issue, another fix for an openssl 1.1.1 issue was found in the apache upstream git repo which involves http2 and how the code handles SSL_read() return values: https://github.com/apache/httpd/commit/644cff9977efa322fe6c0ae3357a5b8cb1eeec11
No upstream bug was found, nor could I come up with a reproducer case, but it seemed sensible to include that patch in this SRU, which was, after all, triggered by the openssl 1.1.1 upgrade in bionic.
[Original Description]
With latest apache 2.4.29-1ubuntu4.7 published to 18.04 LTS bionic, when
running ssllabs.com/ssltest against it to verify the configuration it
leaves 2 apache processes using 100% CPU indefinitely.
Downgrading to 2.4.29-1ubuntu4.6 makes it no longer reproducible.
So far I do not know whether this case is easy/likely to hit in normal
https usage or is only triggered by that testing site.
But given that this is backported to LTS and allows an easy DoS, maybe
4.7 should be backed out?
This is likely a regression in the update to 4.7, which contains only a single fix:
https://bugs.launchpad.net/ubuntu/+source/apache2/+bug/1833039
Extra info observed when that ssltest is over but processes are still
there using up cpu:
/server-status shows both processes 25234,25235 here in 'Reading' state:
Srv PID Acc M CPU SS Req Conn Child Slot Client Protocol VHost Request
0-0 25234 0/0/0 W 0.00 0 0 0.0 0.00 0.00 127.0.0.1 http/1.1 ip-172-30-1-107.eu-west-1.compu GET /server-status HTTP/1.1
0-0 25234 0/0/0 R 0.00 641 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.00 505 2 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.00 501 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.00 500 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.00 494 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.00 604 0 0.0 0.00 0.00 64.41.200.106 http/1.1
1-0 25235 0/1/1 _ 0.00 604 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 16.93 596 0 0.0 0.00 0.00 64.41.200.107 http/1.1
1-0 25235 0/1/1 _ 0.01 595 1 0.0 0.00 0.00 64.41.200.106 http/1.1
1-0 25235 0/0/0 R 0.00 679 0 0.0 0.00 0.00 64.41.200.106 http/1.1
netstat on system:
tcp6 1 0 172.30.1.57:443 64.41.200.106:58658 CLOSE_WAIT
tcp6 1 0 172.30.1.57:443 64.41.200.107:60842 CLOSE_WAIT
with no other connections to port 443.
--
You received this bug notification because you are a member of Ubuntu
Server, which is subscribed to apache2 in Ubuntu.
https://bugs.launchpad.net/bugs/1836329
Title:
Regression running ssllabs.com/ssltest causes 2 apache process to eat
up 100% cpu, easy DoS
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/apache2/+bug/1836329/+subscriptions