[Bug 2033422] [NEW] openssl: backport to jammy "clear method store / query cache confusion"

Launchpad Bug Tracker 2033422 at bugs.launchpad.net
Thu Jan 4 13:41:15 UTC 2024


You have been subscribed to a public bug by Adrien Nader (adrien-n):

=== SRU information ===
[ATTENTION]
This SRU contains THREE changes which are listed in the section below.

[Meta]
This bug is part of a series of three bugs for a single SRU.
This ( #2033422 ) is the "central" bug with the global information and debdiff.

This SRU addresses three issues with Jammy's openssl version:
- http://pad.lv/1994165: ignored SMIME signature errors
- http://pad.lv/2023545: ibmca engine dumps core
- http://pad.lv/2033422: very high CPU usage for concurrent TLS connections (this one)

The SRU information has been added to all three bug reports, but the
debdiff, which covers all three changes, is attached only here.

All the patches have been included in subsequent openssl 3.0.x releases
which in turn have been included in subsequent Ubuntu releases. There
has been no report of issues when updating to these Ubuntu releases.

I have rebuilt the openssl versions and used abi-compliance-checker to
compare the ABIs of the libraries in jammy and the one for the SRU. Both
matched completely (FYI, mantic's matched completely too).

I have also pushed the code to git (without any attempt to make it
git-ubuntu friendly).

https://code.launchpad.net/~adrien-n/ubuntu/+source/openssl/+git/openssl/+ref/jammy-sru

I asked Brian Murray about phasing speed and he concurs that a slow roll-out is probably better for openssl. There is a small uncertainty because a security update could come before the phasing is over, effectively fast-forwarding the SRU. Still, unless there is already a current pre-advisory, this is probably better than a 10% phasing which is over after only a couple of days anyway.
NB: at the moment openssl doesn't phase slowly so this needs to be implemented.

[Impact]
Severely degraded performance for concurrent operations compared to openssl 1.1. The performance is so degraded that some workloads fail due to timeouts or insufficient resources (no one magically has 5 times more machines). As a consequence, a number of people use openssl 1.1 instead and do not get security updates.

[Test plan]
Rafael Lopez has shared a simple benchmark in http://pad.lv/2009544 with https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/2009544/+attachment/5690224/+files/main.py .

To test, follow these steps:
- run "time python3 main.py" # using the aforementioned main.py script
- apt install -t jammy-proposed libssl3
- run "time python3 main.py"
- compare the runtimes for the two main.py runs
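
For anyone who wants to record the numbers from a script rather than read
them off the shell's "time" output, here is a minimal sketch of a timing
helper. The name time_command is made up for illustration and is not part
of main.py or of the SRU; it only assumes a Unix-like system where
os.times() reports the CPU time of child processes.

```python
import os
import subprocess
import time


def time_command(argv):
    """Run argv once and return (real, user, sys) in seconds, like the shell's `time`."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    real = time.perf_counter() - start
    after = os.times()
    # children_user / children_system accumulate the CPU time of
    # child processes that have been waited for.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system
```

Calling time_command(["python3", "main.py"]) once before and once after
installing libssl3 from jammy-proposed gives the two sets of figures to
compare.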

You can run this on x86_64, Raspberry Pi 4 or any machine, and get a
very large speed-up in all cases. The improvements are not
architecture-dependent.

Using this changeset, I get the following numbers for ten runs on my
laptop:

3.0.2:
    real  2m5.567s
    user  4m3.948s
    sys   2m0.233s

this SRU:
    real  0m23.966s
    user  2m35.687s
    sys   0m1.920s

As can be easily seen, the speed-up is massive: system time is divided
by 60 and overall wall clock time is roughly five times lower.
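
The ratios quoted above can be re-derived from the two sets of figures
with a few lines of Python. This is only a sketch of the arithmetic; the
helper name to_seconds is made up for illustration.

```python
import re


def to_seconds(stamp):
    """Convert a bash `time` value such as '2m5.567s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Figures copied from the two runs reported above.
before = {"real": "2m5.567s", "user": "4m3.948s", "sys": "2m0.233s"}
after = {"real": "0m23.966s", "user": "2m35.687s", "sys": "0m1.920s"}

ratios = {key: to_seconds(before[key]) / to_seconds(after[key]) for key in before}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x lower")
```

This gives roughly 5.2 for real, 1.6 for user and 62.6 for sys, which
matches the "five times" and "divided by 60" figures.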

In http://pad.lv/2009544 , Rafael also shared his performance numbers
and they are comparable to these. He used slightly different versions
(upstream releases rather than cherry-picked patches) but at least one
of the versions he used does not include any other performance change.
He also used different hardware, and this performance issue seems to
depend on the number of CPUs available, but he too obtained performance
several times better. Results on a given machine also vary very little
across runs (less than 2% variation on runs of size 10). They are also
very similar on a Raspberry Pi 4 (8GB).

The benchmark uses https://www.google.com/humans.txt which takes around
130ms to download on my machine but I modified the script to download
something only 20ms away. Results are so close to the ones using
humans.txt that they are within the error margin. This is consistent
with the high concurrency of the benchmark, which both saturates the
CPU and "hides" latencies that are relatively low.

Finally, there are positive reports on GitHub. Unfortunately they do
not always concern these patches alone, so I will not link to them
directly, but they have also been encouraging.

[Where problems could occur]
The change is spread over several patches which touch the internals of openssl. As such, the engine and provider functionality could be broken by these changes. Fortunately, in addition to upstream's code review, these patches are included in openssl 3.0.4 (iirc) and therefore in kinetic. No issue related to these changes was reported on launchpad or upstream.

However, it is possible that there were more patch dependencies than
these in either 3.0.3 or 3.0.4. In that case there could be problems.

[Patches]
The patches come directly from upstream and apply cleanly.

https://github.com/openssl/openssl/pull/18151#issuecomment-1118535602

* https://git.launchpad.net/~adrien-n/ubuntu/+source/openssl/tree/debian/patches/jammy-sru-0001-Drop-ossl_provider_clear_all_operation_bits-and-all-.patch?h=jammy-sru&id=04ef023920ab08fba214817523fba897527dfff0
* https://git.launchpad.net/~adrien-n/ubuntu/+source/openssl/tree/debian/patches/jammy-sru-0002-Refactor-method-construction-pre-and-post-condition.patch?h=jammy-sru&id=04ef023920ab08fba214817523fba897527dfff0
* https://git.launchpad.net/~adrien-n/ubuntu/+source/openssl/tree/debian/patches/jammy-sru-0003-Don-t-empty-the-method-store-when-flushing-the-query.patch?h=jammy-sru&id=04ef023920ab08fba214817523fba897527dfff0
* https://git.launchpad.net/~adrien-n/ubuntu/+source/openssl/tree/debian/patches/jammy-sru-0004-Make-it-possible-to-remove-methods-by-the-provider-t.patch?h=jammy-sru&id=04ef023920ab08fba214817523fba897527dfff0
* https://git.launchpad.net/~adrien-n/ubuntu/+source/openssl/tree/debian/patches/jammy-sru-0005-Complete-the-cleanup-of-an-algorithm-in-OSSL_METHOD_.patch?h=jammy-sru&id=04ef023920ab08fba214817523fba897527dfff0
* https://git.launchpad.net/~adrien-n/ubuntu/+source/openssl/tree/debian/patches/jammy-sru-0006-For-child-libctx-provider-don-t-count-self-reference.patch?h=jammy-sru&id=04ef023920ab08fba214817523fba897527dfff0
* https://git.launchpad.net/~adrien-n/ubuntu/+source/openssl/tree/debian/patches/jammy-sru-0007-Add-method-store-cache-flush-and-method-removal-to-n.patch?h=jammy-sru&id=04ef023920ab08fba214817523fba897527dfff0

=== Original description ===

This is about SRU'ing to Jammy the patches at
https://github.com/openssl/openssl/pull/18151#issuecomment-1118535602 .
They're purely performance but their impact is large. They have been
released as part of openssl 3.0.4 (they're among the first after 3.0.3)
which has been included in Kinetic.

** Affects: openssl (Ubuntu)
     Importance: Undecided
         Status: New

** Affects: openssl (Ubuntu Jammy)
     Importance: Medium
     Assignee: Adrien Nader (adrien-n)
         Status: In Progress

** Affects: openssl (Ubuntu Lunar)
     Importance: Undecided
         Status: Fix Released

-- 
openssl: backport to jammy "clear method store / query cache confusion"
https://bugs.launchpad.net/bugs/2033422
You received this bug notification because you are a member of Ubuntu Sponsors, which is subscribed to the bug report.


