ACK/Cmnt: [SRU][J][I][F][PATCH v2 0/2] rcu stalls with many storage key guests (LP: 1975582)
Stefan Bader
stefan.bader at canonical.com
Wed Jun 15 07:50:27 UTC 2022
On 10.06.22 14:55, frank.heimes at canonical.com wrote:
> BugLink: https://bugs.launchpad.net/bugs/1975582
>
> SRU Justification:
>
> [Impact]
>
> * Ubuntu on s390x KVM environments with lots of large guests with storage
> keys can be affected by rcu stalls.
>
> * These rcu stalls can cause the system to crash/dump.
>
> [Fix]
>
> * 3ae11dbcfac9 3ae11dbcfac906a8c3a480e98660a823130dc16a "s390/mm: use non-quiescing sske for KVM switch to keyed guest"
>
> * 6d5946274df1 6d5946274df1fff539a7eece458a43be733d1db8 "s390/gmap: voluntarily schedule during key setting"
>
> [Test Plan]
>
> * There is no trigger or direct test or re-creation of the
> problem situation possible, but...
>
> * an IBM z13 or LinuxONE (or newer) LPAR is needed that
> runs Ubuntu Server 20.04 LTS or 18.04 LTS with HWE kernel
> and acts as KVM host with again several large guests running
> on top with storage keys.
>
> * Let such a system run for days under significant load
> and watch the logs for rcu issues.
>
> * Prior to the submission of this SRU, patched test kernels
> for focal 5.4 and bionic hwe-5.4 were created and tested.
> They ran for days in a staging environment at IBM
> without further issues.
>
> * The modifications are all limited to s390x.
>
> * A test kernel was built (see below) that ran in a test environment
> at IBM under appropriate load for several days.
>
> [Where problems could occur]
>
> * Due to the change for the KVM switch to keyed guest
> from classic sske to non-quiescing sske
> the KVM behaviour might have changed and the storage keys harmed.
>
> * The now more generous scheduling while setting keys
> has an impact on the guest memory management and mapping
> which will lead to different performance.
>
> * This, with the introduction of __s390_enable_skey_pmd and
> cond_resched, might increase the overhead in certain situations,
> but eventually improves the responsiveness over time,
> hence avoiding rcu stalls.
>
> [Other Info]
>
> * Since the patches are upstream in 5.19-rc1,
> they will be included in the kernel that is planned for kinetic (5.19).
>
> * Hence this is an SRU to jammy, impish and focal.
>
> v2: since this SRU is not only for J, but also for I and F
>
> Christian Borntraeger (2):
>   s390/gmap: voluntarily schedule during key setting
>   s390/mm: use non-quiescing sske for KVM switch to keyed guest
>
>  arch/s390/mm/gmap.c    | 14 ++++++++++++++
>  arch/s390/mm/pgtable.c |  2 +-
>  2 files changed, 15 insertions(+), 1 deletion(-)
>
For Impish, there is a chance that this will not make it. There is only one
cycle until EOL, so if this is important it would be good if you explicitly
mentioned this (as a reply here).
Acked-by: Stefan Bader <stefan.bader at canonical.com>
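
Side note on the quoted test plan: the "watch the logs for rcu issues" step
could be scripted roughly as below. This is only a minimal sketch, not part of
the submission. The function name find_rcu_stalls is made up for illustration,
and the regular expression assumes the usual wording of the kernel's RCU CPU
stall warnings ("self-detected stall on CPU", "detected stalls on
CPUs/tasks"); adjust it if the logs on the test host look different.

```python
import re

# Assumption: RCU CPU stall warnings contain "self-detected stall" or
# "detected stalls on" somewhere after an "rcu..." prefix.
STALL_RE = re.compile(
    r"rcu\w*.*(self-detected stall|detected stalls? on)",
    re.IGNORECASE,
)


def find_rcu_stalls(lines):
    """Return the log lines that look like RCU stall warnings."""
    return [line.rstrip("\n") for line in lines if STALL_RE.search(line)]
```

The function takes any iterable of log lines, for example the lines of a
saved kernel log file, and an empty result after several days under load
would match the outcome described in the test plan.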