ACK: [SRU][F][PULL] CryptoExpress EP11 cards are going offline
Krzysztof Kozlowski
krzysztof.kozlowski at canonical.com
Tue Aug 31 13:53:29 UTC 2021
On 31/08/2021 15:19, frank.heimes at canonical.com wrote:
> From: Frank Heimes <frank.heimes at canonical.com>
>
> BugLink: https://bugs.launchpad.net/bugs/1939618
>
> [Impact]
>
> * With current focal kernels IBM Z CryptoExpress adapters in EP11 mode go offline
> in case of unknown error indications from the hardware.
>
> * This does not only lead to a software fallback,
> but can also lead to errors and crashes,
> if certain crypto operations are currently ongoing.
>
> * A rework of the AP bus and zcrypt device driver,
> as it was done in 5.11, fixes the situation.
>
> * From the below range of commits,
> the last 1/3 are the ones that fix the issue mentioned here
> and the others are pre-requisites to get the relevant ones applied.
>
> * In theory the patch set could have been made smaller,
> but at the cost that the code would be a mix of old and new,
> with maybe some new code snippets,
> hence it would diverge from what was accepted upstream (in 5.11 and above),
> the risk would increase,
> and it would mean more maintenance effort and less test coverage.
>
> [Fix]
>
> * The SRU request was created as a pull request,
> so please pull f904c400c9c4^..f6d9ab1de03a
> (that is, from f904c400c9c4 up to head/f6d9ab1de03a, both included)
> from here: https://code.launchpad.net/~fheimes/+git/lp1939618
>
> [Test Case]
>
> * An Ubuntu Server 20.04 on IBM Z or LinuxONE installation is required,
> with ideally three attached CryptoExpress adapters running
> in CCA, EP11 and accelerator mode.
>
> * Run stress test on these three CryptoExpress adapters.
>
> * IBM has such stress tests and ran these based on a patched Ubuntu 20.04 kernel.
> The tests come with a special focus on error path tests,
> since this patch set mainly focuses on better error path handling.
>
> * Note: A new config option for the zcrypt driver was introduced
> that enables the possibility to inject erroneous messages.
>
> * An application exists that generates such messages and thus tests these error paths.
>
> * Canonical's focus will be mainly on regression testing.
>
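
For the regression-testing side, a quick way to see whether any adapter
dropped offline during a stress run is to read the AP bus entries in sysfs.
The following is only a minimal sketch, not part of the patch set or of the
IBM tests. It assumes the usual /sys/bus/ap/devices/cardXX layout with
`type` and `online` attributes, and it assumes the last letter of the type
string (for example CEX7P) encodes the mode: A for accelerator, C for CCA,
P for EP11. The function names are hypothetical.

```python
from pathlib import Path

# Assumption: the trailing letter of the card type encodes the adapter mode.
MODE_BY_SUFFIX = {"A": "accelerator", "C": "CCA", "P": "EP11"}


def list_ap_cards(root="/sys/bus/ap/devices"):
    """Return (name, type, mode, online) tuples for each cardXX entry."""
    cards = []
    base = Path(root)
    if not base.is_dir():
        # No AP bus present (not s390x, or no adapters configured).
        return cards
    for card in sorted(base.glob("card*")):
        type_file = card / "type"
        online_file = card / "online"
        if not (type_file.is_file() and online_file.is_file()):
            continue
        card_type = type_file.read_text().strip()
        mode = MODE_BY_SUFFIX.get(card_type[-1:], "unknown")
        online = online_file.read_text().strip() == "1"
        cards.append((card.name, card_type, mode, online))
    return cards


def offline_cards(cards):
    """Names of the cards that currently report online == 0."""
    return [name for name, _type, _mode, online in cards if not online]


if __name__ == "__main__":
    for name, card_type, mode, online in list_ap_cards():
        state = "online" if online else "OFFLINE"
        print(f"{name}: {card_type} ({mode}) {state}")
```

Running it before and after the stress test and comparing the output of
offline_cards() should be enough to spot an EP11 card that went offline.
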
> [Regression Potential]
>
> * As with all modifications, there is a certain risk of regressions,
> especially with bigger patch sets.
>
> * But the modifications here are limited to the s390x platform,
> and there again largely to the s390x hardware crypto stack and driver
> (CryptoExpress adapter) which is optional hardware.
> (See the diff stat in the comment below.)
>
> * The crypto-specific tools (located in the s390-tools package) may no longer work
> with this patched driver.
> But this got tested by IBM with the result that the changes are fully backward compatible.
> The 'older' s390-tools package (from focal) just cannot show and control the new (config state) feature,
> but the functionality covered by the older s390-tools package is fully covered by this patch set.
>
> * The core of this patch set went into the 5.11 kernel upstream,
> hence is in hirsute (and has also been picked by other distros).
>
> * Since this patch set is a rework of the AP bus and zcrypt driver code,
> it may now show new errors that were never thrown before, like for example memory leaks.
> However, this is not unique to this patch set,
> it is the same for upstream, Hirsute and Impish (and other distros).
>
> * The patches are all upstream and all needed upstream commits could just be cherry-picked,
> hence no modifications were needed.
>
> * So the commits were not only tested by IBM upfront,
> but a patched focal master-next kernel is also available as PPA (see comment below) for further testing.
>
> * This patch set was also tested on 5.11, where two issues were found whose fixes are already part of this set.
>
> [Other]
>
> * I iterated through all commits and found that the latest ones got upstream with 5.13,
> hence Impish includes all commits needed and is not affected!
>
> * Looks like all commits, except three, are even upstream with 5.11,
> but the missing three came in on top via upstream stable,
> hence Hirsute master-next includes all commits needed too and is also not affected!
>
> * But none of the commits could be found in current Focal master-next (aot: 5.4.0-84),
> the first commits from this set started to land with 5.7,
> hence this SRU request is for focal only.
>
> ---
>
> The following changes since commit 9e4ec1b8ea389754e30927a98a63f3ffa6e664a7:
>
> UBUNTU: upstream stable to v5.4.140 (2021-08-27 15:52:30 -0600)
>
> are available in the Git repository at:
>
> git://git.launchpad.net/~fheimes/+git/lp1939618 f6d9ab1de03a36af1a3add6a31642bd6e8dbfd75
>
> for you to fetch changes up to f6d9ab1de03a36af1a3add6a31642bd6e8dbfd75:
>
> s390/ap: Fix hanging ioctl caused by wrong msg counter (2021-08-30 09:13:20 +0200)
>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski at canonical.com>
Best regards,
Krzysztof