APPLIED: [SRU][B][F][PATCH 0/3] Fix LIO from hanging in iscsit_free_session and iscsit_stop_session
Kelsey Skunberg
kelsey.skunberg at canonical.com
Fri Apr 24 17:25:07 UTC 2020
On 2020-04-20 12:39:33 , Kelsey Skunberg wrote:
Applied to bionic/master-next and focal/master-next
-Kelsey
> BugLink: https://bugs.launchpad.net/bugs/1871688
>
> SRU Justification
>
> [Impact]
>
> (Following details are from the bug report)
>
> The target subsystem (LIO) can hang if multiple threads try to destroy iSCSI
> sessions simultaneously. This is reproducible on systems that have multiple
> targets with initiators regularly connecting/disconnecting.
>
> This may happen when a "targetcli iscsi/iqn.../tpg1 disable" command is
> executed when a logout operation is underway.
>
> The iscsi target doesn't handle such events in a correct way: two or more
> threads may end up sleeping while waiting for the driver to close the remaining
> connections on the session. When the connections are closed, the driver wakes
> up only the first thread that will then proceed to destroy the session
> structure. The remaining threads are blocked there forever, waiting on a
> completion synchronization mechanism that doesn't exist in memory anymore
> because it has been freed by the first thread.
>
> Note that if the blocked threads are somehow forced to wake up, they will try
> to free the same iSCSI session structure destroyed by the first thread, causing
> double frees, memory corruptions, etc.
>
> The driver has been reorganized so the concurrent threads will set a flag in
> the session structure to notify the driver that the session should be
> destroyed; then, they wait for the driver to close the remaining connections.
> When the connections are all closed, the driver will wake up all the threads
> and will wait for the refcount of the iSCSI session structure to reach zero.
> When the last thread wakes up, the refcount is decreased to zero and the driver
> can proceed to destroy the session structure because no one is referencing it
> anymore.
>
> Bug reporter witnessed this happening on hundreds of Ubuntu 16.04.5 systems.
> States this is a regression, because this did not occur several years ago. No
> detailed records from that far back to determine exactly which kernel reporter
> was running that was not affected by this bug (Believes it was either 4.8.x or
> 4.10.x).
>
> Attached in the bug report is the requested uname, version_signature, dmesg, and
> lspci from reporter's system. However, the reporter has seen this happen on a
> wide array of hardware: 2 to 24 cores, 8GB to 256GB RAM, both AMD and Intel
> CPUs, onboard storage and PCIe SAS cards, etc.
>
> This has been fixed in the upstream master branch, but it hasn't yet been
> backported to "-stable".
>
> [Fixes]
>
> These three commits should be backported:
> * https://github.com/torvalds/linux/commit/e49a7d994379278d3353d7ffc7994672752fb0ad
> * https://github.com/torvalds/linux/commit/57c46e9f33da530a2485fa01aa27b6d18c28c796
> * https://github.com/torvalds/linux/commit/626bac73371eed79e2afa2966de393da96cf925e
>
> [Test]
>
> This is reproducible on systems that have multiple targets with initiators
> regularly connecting/disconnecting by having multiple threads try and destroy
> iSCSI sessions simultaneiously.
>
> This may happen when a "targetcli iscsi/iqn.../tpg1 disable" command is executed
> when a logout operation is underway.
>
>
> [Regression Risk]
>
> Low, cherry picked from upstream with no changes.
>
> Verified applies cleanly to Bionic/master-next and Focal/master-next. Build
> tests pass.
>
> Maurizio Lombardi (3):
> scsi: target: remove boilerplate code
> scsi: target: fix hang when multiple threads try to destroy the same
> iscsi session
> scsi: target: iscsi: calling iscsit_stop_session() inside
> iscsit_close_session() has no effect
>
> drivers/target/iscsi/iscsi_target.c | 82 ++++++--------------
> drivers/target/iscsi/iscsi_target.h | 1 -
> drivers/target/iscsi/iscsi_target_configfs.c | 5 +-
> drivers/target/iscsi/iscsi_target_login.c | 5 +-
> include/target/iscsi/iscsi_target_core.h | 2 +-
> 5 files changed, 32 insertions(+), 63 deletions(-)
>
> --
> 2.20.1
More information about the kernel-team
mailing list