ACK: [SRU][B][F][PATCH 0/3] Fix LIO from hanging in iscsit_free_session and iscsit_stop_session
kamal at canonical.com
Thu Apr 23 16:59:51 UTC 2020
Acked-by: Kamal Mostafa <kamal at canonical.com>
On Mon, Apr 20, 2020 at 12:39:33PM -0600, Kelsey Skunberg wrote:
> BugLink: https://bugs.launchpad.net/bugs/1871688
> SRU Justification
> (Following details are from the bug report)
> The target subsystem (LIO) can hang if multiple threads try to destroy iSCSI
> sessions simultaneously. This is reproducible on systems that have multiple
> targets with initiators regularly connecting/disconnecting.
> This may happen when a "targetcli iscsi/iqn.../tpg1 disable" command is
> executed when a logout operation is underway.
> The iscsi target doesn't handle such events in a correct way: two or more
> threads may end up sleeping while waiting for the driver to close the remaining
> connections on the session. When the connections are closed, the driver wakes
> up only the first thread that will then proceed to destroy the session
> structure. The remaining threads are blocked there forever, waiting on a
> completion synchronization mechanism that doesn't exist in memory anymore
> because it has been freed by the first thread.
> Note that if the blocked threads are somehow forced to wake up, they will try
> to free the same iSCSI session structure destroyed by the first thread, causing
> double frees, memory corruptions, etc.
> The driver has been reorganized so the concurrent threads will set a flag in
> the session structure to notify the driver that the session should be
> destroyed; then, they wait for the driver to close the remaining connections.
> When the connections are all closed, the driver will wake up all the threads
> and will wait for the refcount of the iSCSI session structure to reach zero.
> When the last thread wakes up, the refcount is decreased to zero and the driver
> can proceed to destroy the session structure because no one is referencing it
> Bug reporter witnessed this happening on hundreds of Ubuntu 16.04.5 systems.
> States this is a regression, because this did not occur several years ago. No
> detailed records from that far back to determine exactly which kernel reporter
> was running that was not affected by this bug (Believes it was either 4.8.x or
> Attached in the bug report is the requested uname, version_signature, dmesg, and
> lspci from reporter's system. However, the reporter has seen this happen on a
> wide array of hardware: 2 to 24 cores, 8GB to 256GB RAM, both AMD and Intel
> CPUs, onboard storage and PCIe SAS cards, etc.
> This has been fixed in the upstream master branch, but it hasn't yet been
> backported to "-stable".
> These three commits should be backported:
> * https://github.com/torvalds/linux/commit/e49a7d994379278d3353d7ffc7994672752fb0ad
> * https://github.com/torvalds/linux/commit/57c46e9f33da530a2485fa01aa27b6d18c28c796
> * https://github.com/torvalds/linux/commit/626bac73371eed79e2afa2966de393da96cf925e
> This is reproducible on systems that have multiple targets with initiators
> regularly connecting/disconnecting by having multiple threads try and destroy
> iSCSI sessions simultaneiously.
> This may happen when a "targetcli iscsi/iqn.../tpg1 disable" command is executed
> when a logout operation is underway.
> [Regression Risk]
> Low, cherry picked from upstream with no changes.
> Verified applies cleanly to Bionic/master-next and Focal/master-next. Build
> tests pass.
> Maurizio Lombardi (3):
> scsi: target: remove boilerplate code
> scsi: target: fix hang when multiple threads try to destroy the same
> iscsi session
> scsi: target: iscsi: calling iscsit_stop_session() inside
> iscsit_close_session() has no effect
> drivers/target/iscsi/iscsi_target.c | 82 ++++++--------------
> drivers/target/iscsi/iscsi_target.h | 1 -
> drivers/target/iscsi/iscsi_target_configfs.c | 5 +-
> drivers/target/iscsi/iscsi_target_login.c | 5 +-
> include/target/iscsi/iscsi_target_core.h | 2 +-
> 5 files changed, 32 insertions(+), 63 deletions(-)
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
More information about the kernel-team