APPLIED: [SRU Xenial][PATCH 1/2] UBUNTU: SAUCE: (no-up) virtio-scsi: Decrement reqs counter before SCSI command requeue

Khaled Elmously khalid.elmously at canonical.com
Mon Oct 22 07:20:13 UTC 2018


Only patch 1/2 has been applied

On 2018-10-16 12:38:19 , Mauricio Faria de Oliveira wrote:
> BugLink: https://bugs.launchpad.net/bugs/1798110
> 
> The 'reqs' counter is incremented in the SCSI .queuecommand() path,
> right before virtscsi_queuecommand() is called, in either
>  - virtscsi_queuecommand_single(), or
>  - virtscsi_queuecommand_multi(), via virtscsi_pick_vq{_mq}().
> 
> And it's decremented only in the SCSI command completion callback
> (after the command is successfully queued and completed by adapter):
>  - virtscsi_complete_cmd().
> 
> This allows for the counter to be incremented but _not_ decremented
> if virtscsi_queuecommand() gets an error to add/kick the command to
> the virtio ring (i.e., virtscsi_kick_cmd() fails with not 0 nor EIO).
> 
>     static virtscsi_queuecommand(...)
>     {
>             ...
>             ret = virtscsi_kick_cmd(...)
>             if (ret == -EIO) {
>                     ...
>                     virtscsi_complete_cmd(vscsi, cmd);
>                     ...
>             } else if (ret != 0) {
>                     return SCSI_MLQUEUE_HOST_BUSY;
>             }
>             return 0;
>     }
> 
> In that case, the return code SCSI_MLQUEUE_HOST_BUSY causes the SCSI
> command to be requeued by the SCSI layer, which sends it again later
> in the .queuecommand() path -- incrementing the reqs counter _again_.
> 
> This may happen many times for the same SCSI command, depending on
> the virtio ring condition/implementation, so the reqs counter will
> never return to zero (it's decremented only once in the completion
> callback). And it may happen for (m)any SCSI commands in this path.
> 
> Unfortunately.. that causes a problem with a downstream/SAUCE patch
> for Xenial, which uses the reqs counter to sync with the completion
> callback: commit f1f609d8015e ("UBUNTU: SAUCE: (no-up) virtio-scsi:
> Fix race in target free"), and waits for the value to become zero.
> 
> This problem plus that patch prevent the SCSI target removal from
> finishing, eventually causing a CPU soft lockup on another CPU that
> is waiting for some kernel resource that is/remains locked in the
> stack chain of this CPU.
> 
> This has been verified 1) with a synthetic test case with QEMU+GDB
> that fakes the number of available elements in virtio ring for one
> time (harmless), so to force the SCSI command to be requeued, then
> uses QEMU monitor to remove the virtio-scsi target.
> 
> _AND_ 2) with the test-case reported by the customer (a for-loop on
> a cloud instance that repeatedly mounts the virtio-scsi drive, copy
> data out of it, unmount it, then detach the virtio-scsi drive).
> (Here, the problem usually happens in the 1st or 2nd iteration, but
> with the patch it has run for 35 iterations without any problems).
> 
> Upstream has done away with the reqs counter (originally used only
> to check if any requests were still active, for steering;  not for
> our sync purposes).  Instead of trying to find an alternative sync
> way for now let's just fix the behavior which we know is incorrect.
> 
> Signed-off-by: Mauricio Faria de Oliveira <mfo at canonical.com>
> Tested-by: Mauricio Faria de Oliveira <mfo at canonical.com>
> Tested-by: David Coronel <david.coronel at canonical.com>
> ---
>  drivers/scsi/virtio_scsi.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index b4a41d581021..572722e86bef 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -571,6 +571,10 @@ static int virtscsi_queuecommand(struct virtio_scsi *vscsi,
>  		virtscsi_complete_cmd(vscsi, cmd);
>  		spin_unlock_irqrestore(&req_vq->vq_lock, flags);
>  	} else if (ret != 0) {
> +		/* The SCSI command requeue will increment 'tgt->reqs' again. */
> +		struct virtio_scsi_target_state *tgt =
> +				scsi_target(sc->device)->hostdata;
> +		atomic_dec(&tgt->reqs);
>  		return SCSI_MLQUEUE_HOST_BUSY;
>  	}
>  	return 0;
> -- 
> 2.17.1
> 
> 
> -- 
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team




More information about the kernel-team mailing list