ACK/cmnt: [SRU Xenial][PATCH 0/2] Improve our SAUCE for virtio-scsi reqs counter (fix CPU soft lockup)

Tue Oct 16 16:42:23 UTC 2018

On 2018-10-16 12:38:18 , Mauricio Faria de Oliveira wrote:
> BugLink: https://bugs.launchpad.net/bugs/1798110
> 
> [Impact] 
> 
>  * Detaching virtio-scsi disk in Xenial guest can cause
>    CPU soft lockup in guest (and take 100% CPU in host).
> 
>  * It may prevent further progress on other tasks that
>    depend on resources locked earlier in the SCSI target
>    removal stack, and/or impact other SCSI functionality.
> 
>  * The fix resolves a corner case in the requests counter
>    in the virtio SCSI target, which impacts a downstream
>    (SAUCE) patch in the virtio-scsi target removal handler
>    that depends on the requests counter value to be zero.
> 
> [Test Case]
> 
>  * See LP #1798110 (this bug)'s comment #3 (too long for
>    this section -- synthetic case with GDB+QEMU) and
>    comment #4 (organic test case in cloud instance).
> 
> [Regression Potential] 
> 
>  * It seem low -- this only affects the SCSI command requeue
>    path with regards to the reference counter, which is only
>    used with real chance of problems in our downstream patch
>    (which is now passing this testcase).
> 
>  * The other less serious issue would be decrementing it to
>    a negative / < 0 value, which is not possible with this
>    driver logic (see commit message), because the reqs counter
>    is always incremented before calling virtscsi_queuecommand(),
>    where this decrement operation is inserted.
> 
> 
> Mauricio Faria de Oliveira (2):
>   UBUNTU: SAUCE: (no-up) virtio-scsi: Decrement reqs counter before SCSI
>     command requeue
>   UBUNTU: SAUCE: (no-up) virtio-scsi: (cosmetic) reestablish
>     virtscsi_pick_vq_mq() signature
> 
>  drivers/scsi/virtio_scsi.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 

 - Maybe I'm misunderstanding what you mean by 'no-up'. I assumed it means "no operation", as in, no change in the resulting binary (e.g., whitespace changes, comment changes, casting etc.). Both patches seem to be "op" changes
 - The second patch doesn't actually appear to be related to the 'reqs' counter or any related soft-lockups
 - It seems strange to me that you'd have a loop that repeatedly decrements a counter because you know it will be incremented again when about to retry. I would think it makes more sense to increment the counter only once when queuing the command and decrement only once on success or "giving up" - though I assume you did it this way for a reason?

None of the above points are major enough for me to NAK this patchset, but I wanted to bring them to your attention (in case you missed something).

Acked-by: Khalid Elmously <khalid.elmously at canonical.com>