ACK/cmnt: [SRU Xenial][PATCH 0/2] Improve our SAUCE for virtio-scsi reqs counter (fix CPU soft lockup)
Khaled Elmously
khalid.elmously at canonical.com
Tue Oct 16 16:42:23 UTC 2018
On 2018-10-16 12:38:18 , Mauricio Faria de Oliveira wrote:
> BugLink: https://bugs.launchpad.net/bugs/1798110
>
> [Impact]
>
> * Detaching virtio-scsi disk in Xenial guest can cause
> CPU soft lockup in guest (and take 100% CPU in host).
>
> * It may prevent further progress on other tasks that
> depend on resources locked earlier in the SCSI target
> removal stack, and/or impact other SCSI functionality.
>
> * The fix resolves a corner case in the requests counter
> in the virtio SCSI target, which impacts a downstream
> (SAUCE) patch in the virtio-scsi target removal handler
> that depends on the requests counter value to be zero.
>
> [Test Case]
>
> * See LP #1798110 (this bug)'s comment #3 (too long for
> this section -- synthetic case with GDB+QEMU) and
> comment #4 (organic test case in cloud instance).
>
> [Regression Potential]
>
> * It seem low -- this only affects the SCSI command requeue
> path with regards to the reference counter, which is only
> used with real chance of problems in our downstream patch
> (which is now passing this testcase).
>
> * The other less serious issue would be decrementing it to
> a negative / < 0 value, which is not possible with this
> driver logic (see commit message), because the reqs counter
> is always incremented before calling virtscsi_queuecommand(),
> where this decrement operation is inserted.
>
>
> Mauricio Faria de Oliveira (2):
> UBUNTU: SAUCE: (no-up) virtio-scsi: Decrement reqs counter before SCSI
> command requeue
> UBUNTU: SAUCE: (no-up) virtio-scsi: (cosmetic) reestablish
> virtscsi_pick_vq_mq() signature
>
> drivers/scsi/virtio_scsi.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
- Maybe I'm misunderstanding what you mean by 'no-up'. I assumed it means "no operation", as in, no change in the resulting binary (e.g., whitespace changes, comment changes, casting etc.). Both patches seem to be "op" changes
- The second patch doesn't actually appear to be related to the 'reqs' counter or any related soft-lockups
- It seems strange to me that you'd have a loop that repeatedly decrements a counter because you know it will be incremented again when about to retry. I would think it makes more sense to increment the counter only once when queuing the command and decrement only once on success or "giving up" - though I assume you did it this way for a reason?
None of the above points are major enough for me to NAK this patchset, but I wanted to bring them to your attention (in case you missed something).
Acked-by: Khalid Elmously <khalid.elmously at canonical.com>
More information about the kernel-team
mailing list