[Acked] [PATCH] [SCSI] scsi_remove_target: fix softlockup regression on hot remove

Andy Whitcroft apw at canonical.com
Fri Oct 19 12:26:36 UTC 2012


On Fri, Oct 19, 2012 at 12:48:57PM +0200, Louis Bouchard wrote:
> From: Dan Williams <djbw at fb.com>
> 
> John reports:
>  BUG: soft lockup - CPU#2 stuck for 23s! [kworker/u:8:2202]
>  [..]
>  Call Trace:
>   [<ffffffff8141782a>] scsi_remove_target+0xda/0x1f0
>   [<ffffffff81421de5>] sas_rphy_remove+0x55/0x60
>   [<ffffffff81421e01>] sas_rphy_delete+0x11/0x20
>   [<ffffffff81421e35>] sas_port_delete+0x25/0x160
>   [<ffffffff814549a3>] mptsas_del_end_device+0x183/0x270
> 
> ...introduced by commit 3b661a9 "[SCSI] fix hot unplug vs async scan race".
> 
> Don't restart lookup of more stargets in the multi-target case, just
> arrange to traverse the list once, on the assumption that new targets
> are always added at the end.  There is no guarantee that the target will
> change state in scsi_target_reap() so we can end up spinning if we
> restart.
> 
> Cc: <stable at vger.kernel.org>
> Acked-by: Jack Wang <jack_wang at usish.com>
> LKML-Reference: <CAEhu1-6wq1YsNiscGMwP4ud0Q+MrViRzv=kcWCQSBNc8c68N5Q at mail.gmail.com>
> Reported-by: John Drescher <drescherjm at gmail.com>
> Tested-by: John Drescher <drescherjm at gmail.com>
> Signed-off-by: Dan Williams <djbw at fb.com>
> Signed-off-by: James Bottomley <JBottomley at Parallels.com>
> (cherry picked from commit bc3f02a795d3b4faa99d37390174be2a75d091bd)
> 
> BugLink: http://bugs.launchpad.net/bugs/1056746
> 
> Signed-off-by: Louis Bouchard <louis.bouchard at canonical.com>
> ---
>  drivers/scsi/scsi_sysfs.c |   30 ++++++++++++++----------------
>  1 file changed, 14 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index bb7c482..08d48a3 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -1023,33 +1023,31 @@ static void __scsi_remove_target(struct scsi_target *starget)
>  void scsi_remove_target(struct device *dev)
>  {
>  	struct Scsi_Host *shost = dev_to_shost(dev->parent);
> -	struct scsi_target *starget, *found;
> +	struct scsi_target *starget, *last = NULL;
>  	unsigned long flags;
>  
> - restart:
> -	found = NULL;
> +	/* remove targets being careful to lookup next entry before
> +	 * deleting the last
> +	 */
>  	spin_lock_irqsave(shost->host_lock, flags);
>  	list_for_each_entry(starget, &shost->__targets, siblings) {
>  		if (starget->state == STARGET_DEL)
>  			continue;
>  		if (starget->dev.parent == dev || &starget->dev == dev) {
> -			found = starget;
> -			found->reap_ref++;
> -			break;
> +			/* assuming new targets arrive at the end */
> +			starget->reap_ref++;
> +			spin_unlock_irqrestore(shost->host_lock, flags);
> +			if (last)
> +				scsi_target_reap(last);
> +			last = starget;
> +			__scsi_remove_target(starget);
> +			spin_lock_irqsave(shost->host_lock, flags);
>  		}
>  	}
>  	spin_unlock_irqrestore(shost->host_lock, flags);
>  
> -	if (found) {
> -		__scsi_remove_target(found);
> -		scsi_target_reap(found);
> -		/* in the case where @dev has multiple starget children,
> -		 * continue removing.
> -		 *
> -		 * FIXME: does such a case exist?
> -		 */
> -		goto restart;
> -	}
> +	if (last)
> +		scsi_target_reap(last);
>  }
>  EXPORT_SYMBOL(scsi_remove_target);

Seems to do what is claimed.  Testing also shows good results.
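
For readers following along, the single-pass traversal the patch switches to
can be sketched in user-space C. This is a simplified model under stated
assumptions, not the kernel code: the host lock is omitted, a plain singly
linked list stands in for shost->__targets, and the names struct target,
remove_one(), reap(), remove_targets() and the "stuck" flag are hypothetical
stand-ins. The "stuck" flag models the case from the commit message where a
target does not change state after removal; the loop still visits it once and
moves on, where a restart-from-the-head loop would find it again every time.

```c
#include <assert.h>
#include <stddef.h>

enum target_state { TARGET_RUNNING, TARGET_DEL };

/* Hypothetical stand-in for struct scsi_target. */
struct target {
	int parent_id;            /* models starget->dev.parent */
	enum target_state state;  /* models starget->state */
	int reap_ref;             /* models starget->reap_ref */
	int stuck;                /* if set, removal does not change state */
	struct target *next;      /* models the siblings list linkage */
};

/* Models __scsi_remove_target(); a stuck target keeps its old state. */
static void remove_one(struct target *t)
{
	if (!t->stuck)
		t->state = TARGET_DEL;
}

/* Models scsi_target_reap(): drop the reference taken by the walker. */
static void reap(struct target *t)
{
	assert(t->reap_ref > 0);
	t->reap_ref--;
}

/*
 * Walk the list exactly once. The current match stays pinned by its
 * reference until the walk has moved on to the next match, so the
 * link used to advance is never taken from an unpinned entry.
 * Returns the number of targets that were handed to remove_one().
 */
int remove_targets(struct target *head, int parent_id)
{
	struct target *t, *last = NULL;
	int removed = 0;

	for (t = head; t != NULL; t = t->next) {
		if (t->state == TARGET_DEL)
			continue;
		if (t->parent_id != parent_id)
			continue;

		t->reap_ref++;          /* pin the current match */
		if (last)
			reap(last);     /* release the previous match */
		last = t;
		remove_one(t);
		removed++;
	}

	if (last)
		reap(last);             /* release the final match */
	return removed;
}
```

In this model the work is bounded by the length of the list, whether or not
any given target reaches the deleted state.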

Acked-by: Andy Whitcroft <apw at canonical.com>

-apw



