APPLIED: [focal:linux-azure][PATCH 1/1] Drivers: hv: vmbus: hibernation: do not hang forever in vmbus_bus_resume()

Ian May ian.may at canonical.com
Mon Oct 26 18:30:36 UTC 2020


Applied to Azure Focal/master-next

Thanks!
Ian

On 2020-10-23 17:39:43, Marcelo Henrique Cerri wrote:
> From: Dexuan Cui <decui at microsoft.com>
> 
> BugLink: https://bugs.launchpad.net/bugs/1894895
> 
> After we Stop and later Start a VM that uses Accelerated Networking (NIC
> SR-IOV), the VF vmbus device's Instance GUID can currently change. As a
> result, after vmbus_bus_resume() -> vmbus_request_offers(), vmbus_onoffer()
> cannot find the original vmbus channel of the VF, so we never complete()
> vmbus_connection.ready_for_resume_event in check_ready_for_resume_event(),
> and the VM hangs in vmbus_bus_resume() forever.
> 
> Fix the issue by adding a timeout, so that resuming can still succeed and
> the saved state is not lost. According to my test, the user can then
> disable Accelerated Networking and SSH into the VM for further recovery.
> Also prevent the VM in question from suspending again.
> 
> The host will be fixed so that, in the future, the Instance GUID stays the
> same across hibernation.
> 
> Fixes: d8bd2d442bb2 ("Drivers: hv: vmbus: Resume after fixing up old primary channels")
> Signed-off-by: Dexuan Cui <decui at microsoft.com>
> Reviewed-by: Michael Kelley <mikelley at microsoft.com>
> Link: https://lore.kernel.org/r/20200905025555.45614-1-decui@microsoft.com
> Signed-off-by: Wei Liu <wei.liu at kernel.org>
> (cherry picked from commit 19873eec7e13fda140a0ebc75d6664e57c00bfb1)
> Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri at canonical.com>
> ---
> 
> This is a clean cherry-pick from upstream. I was able to reproduce the
> issue and confirm that the patch fixes the problem described in the bug.
> 
> 5.8 doesn't need the fix because it was already applied via upstream
> stable updates.
> 
> ---
>  drivers/hv/vmbus_drv.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 42fa92b005df..847c652ad00e 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -2230,7 +2230,10 @@ static int vmbus_bus_suspend(struct device *dev)
>  	if (atomic_read(&vmbus_connection.nr_chan_close_on_suspend) > 0)
>  		wait_for_completion(&vmbus_connection.ready_for_suspend_event);
>  
> -	WARN_ON(atomic_read(&vmbus_connection.nr_chan_fixup_on_resume) != 0);
> +	if (atomic_read(&vmbus_connection.nr_chan_fixup_on_resume) != 0) {
> +		pr_err("Can not suspend due to a previous failed resuming\n");
> +		return -EBUSY;
> +	}
>  
>  	mutex_lock(&vmbus_connection.channel_mutex);
>  
> @@ -2304,7 +2307,9 @@ static int vmbus_bus_resume(struct device *dev)
>  
>  	vmbus_request_offers();
>  
> -	wait_for_completion(&vmbus_connection.ready_for_resume_event);
> +	if (wait_for_completion_timeout(
> +		&vmbus_connection.ready_for_resume_event, 10 * HZ) == 0)
> +		pr_err("Some vmbus device is missing after suspending?\n");
>  
>  	/* Reset the event for the next suspend. */
>  	reinit_completion(&vmbus_connection.ready_for_suspend_event);
> -- 
> 2.25.1
> 
> 
> -- 
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team


