ACK/Cmnt: [focal:linux-azure][LP:#1894896][PATCH] hv_netvsc: Fix hibernation for mlx5 VF driver
Marcelo Henrique Cerri
marcelo.cerri at canonical.com
Wed Oct 28 12:05:05 UTC 2020
On Wed, Oct 28, 2020 at 09:43:11AM +0100, Stefan Bader wrote:
> On 27.10.20 18:18, Marcelo Henrique Cerri wrote:
> > From: Dexuan Cui <decui at microsoft.com>
> >
> > BugLink: https://bugs.launchpad.net/bugs/1894896
> >
> > mlx5_suspend()/resume() keep the network interface, so during hibernation
> > netvsc_unregister_vf() and netvsc_register_vf() are not called, and hence
> > netvsc_resume() should call netvsc_vf_changed() to switch the data path
> > back to the VF after hibernation. Note: after we close and re-open the
> > vmbus channel of the netvsc NIC in netvsc_suspend() and netvsc_resume(),
> > the data path is implicitly switched to the netvsc NIC. Similarly,
> > netvsc_suspend() should not call netvsc_unregister_vf(), otherwise the VF
> > can no longer be used after hibernation.
> >
> > For mlx4, since the VF network interafce is explicitly destroyed and
> > re-created during hibernation (see mlx4_suspend()/resume()), hv_netvsc
> > already explicitly switches the data path from and to the VF automatically
> > via netvsc_register_vf() and netvsc_unregister_vf(), so mlx4 doesn't need
> > this fix. Note: mlx4 can still work with the fix because in
> > netvsc_suspend()/resume() ndev_ctx->vf_netdev is NULL for mlx4.
> >
> > Fixes: 0efeea5fb153 ("hv_netvsc: Add the support of hibernation")
> > Signed-off-by: Dexuan Cui <decui at microsoft.com>
> > Signed-off-by: Jakub Kicinski <kuba at kernel.org>
> > (cherry picked from commit 19162fd4063a3211843b997a454b505edb81d5ce)
> > Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri at canonical.com>
> Acked-by: Stefan Bader <stefan.bader at canonical.com>
> > ---
>
> Just two notes not related to the patch itself. One just for whoever applies it
> later. Maybe no longer an issue but I remember some issues with the cutoff hints
> when there are multiple of them. I would maybe manually delete things from the
> downloaded patch...
> Second is about the primary tasks on launchpad. This always relates to the
> current development series and is a moving target. Right now, there would be no
> source for azure there, so that task should be invalid. The reason for not
> leaving around any tasks in an open (new, confirmed, triaged, in progress) state
> is that this prevents the bug report from considered done. So it still shows up
> in bug searches and is counted as "active" bug. And there is already enough of
> those.
Thanks for the explanation, Stefan. What issue did you have with the
cutoff hints? Was the comment after the three dashed included to the
commit message?
>
> -Stefan
>
> >
> > Marcelo's comments:
> >
> > I was able to reproduce the problem with the current version from
> > focal-updates following the test instructions from the LP bug and I
> > was also able to confirm the fix solves the problem with a test
> > kernel:
> >
> > https://kernel.ubuntu.com/~mhcerri/azure/focal-linux-azure_lp1894896.tgz
> >
> > 5.8 already received the fix via an upstream stable update
> > (LP:#1894896) and hibernation is not supported in 4.15 linux-azure.
> >
> > Besides that the fix is a clean cherry-pick from upstream.
> >
> > ---
> > drivers/net/hyperv/netvsc_drv.c | 16 +++++++++++-----
> > 1 file changed, 11 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> > index 33d9d17ee0e0..be5ede2a6b94 100644
> > --- a/drivers/net/hyperv/netvsc_drv.c
> > +++ b/drivers/net/hyperv/netvsc_drv.c
> > @@ -2551,8 +2551,8 @@ static int netvsc_remove(struct hv_device *dev)
> > static int netvsc_suspend(struct hv_device *dev)
> > {
> > struct net_device_context *ndev_ctx;
> > - struct net_device *vf_netdev, *net;
> > struct netvsc_device *nvdev;
> > + struct net_device *net;
> > int ret;
> >
> > net = hv_get_drvdata(dev);
> > @@ -2568,10 +2568,6 @@ static int netvsc_suspend(struct hv_device *dev)
> > goto out;
> > }
> >
> > - vf_netdev = rtnl_dereference(ndev_ctx->vf_netdev);
> > - if (vf_netdev)
> > - netvsc_unregister_vf(vf_netdev);
> > -
> > /* Save the current config info */
> > ndev_ctx->saved_netvsc_dev_info = netvsc_devinfo_get(nvdev);
> >
> > @@ -2587,6 +2583,7 @@ static int netvsc_resume(struct hv_device *dev)
> > struct net_device *net = hv_get_drvdata(dev);
> > struct net_device_context *net_device_ctx;
> > struct netvsc_device_info *device_info;
> > + struct net_device *vf_netdev;
> > int ret;
> >
> > rtnl_lock();
> > @@ -2599,6 +2596,15 @@ static int netvsc_resume(struct hv_device *dev)
> > netvsc_devinfo_put(device_info);
> > net_device_ctx->saved_netvsc_dev_info = NULL;
> >
> > + /* A NIC driver (e.g. mlx5) may keep the VF network interface across
> > + * hibernation, but here the data path is implicitly switched to the
> > + * netvsc NIC since the vmbus channel is closed and re-opened, so
> > + * netvsc_vf_changed() must be used to switch the data path to the VF.
> > + */
> > + vf_netdev = rtnl_dereference(net_device_ctx->vf_netdev);
> > + if (vf_netdev && netvsc_vf_changed(vf_netdev) != NOTIFY_OK)
> > + ret = -EINVAL;
> > +
> > rtnl_unlock();
> >
> > return ret;
> >
>
>
--
Regards,
Marcelo
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 659 bytes
Desc: not available
URL: <https://lists.ubuntu.com/archives/kernel-team/attachments/20201028/20922ce6/attachment.sig>
More information about the kernel-team
mailing list