[SRU][J:linux-bluefield][PATCH v1 1/1] UBUNTU: SAUCE: mlxbf-gige: Vitesse PHY stuck in a bad state during reboot test

Asmaa Mnebhi asmaa at nvidia.com
Mon Apr 29 20:01:15 UTC 2024


BugLink: https://bugs.launchpad.net/bugs/2064163

During the QA reboot test, the BF3 Vitesse PHY gets stuck
in a bad state, resulting in no IP provisioning. The only
way to recover is a power cycle. We may have found a
software workaround that avoids entering this state in the
first place: suspend the PHY during graceful shutdown.
Suspending the PHY powers it down by setting bit 11 of PHY
register 0 to 1. This workaround passed 1800 reboots on
QA's setup.

Signed-off-by: Asmaa Mnebhi <asmaa at nvidia.com>
Reviewed-by: David Thompson <davthompson at nvidia.com>
---
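
Note for reviewers (not part of the commit): the power-down step
described in the commit message can be sketched in user space as
follows. This is only an illustrative model, not the driver code. The
register offset and bit mask match MII_BMCR and BMCR_PDOWN in
include/uapi/linux/mii.h, which is the bit genphy_suspend() sets; the
fake_phy_* names and the register array are hypothetical stand-ins
for real MDIO accesses.

```c
#include <assert.h>
#include <stdint.h>

/* Values match include/uapi/linux/mii.h. */
#define MII_BMCR   0x00    /* Basic Mode Control Register (PHY register 0) */
#define BMCR_PDOWN 0x0800  /* bit 11: power down */

#define FAKE_PHY_NUM_REGS 32

/* Hypothetical stand-in for the PHY's MDIO register file. */
static uint16_t fake_phy_regs[FAKE_PHY_NUM_REGS];

/* Hypothetical helper: read-modify-write that sets bits in one register. */
static void fake_phy_set_bits(unsigned int reg, uint16_t bits)
{
	assert(reg < FAKE_PHY_NUM_REGS);
	fake_phy_regs[reg] |= bits;
}

/* Model of the suspend step: set the power-down bit, leave the rest alone.
 * Returns the resulting BMCR value.
 */
static uint16_t fake_phy_power_down(void)
{
	fake_phy_set_bits(MII_BMCR, BMCR_PDOWN);
	return fake_phy_regs[MII_BMCR];
}
```

For example, starting from a BMCR value of 0x1140 (autonegotiation
enabled, full duplex, 1000 Mb/s), fake_phy_power_down() returns 0x1940:
only bit 11 changes.
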
 .../ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c  | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
index 56235cef5cd6..ccf48640999d 100644
--- a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
+++ b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
@@ -539,6 +539,19 @@ static void mlxbf_gige_shutdown(struct platform_device *pdev)
 {
 	struct mlxbf_gige *priv = platform_get_drvdata(pdev);
 
+	/* The Vitesse PHY gets stuck in a non-working state during the reboot test.
+	 * The only way to recover is to toggle the PHY hard reset signal.
+	 * Unfortunately, software has no way to toggle the PHY hard reset.
+	 * We found that suspending the PHY before reboot might prevent the PHY
+	 * from entering this stuck state.
+	 *
+	 * Note that dev_close() calls phy_suspend() but the Vitesse PHY driver
+	 * does not define a suspend function. This is why we need to call
+	 * genphy_suspend() explicitly here.
+	 */
+	if (priv->hw_version == MLXBF_GIGE_BLUEFIELD3)
+		genphy_suspend(priv->netdev->phydev);
+
 	rtnl_lock();
 	netif_device_detach(priv->netdev);
 
-- 
2.30.1
