[PATCH 0/1] [SRU][Focal] devlink: don't do reporter recovery if the state is healthy

Jeff Lane jeffrey.lane at canonical.com
Wed Feb 17 21:51:27 UTC 2021


BugLink: https://bugs.launchpad.net/bugs/1915403

[Impact]
Currently in Focal, devlink health reporter recovery is performed even when
the reporter state is healthy.

[Fix]
402818205c9e devlink: don't do reporter recovery if the state is healthy
This upstream commit, from kernel v5.5-rc1, applies cleanly to the Focal
tree.
The commit prevents reporter recovery when the reporter is in the healthy
state. With it applied, issuing
# devlink health recover pci/0000:05:00.0 reporter fw_fatal
on a reporter in the healthy state returns successfully, but dmesg stays clean
and the recover counter does not change.

[Test case]

1) Display the devlink health status:
# devlink health show  pci/0000:05:00.0 reporter fw_fatal
pci/0000:05:00.0:
  reporter fw_fatal
    state healthy error 0 recover 0 grace_period 1200000 auto_recover true
2) Perform reporter recovery using devlink:
# devlink health recover pci/0000:05:00.0 reporter fw_fatal

3) On a kernel without the fix, see that recovery was performed even though
the state was healthy:
# dmesg
[776733.438708] mlx5_core 0000:05:00.0: mlx5_health_try_recover:316:(pid 563178): handling bad device here
[776733.438717] mlx5_core 0000:05:00.0: mlx5_handle_bad_state:278:(pid 563178): Expected to see disabled NIC but it is full driver
[776735.591522] mlx5_core 0000:05:00.0: mlx5_health_try_recover:328:(pid 563178): starting health recovery flow
...
# devlink health show  pci/0000:05:00.0 reporter fw_fatal
pci/0000:05:00.0:
  reporter fw_fatal
    state healthy error 0 recover 1 grace_period 1200000 auto_recover true
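
The steps above can be scripted. The following is only a minimal sketch, not
part of the patch: the helper names recover_count and check_recover_unchanged
are hypothetical, and it assumes the reporter is in the healthy state, that
the "devlink health show" output has the layout shown above, and that it is
run as root. On a kernel with the fix the check is expected to succeed; on an
unpatched kernel the recover counter increments and the check fails.

```shell
#!/bin/sh
# Sketch: check that "devlink health recover" is a no-op on a healthy reporter.
# The helper names below are hypothetical; they are not part of iproute2.

# Print the value that follows the "recover" keyword in
# "devlink health show" output read from stdin.
recover_count() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "recover") { print $(i + 1); exit } }'
}

# Usage: check_recover_unchanged pci/0000:05:00.0 fw_fatal
# Returns 0 if the recover counter is the same before and after a manual
# recover request, non-zero otherwise.
check_recover_unchanged() {
    dev=$1
    reporter=$2
    before=$(devlink health show "$dev" reporter "$reporter" | recover_count)
    devlink health recover "$dev" reporter "$reporter" || return 1
    after=$(devlink health show "$dev" reporter "$reporter" | recover_count)
    echo "recover counter: before=$before after=$after"
    [ "$before" = "$after" ]
}
```

For example, after sourcing the functions:
# check_recover_unchanged pci/0000:05:00.0 fw_fatal && echo PASS || echo FAIL
dmesg still needs to be inspected by hand to confirm that no recovery flow
was logged.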

[Regression Potential]
Very small, as this is a very minor change. The patch has also been tested
internally on upstream setups for a while and no degradation has been found.
One obvious change is that a user can no longer force devlink recovery when
the state is healthy, but I'm not aware of such a use case.

Jiri Pirko (1):
  devlink: don't do reporter recovery if the state is healthy

 net/core/devlink.c | 3 +++
 1 file changed, 3 insertions(+)

-- 
2.17.1



