APPLIED/cmt: [PATCH 0/1][SRU X] UBUNTU: SAUCE: bnxt_en_bpo: Fix TX timeout during netpoll

Stefan Bader stefan.bader at canonical.com
Thu Mar 14 08:58:21 UTC 2019


On 03.03.19 23:45, Khaled Elmously wrote:
> Thanks for the re-send Nivedita :)
> 
> On 2019-03-04 00:36:48 , Nivedita Singhvi wrote:
>> BugLink: http://bugs.launchpad.net/bugs/1814095

The patch itself was missing the BugLink line. I fixed it up now while cranking.

-Stefan

>>
>>
>> [Impact]
>>
>> The bnxt_en_bpo driver hit TX timeouts, causing network stalls and failures
>> to send data and heartbeat packets.
>>
>> The following 25Gb Broadcom NIC error was seen once on Xenial running the
>> 4.4.0-141-generic kernel on an amd64 host under moderate-to-heavy network
>> traffic:
>>
>> * The bnxt_en_bpo driver froze on a "TX timed out" error and triggered the
>>   Netdev Watchdog timer under load.
>>
>> * From kernel log:
>>   "NETDEV WATCHDOG: eno2d1 (bnxt_en_bpo): transmit queue 0 timed out"
>>   See attached kern.log excerpt file for full excerpt of error log.
>>
>> * Release = Xenial
>>   Kernel = 4.4.0-141-generic #167
>>   eno2d1 = Product Name: Broadcom Adv. Dual 25Gb Ethernet
>>
>> * This caused the driver to reset in order to recover:
>>
>>   "bnxt_en_bpo 0000:19:00.1 eno2d1: TX timeout detected, starting reset task!"
>>
>>   driver: bnxt_en_bpo
>>   version: 1.8.1
>>   source: ubuntu/bnxt/bnxt.c: bnxt_tx_timeout()
>>
>> * The loss of connectivity and softirq stall caused other cascading failures
>>   on the system.
>>
>> * The bnxt_en_bpo driver is the imported Broadcom driver pulled in to support
>>   newer Broadcom HW (specific boards) while the bnxt_en module continues to
>>   support the older HW. The current Linux upstream driver does not compile
>>   easily with the 4.4 kernel (too many changes).
>>
>> * This upstream bnxt_en driver fix is a likely solution:
>>    "bnxt_en: Fix TX timeout during netpoll"
>>    commit: 73f21c653f930f438d53eed29b5e4c65c8a0f906
>>
>>   This fix has not been applied to the bnxt_en_bpo driver version, but review
>>   of the code indicates that the driver is susceptible to the bug and that the
>>   fix is reasonable.
>>
>>
>> [Test Case]
>>
>> * Unfortunately, this is not easy to reproduce. Also, it is only seen on
>>   4.4 kernels with newer Broadcom NICs supported by the bnxt_en_bpo driver.
>>
>>
>> [Regression Potential]
>>
>> * The patch is restricted to the bpo driver, with very constrained scope
>>   - just the newest Broadcom NICs being used by the Xenial 4.4 kernel (as
>>   opposed to the hwe 4.15 etc. kernels, which would have the in-tree fixed
>>   driver).
>>
>> * The patch is very small and backport is fairly minimal and simple.
>>
>> * The fix has been running in the in-tree driver in upstream mainline as well
>>   as in the Ubuntu Linux in-tree driver. Although the Broadcom driver has a
>>   lot of lower-level code that is different, this piece is still the same.
>>
>>
>> Michael Chan (1):
>>   The current netpoll implementation in the bnxt_en driver has problems
>>     that may miss TX completion events.  bnxt_poll_work() in effect is
>>     only handling at most 1 TX packet before exiting.  In addition,
>>     there may be in flight TX completions that ->poll() may miss even
>>     after we fix bnxt_poll_work() to handle all visible TX completions.
>>     netpoll may not call ->poll() again and HW may not generate IRQ
>>     because the driver does not ARM the IRQ when the budget (0 for
>>     netpoll) is reached.
>>
>>  ubuntu/bnxt/bnxt.c | 13 ++++++++++---
>>  1 file changed, 10 insertions(+), 3 deletions(-)
>>
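For readers following the commit description quoted above, here is a minimal
user-space sketch of why a poll loop can stop after a single TX completion
when netpoll passes a budget of 0. It is not the driver code and not the
patch: the enum, the function names, and the exact loop shape are
hypothetical, chosen only to illustrate the "budget (0 for netpoll)" problem
the description mentions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical completion-ring entry types for this simulation only. */
enum cmpl_type { CMPL_TX, CMPL_RX };

/*
 * Problem shape: the exit check runs after every entry. With budget == 0
 * (netpoll), rx_pkts == budget is already true after the first TX
 * completion, so at most one TX entry is consumed per call.
 * Returns the number of ring entries consumed.
 */
static size_t poll_work_buggy(const enum cmpl_type *ring, size_t n, int budget)
{
    size_t consumed = 0;
    int rx_pkts = 0;

    assert(budget >= 0);
    while (consumed < n) {
        if (ring[consumed] == CMPL_RX)
            rx_pkts++;
        consumed++;
        if (rx_pkts == budget)
            break;
    }
    return consumed;
}

/*
 * One possible correction (an assumption, not the actual fix): only RX
 * packets count against the budget, so every visible TX completion is
 * consumed even when the budget is 0.
 */
static size_t poll_work_fixed(const enum cmpl_type *ring, size_t n, int budget)
{
    size_t consumed = 0;
    int rx_pkts = 0;

    assert(budget >= 0);
    while (consumed < n) {
        if (ring[consumed] == CMPL_RX) {
            if (rx_pkts == budget)
                break;  /* RX budget used up; leave this entry for later */
            rx_pkts++;
        }
        consumed++;
    }
    return consumed;
}
```

Called with a ring of three TX completions and a budget of 0, the first
function consumes one entry and the second consumes all three. The sketch
does not model the second half of the description (re-arming the IRQ when
->poll() exits with a 0 budget).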
>> -- 
>> 2.17.1
>>
>>
> 

