ACK: [SRU B/C/D][SRU X][Unstable][PATCH 0/1] bnx2x: avoid 100% cpu utilization from ptp routine

Kleber Souza kleber.souza at canonical.com
Wed Jul 10 16:04:40 UTC 2019


On 03.07.19 21:17, Guilherme G. Piccoli wrote:
> BugLink: https://bugs.launchpad.net/bugs/1832082
> 
> [Impact]
> 
> * The PTP feature in the bnx2x driver is implemented in a way that if the NIC
> firmware takes some time to perform the timestamping - which is observed as a
> bad register read in bnx2x_ptp_task() - then the ptp worker function will
> reschedule itself indefinitely until the value read from the register is
> meaningful. With that behavior, if a userspace tool requests a badly configured
> RX filter from bnx2x (or if the NIC firmware has any other timestamping issue),
> the function bnx2x_ptp_task() will be rescheduled forever and cause unbounded
> resource consumption. This manifests as a kworker thread consuming 100% of CPU.
> 
> * The dmesg log will show the following message about other packets being
> skipped by the timestamping routine because a packet is stuck in the
> timestamping "pipeline":
> 
> "bnx2x: [bnx2x_start_xmit:3862(eno4)]The device supports only a single
> outstanding packet to timestamp, this packet will not be timestamped"
> 
> Also, using ftrace, the user can see that bnx2x_ptp_task() is being called
> very frequently, and enabling the bnx2x PTP debugging log
> (ethtool -s <iface> msglvl 16777216) shows the following message flooding the
> kernel log:
> 
> "bnx2x: [bnx2x_ptp_task:15242(eno4)]There is no valid Tx timestamp yet"
> 
> * The patch proposed in this SRU request has been accepted upstream and is
> currently (2019-07-03) available in David Miller's linux-net tree:
> git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=3c91f25c2f72
> Besides fixing the issue, it also adds an ethtool statistic that counts the
> ptp errors, and it reduces message flooding in case of errors.
> 
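For readers who want a concrete picture of the behavior described in [Impact], here is a rough userspace model of a worker that gives up after a bounded number of attempts instead of rescheduling itself forever. It is only a sketch and not the patch: the retry cap, the doubling delay, the 0x10000 "ready" flag and every identifier below are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values and names; this models the idea, it is not driver code. */
#define PTP_MAX_TRIES 10          /* assumed cap on register polls */
#define TS_VALID_BIT  0x10000u    /* assumed "timestamp is ready" flag */

struct ptp_model {
    uint32_t (*read_seq_reg)(void *ctx); /* stands in for the register read */
    void *ctx;
    unsigned int skipped_tx_ts;          /* error counter, in the spirit of the new ethtool stat */
    unsigned int total_delay_ms;         /* time a real worker would have slept */
};

/* Returns true if a valid timestamp appeared within the retry budget. */
static bool ptp_task_bounded(struct ptp_model *m)
{
    assert(m && m->read_seq_reg);

    for (int i = 0; i < PTP_MAX_TRIES; i++) {
        if (m->read_seq_reg(m->ctx) & TS_VALID_BIT)
            return true;
        /* Back off 1, 2, 4, ... ms; only accounted here, not actually slept. */
        m->total_delay_ms += 1u << i;
    }

    /* Budget exhausted: count the error and stop instead of retrying forever. */
    m->skipped_tx_ts++;
    return false;
}

/* Fake register that never reports a valid timestamp (the bug trigger). */
static uint32_t reg_never_ready(void *ctx)
{
    (void)ctx;
    return 0;
}

/* Fake register that becomes valid after *ctx failed reads. */
static uint32_t reg_ready_after(void *ctx)
{
    unsigned int *left = ctx;

    if (*left == 0)
        return TS_VALID_BIT;
    (*left)--;
    return 0;
}
```

With reg_never_ready the function returns false after ten polls and bumps the counter once; with reg_ready_after it returns true as soon as the flag shows up. The point of the model is that the worst case is bounded, whereas an unconditional self-reschedule has no upper limit.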
> 
> [Test case]
> 
> Reproducing the problem is not difficult; we've used chrony in Bionic to trigger
> the problem. The steps are:
> 
> a) Install chrony on Bionic on a system with a working NIC managed by bnx2x;
> 
> b) Edit chrony configuration and add: "hwtimestamp *" to the top of its conf
> file;
> 
> c) Restart the chrony service.
> 
> Check dmesg for the "[...]single outstanding packet" message and the overall CPU
> workload using a tool like "top" to observe a kthread consuming 100% of CPU.
> 
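A side note on the msglvl value quoted in [Impact], since it is handy while running the test case: 16777216 is simply bit 24 (0x1000000). The sketch below shows the arithmetic and how to add that bit to an existing mask. The macro name is hypothetical, and the assumption that this bit is the driver's PTP debug flag comes only from the command quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; the value is the one passed to "ethtool -s <iface> msglvl". */
#define MSG_PTP_BIT (UINT32_C(1) << 24)

/* True if the given message level mask has the PTP debug bit set. */
static bool ptp_debug_enabled(uint32_t msglvl)
{
    return (msglvl & MSG_PTP_BIT) != 0;
}

/* Add PTP debugging to an existing mask instead of replacing the mask. */
static uint32_t with_ptp_debug(uint32_t msglvl)
{
    return msglvl | MSG_PTP_BIT;
}
```

Because msglvl takes the whole mask, passing 16777216 on its own replaces whatever level was set before. To keep the existing messages as well, OR the current level with this bit, as with_ptp_debug() does, and pass the result instead.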
> 
> [Regression potential]
> 
> The patch scope is restricted to the bnx2x ptp handler, and it was validated by
> the driver maintainer. If there's any possibility of regressions, we believe the
> worst would be an issue affecting packet timestamping, without touching the
> regular xmit path of the driver.
> 
> Guilherme G. Piccoli (1):
>   bnx2x: Prevent ptp_task to be rescheduled indefinitely
> 
>  .../net/ethernet/broadcom/bnx2x/bnx2x_cmn.c   |  5 ++-
>  .../ethernet/broadcom/bnx2x/bnx2x_ethtool.c   |  4 ++-
>  .../net/ethernet/broadcom/bnx2x/bnx2x_main.c  | 33 ++++++++++++++-----
>  .../net/ethernet/broadcom/bnx2x/bnx2x_stats.h |  3 ++
>  4 files changed, 34 insertions(+), 11 deletions(-)
> 

Acked-by: Kleber Sacilotto de Souza <kleber.souza at canonical.com>



