[SRU][J][PATCH v1 0/1] tcp: fix forever orphan socket caused by tcp_abort

Stav Aviram saviram at nvidia.com
Thu Jun 19 17:39:05 UTC 2025


From d41ec707dbe3fb3fa5b124a31fb9a8fd401fb1b3 Mon Sep 17 00:00:00 2001
Message-Id: <cover.1750244904.git.saviram at nvidia.com>
From: Stav Aviram <saviram at nvidia.com>
Date: Wed, 18 Jun 2025 14:08:24 +0300
To: kernel-team at lists.ubuntu.com
Subject: [SRU][J][PATCH v1 0/1] tcp: fix forever orphan socket caused by tcp_abort

BugLink: https://bugs.launchpad.net/bugs/2114965

SRU Justification:

[Impact]
In BFB version DOCA_2.6.0_BSP_4.6.0_Ubuntu_22.04-2.20240114, container
deletion via removal of its kubelet YAML from /etc/kubelet.d sometimes
fails to complete. The process waits for the container to disappear from
crictl ps, but the container remains in Running state indefinitely. This
behavior is seen with container version 2.dev.50 and FW 32.40.0324.
The issue appears to stem from a kernel bug affecting orphaned TCP
sockets stuck in a zero-window state. These sockets are never closed and
their timers are not rescheduled, producing "forever orphan" sockets
that prevent resource cleanup.

[Fix]
Backport the upstream commit:
bac76cf89816 ("tcp: fix forever orphan socket caused by tcp_abort")
This commit removes a conditional check on SOCK_DEAD in tcp_abort,
allowing proper closure of orphaned sockets and preventing indefinite
stalling. A backport, rather than a clean cherry-pick, is needed because
the error-handling and logging code in this tree differs from the
upstream code.
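
For illustration only, the effect of removing the SOCK_DEAD check can be
sketched as a toy user-space model. This is not the kernel code and not
the backported diff: struct toy_sock and every toy_* function are
hypothetical names invented for this sketch, and the timer model (an
early return when nothing is in flight) is an assumption based on the
description in the upstream commit message.

```c
#include <stdbool.h>

/* Toy model only; fields loosely mirror kernel concepts. */
struct toy_sock {
    bool dead;         /* models SOCK_DEAD: the socket is orphaned */
    int  packets_out;  /* models the count of in-flight packets */
    bool closed;       /* true once the socket has been torn down */
};

/* Before the fix: only non-orphaned sockets are reset and closed. */
static void toy_abort_old(struct toy_sock *sk)
{
    if (!sk->dead)
        sk->closed = true;
    sk->packets_out = 0;   /* the write queue is purged either way */
}

/* After the fix: the socket is closed regardless of the orphan flag. */
static void toy_abort_new(struct toy_sock *sk)
{
    sk->closed = true;
    sk->packets_out = 0;
}

/*
 * Models a timer that gives up when nothing is in flight.
 * Returns true only if the timer itself closed the socket.
 */
static bool toy_timer_fires(struct toy_sock *sk)
{
    if (sk->closed)
        return false;
    if (sk->packets_out == 0)
        return false;      /* early return: timer is not rescheduled */
    sk->closed = true;
    return true;
}

/* An orphan that is neither closed nor reachable by the timer. */
static bool toy_is_forever_orphan(const struct toy_sock *sk)
{
    return sk->dead && !sk->closed && sk->packets_out == 0;
}
```

In this model, calling toy_abort_old() on an orphaned socket leaves it
unclosed with nothing in flight, so toy_timer_fires() can never clean it
up; toy_abort_new() closes it immediately.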

[Test Case]
Compile-tested on BF 5.15.
Further testing consists of reproducing the issue by removing the pod's
YAML from /etc/kubelet.d and monitoring container termination using
crictl ps. With the patch applied, the container should no longer
remain stuck in Running state.

[Regression Potential]
The patch targets a specific edge case in TCP socket handling, and the
backport stays as close as possible to the original upstream commit.
However, since the change removes a check that previously kept tcp_abort
from closing SOCK_DEAD sockets, there is a small risk if other kernel
paths still rely on the earlier behavior. This could theoretically lead
to unexpected side effects in force-close logic if assumptions about
socket state are violated. Also, because the backport is not an exact
match for the original commit, unexpected behavior is possible in edge
cases related to socket teardown.

Xueming Feng (1):
  tcp: fix forever orphan socket caused by tcp_abort

 net/ipv4/tcp.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

--
2.34.1