[PATCH][SRU][X] tcp: refine memory limit test in tcp_fragment()

Tyler Hicks tyhicks at canonical.com
Mon Jun 24 19:19:07 UTC 2019

From: Eric Dumazet <edumazet at google.com>

tcp_fragment() might be called for skbs in the write queue.

Memory limits might have been exceeded because tcp_sendmsg() only
checks limits at full skb (64KB) boundaries.

Therefore, we need to make sure tcp_fragment() won't punish applications
that might have set up very low SO_SNDBUF values.

Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits")
Signed-off-by: Eric Dumazet <edumazet at google.com>
Reported-by: Christoph Paasch <cpaasch at apple.com>
Tested-by: Christoph Paasch <cpaasch at apple.com>
Signed-off-by: David S. Miller <davem at davemloft.net>


(backported from commit b6653b3629e5b88202be3c9abc44713973f5c4b4)
[tyhicks: Don't enforce the limit on the skb that tcp_send_head points
 to, as that skb has never been sent out. In newer kernels containing
 commit 75c119afe14f ("tcp: implement rb-tree based retransmit queue"),
 where the retransmission queue is separate from the write queue, this
 skb would be in the write queue.
 With the modified check in this backported patch, we run the risk of
 enforcing the memory limit on an skb that is after tcp_send_head in the
 queue yet has never been sent out. However, an inspection of all
 tcp_fragment() call sites finds that this shouldn't occur and the limit
 will only be enforced on skbs that are up for retransmission.]
Signed-off-by: Tyler Hicks <tyhicks at canonical.com>

I've successfully tested this patch using a slightly modified version of
a packetdrill test that was sent to the netdev list. Without this kernel
change, the test hangs. The test successfully completes with this kernel
change.
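
For reviewers who want to reason about the condition outside the kernel
tree, here is a minimal userspace sketch of the old and new checks. It is
an illustration only, not kernel code: struct sock_model and both helper
functions are hypothetical names, and the fields merely stand in for
sk->sk_wmem_queued, sk->sk_sndbuf and the result of tcp_send_head(sk).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few socket fields the check reads. */
struct sock_model {
	int wmem_queued;        /* models sk->sk_wmem_queued */
	int sndbuf;             /* models sk->sk_sndbuf */
	const void *send_head;  /* models tcp_send_head(sk) */
};

/* Check before this patch: refuse whenever half of the queued memory
 * exceeds the send buffer, no matter which skb is being fragmented. */
static bool old_check_refuses(const struct sock_model *sk, const void *skb)
{
	(void)skb; /* the old condition does not look at the skb */
	return (sk->wmem_queued >> 1) > sk->sndbuf;
}

/* Check after this patch: same memory test, but the skb at the send
 * head is exempt because it has never been sent out. */
static bool new_check_refuses(const struct sock_model *sk, const void *skb)
{
	return (sk->wmem_queued >> 1) > sk->sndbuf && skb != sk->send_head;
}
```

With a small sndbuf and roughly one full skb queued, the old check
refuses to fragment every skb, including the one at the send head, while
the new check only refuses skbs other than the send head.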

 net/ipv4/tcp_output.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index ede265fbf7ba..719d2cc8770c 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1163,7 +1163,8 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
 	if (nsize < 0)
 		nsize = 0;
-	if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf)) {
+	if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf &&
+		     skb != tcp_send_head(sk))) {
 		return -ENOMEM;
