[Precise] [PATCH] SRU: zerocopy LSO segmenting fixed

Anton Nayshtut anton at swortex.com
Sun Jun 15 09:15:12 UTC 2014


BugLink: https://bugs.launchpad.net/bugs/1328973

[SRU Justification]

[Setup]
 - 2 or more QEMU Guest VMs sharing the same host
 - the Guests are communicating using virtio-net-pci devices
 - vhost-net is enabled (an example invocation is sketched below)
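
A minimal sketch of such a setup (illustrative only -- the tap/image names
and the rest of the command line are assumptions, not taken from the bug
report):

    # TX zero copy is on by default since v3.8-rc1; it can be set
    # explicitly through the vhost_net module parameter:
    modprobe vhost_net experimental_zcopytx=1
    # one such Guest per VM, two or more VMs on the same host:
    qemu-system-x86_64 -enable-kvm -m 2048 \
        -netdev tap,id=hn0,ifname=tap0,script=no,vhost=on \
        -device virtio-net-pci,netdev=hn0 \
        guest-disk.img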

[Explanation]
When one Guest VM sends GSO packets to another while GRO is disabled for the
receiver, these packets are segmented by net/core.
In this case, if zero-copy is enabled in vhost-net, the GSO packets' TX
completion is reported to userspace before the TX is actually done.
vhost-net's zero-copy mechanism has been enabled by default since v3.8-rc1
(commit f9611c43).
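
For context, a zerocopy skb carries a completion callback in its shared info
(skb_shinfo(skb)->destructor_arg, a struct ubuf_info set up by vhost-net).
The sketch below is abridged and paraphrased from skb_release_data() in
net/core/skbuff.c of this era, not verbatim: net/core frees the original GSO
skb as soon as skb_segment() has produced the segments, which fires the
callback (vhost_zerocopy_callback() for vhost-net) and lets vhost-net report
the guest's TX buffers as completed -- even though the segments sharing those
userspace pages have not been transmitted yet.

	/* abridged from skb_release_data(), net/core/skbuff.c */
	if (skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY) {
		struct ubuf_info *uarg = skb_shinfo(skb)->destructor_arg;

		/* for vhost-net this is vhost_zerocopy_callback(),
		 * which marks the guest's TX buffers as used/done */
		if (uarg->callback)
			uarg->callback(uarg, true);
	}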

[Impact]
Incorrect/junk data is sent if the transmitting Guest OS re-uses or frees the
TX buffer immediately upon TX completion.

[Test Case]
Windows 2008R2 Guest VMs running the MS HCK Offload LSO test.
NOTE1: GRO is always disabled in this case because it is not supported by the
Windows Guest virtio-net-pci drivers.
NOTE2: MS HCK re-uses the GSO (LSO) buffers, so it reproduces the issue every
time.

[Note]
This bug has been fixed upstream in v3.14-rc7 by commit 1fd819ec.
That fix effectively disables zero copy for this case, since it forces
unconditional copying of the fragments, but it resolves the issue described
above.
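
For reference, the upstream fix calls skb_orphan_frags() on the source skb in
skb_segment() before its fragments are shared with the segments. For a
zerocopy skb this means skb_copy_ubufs() copies the userspace pages into
kernel memory and signals completion immediately, which is why zero copy is
effectively disabled on that path. A paraphrased sketch of the helper (from
include/linux/skbuff.h; not the exact hunks of 1fd819ec):

	/* copy userspace (zerocopy) fragments and release them,
	 * so the skb no longer references guest memory */
	static inline int skb_orphan_frags(struct sk_buff *skb, gfp_t gfp_mask)
	{
		if (likely(!(skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY)))
			return 0;
		return skb_copy_ubufs(skb, gfp_mask);
	}

This backport takes a different route, as the diff below shows: skb_segment()
records the original skb in each segment's zcopy_src and takes a reference,
and kfree_skb()/consume_skb() drop that reference only when the segment
itself is freed, so the zerocopy completion fires only after the last segment
has been released.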

OriginalAuthor: Igor Royzis <igorr at swortex.com>
Signed-off-by: Igor Royzis <igorr at swortex.com>
Signed-off-by: Anton Nayshtut <anton at swortex.com>
---
 include/linux/skbuff.h |    1 +
 net/core/skbuff.c      |   13 +++++++++++++
 2 files changed, 14 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index e27c8f6..4848180 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -267,6 +267,7 @@ struct skb_shared_info {
 	struct sk_buff	*frag_list;
 	struct skb_shared_hwtstamps hwtstamps;
 	__be32          ip6_frag_id;
+	struct sk_buff	*zcopy_src;
 
 	/*
 	 * Warning : all fields before dataref are cleared in __alloc_skb()
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index b2cd9a4..7023979 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -625,14 +625,18 @@ EXPORT_SYMBOL(__kfree_skb);
  */
 void kfree_skb(struct sk_buff *skb)
 {
+	struct sk_buff *zcopy_src;
 	if (unlikely(!skb))
 		return;
 	if (likely(atomic_read(&skb->users) == 1))
 		smp_rmb();
 	else if (likely(!atomic_dec_and_test(&skb->users)))
 		return;
+	zcopy_src = skb_shinfo(skb)->zcopy_src;
 	trace_kfree_skb(skb, __builtin_return_address(0));
 	__kfree_skb(skb);
+	if (unlikely(zcopy_src))
+		kfree_skb(zcopy_src);
 }
 EXPORT_SYMBOL(kfree_skb);
 
@@ -677,14 +681,18 @@ EXPORT_SYMBOL(skb_tx_error);
  */
 void consume_skb(struct sk_buff *skb)
 {
+	struct sk_buff *zcopy_src;
 	if (unlikely(!skb))
 		return;
 	if (likely(atomic_read(&skb->users) == 1))
 		smp_rmb();
 	else if (likely(!atomic_dec_and_test(&skb->users)))
 		return;
+	zcopy_src = skb_shinfo(skb)->zcopy_src;
 	trace_consume_skb(skb);
 	__kfree_skb(skb);
+	if (unlikely(zcopy_src))
+		kfree_skb(zcopy_src);
 }
 EXPORT_SYMBOL(consume_skb);
 
@@ -2847,6 +2855,11 @@ struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features)
 						 nskb->data - tnl_hlen,
 						 doffset + tnl_hlen);
 
+		if (skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY) {
+			skb_shinfo(nskb)->zcopy_src = skb;
+			atomic_inc(&skb->users);
+		}
+
 		if (fskb != skb_shinfo(skb)->frag_list)
 			goto perform_csum_check;
 
-- 
1.7.9.5




