[Precise] [PATCH] SRU: zerocopy LSO segmenting fixed

Mon Jun 16 12:50:53 UTC 2014

On 06/15/2014 03:15 AM, Anton Nayshtut wrote:
> BugLink: https://bugs.launchpad.net/bugs/1328973
>
> [SRU Justification]
>
> [Setup]
>   - 2 or more QEMU Guest VMs sharing the the same host
>   - the Guests are communicating using virtio-net-pci devices
>   - vhost-net is enabled
>
> [Explanation]
> If one Guest VM sends GSO packets to another while GRO is disabled for receiver,
> so these packets are segmented by net/core.
> In this case, if zero-copy is enabled in vhost-net, the GSO packets TX
> completion is reported to userspace as before the TX is actually done.
> The vhost-net's zero-copy mechanism is enabled by default since v3.8-rc1
> (f9611c43).
>
> [Impact]
> Incorrect/junk data sent in case the transmitting Guest OS re-uses/frees the TX
> buffer immediately upon TX completion.
>
> [Test Case]
> Windows 2008R2 Guest VMs running MS HCK Offload LSO test.
> NOTE1: GRO is always disabled in this case because it's not supported by Windows
> Guest virtio-net-pci drivers.
> NOTE2: MS HCK re-uses the GSO (LSO) buffers, so it reproduces the issue every
> time.
>
> [Note]
> This bug has been fixed in v3.14-rc7 by 1fd819ec.
> The fix actually disables zero copy for this case since it forces unconditional
> fragments copying, but it resolves the issue mentioned.
>
> OriginalAuthor: Igor Royzis <igorr at swortex.com>
> Signed-off-by: Igor Royzis <igorr at swortex.com>
> Signed-off-by: Anton Nayshtut <anton at swortex.com>
> ---
>   include/linux/skbuff.h |    1 +
>   net/core/skbuff.c      |   13 +++++++++++++
>   2 files changed, 14 insertions(+)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index e27c8f6..4848180 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -267,6 +267,7 @@ struct skb_shared_info {
>   	struct sk_buff	*frag_list;
>   	struct skb_shared_hwtstamps hwtstamps;
>   	__be32          ip6_frag_id;
> +	struct sk_buff	*zcopy_src;
>
>   	/*
>   	 * Warning : all fields before dataref are cleared in __alloc_skb()
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index b2cd9a4..7023979 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -625,14 +625,18 @@ EXPORT_SYMBOL(__kfree_skb);
>    */
>   void kfree_skb(struct sk_buff *skb)
>   {
> +	struct sk_buff *zcopy_src;
>   	if (unlikely(!skb))
>   		return;
>   	if (likely(atomic_read(&skb->users) == 1))
>   		smp_rmb();
>   	else if (likely(!atomic_dec_and_test(&skb->users)))
>   		return;
> +	zcopy_src = skb_shinfo(skb)->zcopy_src;
>   	trace_kfree_skb(skb, __builtin_return_address(0));
>   	__kfree_skb(skb);
> +	if (unlikely(zcopy_src))
> +		kfree_skb(zcopy_src);
>   }
>   EXPORT_SYMBOL(kfree_skb);
>
> @@ -677,14 +681,18 @@ EXPORT_SYMBOL(skb_tx_error);
>    */
>   void consume_skb(struct sk_buff *skb)
>   {
> +	struct sk_buff *zcopy_src;
>   	if (unlikely(!skb))
>   		return;
>   	if (likely(atomic_read(&skb->users) == 1))
>   		smp_rmb();
>   	else if (likely(!atomic_dec_and_test(&skb->users)))
>   		return;
> +	zcopy_src = skb_shinfo(skb)->zcopy_src;
>   	trace_consume_skb(skb);
>   	__kfree_skb(skb);
> +	if (unlikely(zcopy_src))
> +		kfree_skb(zcopy_src);
>   }
>   EXPORT_SYMBOL(consume_skb);
>
> @@ -2847,6 +2855,11 @@ struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features)
>   						 nskb->data - tnl_hlen,
>   						 doffset + tnl_hlen);
>
> +		if (skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY) {
> +			skb_shinfo(nskb)->zcopy_src = skb;
> +			atomic_inc(&skb->users);
> +		}
> +
>   		if (fskb != skb_shinfo(skb)->frag_list)
>   			goto perform_csum_check;
>
>

If 1fd819ec fixes the problem, then why not do a proper backport and run 
the resulting patch by the stable maintainers for all affected kernels ?

rtg
-- 
Tim Gardner tim.gardner at canonical.com