[Precise] [PATCH] SRU: zerocopy LSO segmenting fixed

Anton Nayshtut anton at swortex.com
Wed Jun 18 14:37:38 UTC 2014


Looping Gema.

I'm submitting this patch to 3.11 upon Gema's request.

Best Regards,
Anton
On Jun 18, 2014 5:31 PM, "Tim Gardner" <tim.gardner at canonical.com> wrote:

> You should stay focused on solutions for the Precise kernel (3.2) and
> the Trusty LTS kernel (3.13). All others are EOL by early August with
> the release of 14.04.1 and 12.04.5.
>
> rtg
>
> On 06/18/2014 08:23 AM, Anton Nayshtut wrote:
> > Hi Tim,
> >
> > At the moment, I'm testing a backport of 1fd819ec to 3.11 and it looks
> > good so far.
> >
> > I'll send it to this list once all the tests pass.
> >
> > Best Regards,
> > Anton
> >
> > On Wed, Jun 18, 2014 at 5:18 PM, Tim Gardner <tim.gardner at canonical.com>
> wrote:
> >> Perhaps this works for your particular workload, but there is no way I'm
> >> applying some random crack to core networking code without it having
> >> been vetted by upstream. Maybe there is a reason zero-copy is disabled
> >> by 1fd819ec ?
> >>
> >> rtg
> >>
> >> On 06/16/2014 08:04 AM, Anton Nayshtut wrote:
> >>> Hi Tim,
> >>>
> >>> Certainly, backporting of 1fd819ec is an option.
> >>>
> >>> The reason behind sending our patch is that 1fd819ec fixes the issue
> >>> by eliminating the zero-copy advantages for this case.
> >>>
> >>> According to our tests, our patch performs ~24% better than 1fd819ec
> >>> in this scenario. See https://lkml.org/lkml/2014/5/25/22.
> >>>
> >>> Best Regards.
> >>> Anton
> >>>
> >>> On Mon, Jun 16, 2014 at 3:50 PM, Tim Gardner <
> tim.gardner at canonical.com> wrote:
> >>>> On 06/15/2014 03:15 AM, Anton Nayshtut wrote:
> >>>>>
> >>>>> BugLink: https://bugs.launchpad.net/bugs/1328973
> >>>>>
> >>>>> [SRU Justification]
> >>>>>
> >>>>> [Setup]
> >>>>>   - 2 or more QEMU Guest VMs sharing the the same host
> >>>>>   - the Guests are communicating using virtio-net-pci devices
> >>>>>   - vhost-net is enabled
> >>>>>
> >>>>> [Explanation]
> >>>>> If one Guest VM sends GSO packets to another while GRO is disabled
> for
> >>>>> receiver,
> >>>>> so these packets are segmented by net/core.
> >>>>> In this case, if zero-copy is enabled in vhost-net, the GSO packets
> TX
> >>>>> completion is reported to userspace as before the TX is actually
> done.
> >>>>> The vhost-net's zero-copy mechanism is enabled by default since
> v3.8-rc1
> >>>>> (f9611c43).
> >>>>>
> >>>>> [Impact]
> >>>>> Incorrect/junk data sent in case the transmitting Guest OS
> re-uses/frees
> >>>>> the TX
> >>>>> buffer immediately upon TX completion.
> >>>>>
> >>>>> [Test Case]
> >>>>> Windows 2008R2 Guest VMs running MS HCK Offload LSO test.
> >>>>> NOTE1: GRO is always disabled in this case because it's not
> supported by
> >>>>> Windows
> >>>>> Guest virtio-net-pci drivers.
> >>>>> NOTE2: MS HCK re-uses the GSO (LSO) buffers, so it reproduces the
> issue
> >>>>> every
> >>>>> time.
> >>>>>
> >>>>> [Note]
> >>>>> This bug has been fixed in v3.14-rc7 by 1fd819ec.
> >>>>> The fix actually disables zero copy for this case since it forces
> >>>>> unconditional
> >>>>> fragments copying, but it resolves the issue mentioned.
> >>>>>
> >>>>> OriginalAuthor: Igor Royzis <igorr at swortex.com>
> >>>>> Signed-off-by: Igor Royzis <igorr at swortex.com>
> >>>>> Signed-off-by: Anton Nayshtut <anton at swortex.com>
> >>>>> ---
> >>>>>   include/linux/skbuff.h |    1 +
> >>>>>   net/core/skbuff.c      |   13 +++++++++++++
> >>>>>   2 files changed, 14 insertions(+)
> >>>>>
> >>>>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> >>>>> index e27c8f6..4848180 100644
> >>>>> --- a/include/linux/skbuff.h
> >>>>> +++ b/include/linux/skbuff.h
> >>>>> @@ -267,6 +267,7 @@ struct skb_shared_info {
> >>>>>         struct sk_buff  *frag_list;
> >>>>>         struct skb_shared_hwtstamps hwtstamps;
> >>>>>         __be32          ip6_frag_id;
> >>>>> +       struct sk_buff  *zcopy_src;
> >>>>>
> >>>>>         /*
> >>>>>          * Warning : all fields before dataref are cleared in
> >>>>> __alloc_skb()
> >>>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> >>>>> index b2cd9a4..7023979 100644
> >>>>> --- a/net/core/skbuff.c
> >>>>> +++ b/net/core/skbuff.c
> >>>>> @@ -625,14 +625,18 @@ EXPORT_SYMBOL(__kfree_skb);
> >>>>>    */
> >>>>>   void kfree_skb(struct sk_buff *skb)
> >>>>>   {
> >>>>> +       struct sk_buff *zcopy_src;
> >>>>>         if (unlikely(!skb))
> >>>>>                 return;
> >>>>>         if (likely(atomic_read(&skb->users) == 1))
> >>>>>                 smp_rmb();
> >>>>>         else if (likely(!atomic_dec_and_test(&skb->users)))
> >>>>>                 return;
> >>>>> +       zcopy_src = skb_shinfo(skb)->zcopy_src;
> >>>>>         trace_kfree_skb(skb, __builtin_return_address(0));
> >>>>>         __kfree_skb(skb);
> >>>>> +       if (unlikely(zcopy_src))
> >>>>> +               kfree_skb(zcopy_src);
> >>>>>   }
> >>>>>   EXPORT_SYMBOL(kfree_skb);
> >>>>>
> >>>>> @@ -677,14 +681,18 @@ EXPORT_SYMBOL(skb_tx_error);
> >>>>>    */
> >>>>>   void consume_skb(struct sk_buff *skb)
> >>>>>   {
> >>>>> +       struct sk_buff *zcopy_src;
> >>>>>         if (unlikely(!skb))
> >>>>>                 return;
> >>>>>         if (likely(atomic_read(&skb->users) == 1))
> >>>>>                 smp_rmb();
> >>>>>         else if (likely(!atomic_dec_and_test(&skb->users)))
> >>>>>                 return;
> >>>>> +       zcopy_src = skb_shinfo(skb)->zcopy_src;
> >>>>>         trace_consume_skb(skb);
> >>>>>         __kfree_skb(skb);
> >>>>> +       if (unlikely(zcopy_src))
> >>>>> +               kfree_skb(zcopy_src);
> >>>>>   }
> >>>>>   EXPORT_SYMBOL(consume_skb);
> >>>>>
> >>>>> @@ -2847,6 +2855,11 @@ struct sk_buff *skb_segment(struct sk_buff
> *skb,
> >>>>> netdev_features_t features)
> >>>>>                                                  nskb->data -
> tnl_hlen,
> >>>>>                                                  doffset + tnl_hlen);
> >>>>>
> >>>>> +               if (skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY) {
> >>>>> +                       skb_shinfo(nskb)->zcopy_src = skb;
> >>>>> +                       atomic_inc(&skb->users);
> >>>>> +               }
> >>>>> +
> >>>>>                 if (fskb != skb_shinfo(skb)->frag_list)
> >>>>>                         goto perform_csum_check;
> >>>>>
> >>>>>
> >>>>
> >>>> If 1fd819ec fixes the problem, then why not do a proper backport and
> run the
> >>>> resulting patch by the stable maintainers for all affected kernels ?
> >>>>
> >>>> rtg
> >>>> --
> >>>> Tim Gardner tim.gardner at canonical.com
> >>
> >>
> >> --
> >> Tim Gardner tim.gardner at canonical.com
>
>
> --
> Tim Gardner tim.gardner at canonical.com
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.ubuntu.com/archives/kernel-team/attachments/20140618/f07a184f/attachment.html>


More information about the kernel-team mailing list