NACK/Cmnt: [SRU][F:linux-bluefield][PATCH 0/9] CT offload fixes

Stefan Bader stefan.bader at canonical.com
Mon Apr 12 09:19:46 UTC 2021


On 12.04.21 10:16, Roi Dayan wrote:
> 
> 
> On 2021-04-12 10:28 AM, Stefan Bader wrote:
>> On 11.04.21 12:20, Roi Dayan wrote:
>>>
>>>
>>> On 2021-04-09 4:05 PM, Stefan Bader wrote:
>>>> On 06.04.21 19:52, Roi Dayan wrote:
>>>>> BugLink: 
>>>>> https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1922682
>>>>> BugLink: 
>>>>> https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1922678
>>>>> BugLink: 
>>>>> https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1922672
>>>>>
>>>>>
>>>> Hi Roi,
>>>>
>>>> since the 3 issues appear to be independent of each other, this should be 3 
>>>> individual submissions to the mailing list. For multiple reasons: They can 
>>>> then be treated as units of their own. That makes each piece smaller (to 
>>>> look at) and also allows to potentially handle in parallel. But also 
>>>> prevents one part that maybe rises questions blocking the other parts.
>>>>
>>>> The SRU justification has to be in each bug report (best in the description 
>>>> because that place is prominently at the top and also can be adjusted / 
>>>> corrected at any time). The content of each section is best not too 
>>>> technical. It is basically the place where program managers, developers, and 
>>>> also testers look at. The "regression potential" part (here "what (it) could 
>>>> break" is something that should be written with user / tester experience in 
>>>> mind. So which areas or activities might be affected. The current statements 
>>>> actually look to be good that way. I just mention it again since that is 
>>>> something that is misunderstood often.
>>>>
>>>> With the SRU justification in the bug report, the cover email does not 
>>>> necessarily have to repeat that (but it can). The cover email can be used to 
>>>> pass on helpful information for developers looking at the patches. That part 
>>>> is usually read only by technical people.
>>>>
>>>> A final note on the BugLink URLs: We prefer the short version of
>>>>
>>>> https://bugs.launchpad.net/bugs/<bugnumber>
>>>>
>>>> because this is the form that will always be valid, even if the 
>>>> project/package name changes. Not that this is likely for this kernel but 
>>>> then at least things look the same across all projects. I heard that this 
>>>> might be a mail servier config issue. So if that is the only thing that is 
>>>> not quite right, we would adjust that part when applying the patches. But we 
>>>> would appreciate not to have to for sure. :)
>>>>
>>>> So once the bug reports are updated, please send the individual pieces again.
>>>> Thanks!
>>>>
>>>> -Stefan
>>>
>>> great. thanks for the comments.  i'll resubmit.
>>> is a cover letter still needed if its a single patch?
>>> could the SRU justification comments be inside the commit msg after the
>>> patch separator? (---) ?
>>
>> For single patches it would be ok the put additional info between separator 
>> and the patch as that goes away when applying. Funnily it seems by splitting 
>> the submission into 3, the patches magically multiplied in numbers...
>>
>> -Stefan
>>
> 
> what do you mean magically multiplied? just 2 more cover letters.
> the single series had 9 patches + a cover letter.
> now series1 is 2 patches, series2 is 2 patches, series3 is 5 patches.
> it's still the original 9 patches.

Sorry apparently I cannot look right this morning and my memory is distorted. 
For some reason my memory had 2 reports with 1 patch each and another with 2. 
But reality was different. I think I confused it with another submission and did 
not double check.

-Stefan

> 
>>>
>>>
>>>>> SRU Justification:
>>>>>
>>>>> 1. The first 2 patches are fixing a race, potentially crashing the kernel.
>>>>> 2. The next 2 patches are fixing a possible memory hog and aging active
>>>>>     ct conns.
>>>>> 3. The last patches are adding offload support for ct_state invalid and
>>>>>     ct_state reply.
>>>>>
>>>>> * Explain the bug(s)
>>>>>
>>>>> 1. The kernel crash can happen on stress tcp traffic opening and closing
>>>>>     the conns fast.
>>>>>
>>>>> 2. The memory hog and aging active ct conns can happen from any stress test
>>>>>     as we have a single workqueue for handling the ct offload conns
>>>>>     for add/del/stats.
>>>>>
>>>>> * brief explanation of fixes
>>>>>
>>>>> The fix for #1 is setting the offload timeout early and not relying on gc.
>>>>>
>>>>> The fix for #2 is splitting add/del/stats for diff workqueue and also
>>>>> we set a limit for add work entries.
>>>>>
>>>>> * How to test
>>>>>
>>>>> Testing #1 was done with stress http traffic opening conns, short data, 
>>>>> close conns.
>>>>> different 5-tuple each time.
>>>>>
>>>>> Testing #2 was done with just stress traffic with lots of conns different 
>>>>> 5-tuple.
>>>>>
>>>>> * What it could break.
>>>>>
>>>>> Issue #1 could potentially crash the kernel.
>>>>>
>>>>> Issue #2 can take a lot of memory for a long time and also causing active 
>>>>> conns to
>>>>> age out when not necessary.
>>>>>
>>>>>
>>>>>   include/linux/skbuff.h                |  5 ++-
>>>>>   include/net/flow_offload.h            |  1 +
>>>>>   include/net/netfilter/nf_conntrack.h  | 12 ++++++
>>>>>   include/net/sch_generic.h             |  1 +
>>>>>   include/uapi/linux/pkt_cls.h          |  2 +
>>>>>   net/core/dev.c                        |  2 +
>>>>>   net/core/flow_dissector.c             | 13 +++++--
>>>>>   net/netfilter/nf_conntrack_core.c     | 12 ------
>>>>>   net/netfilter/nf_flow_table_core.c    |  2 +
>>>>>   net/netfilter/nf_flow_table_offload.c | 56 +++++++++++++++++++++++----
>>>>>   net/openvswitch/conntrack.c           |  8 ++--
>>>>>   net/openvswitch/conntrack.h           |  6 ++-
>>>>>   net/openvswitch/flow.c                |  4 +-
>>>>>   net/sched/act_ct.c                    |  6 ++-
>>>>>   net/sched/cls_api.c                   |  1 +
>>>>>   net/sched/cls_flower.c                | 10 +++--
>>>>>   16 files changed, 105 insertions(+), 36 deletions(-)
>>>>>
>>>>
>>>>
>>
>>


-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://lists.ubuntu.com/archives/kernel-team/attachments/20210412/b6eeb599/attachment.sig>


More information about the kernel-team mailing list