NACK/Cmnt: [SRU][F:linux-bluefield][PATCH 0/9] CT offload fixes

Roi Dayan roid at nvidia.com
Sun Apr 11 10:20:40 UTC 2021



On 2021-04-09 4:05 PM, Stefan Bader wrote:
> On 06.04.21 19:52, Roi Dayan wrote:
>> BugLink: 
>> https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1922682
>> BugLink: 
>> https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1922678
>> BugLink: 
>> https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1922672
>>
>>
> Hi Roi,
> 
> since the 3 issues appear to be independent of each other, this should 
> be 3 individual submissions to the mailing list. For multiple reasons: 
> They can then be treated as units of their own. That makes each piece 
> smaller (to look at) and also allows to potentially handle in parallel. 
> But also prevents one part that maybe rises questions blocking the other 
> parts.
> 
> The SRU justification has to be in each bug report (best in the 
> description because that place is prominently at the top and also can be 
> adjusted / corrected at any time). The content of each section is best 
> not too technical. It is basically the place where program managers, 
> developers, and also testers look at. The "regression potential" part 
> (here "what (it) could break" is something that should be written with 
> user / tester experience in mind. So which areas or activities might be 
> affected. The current statements actually look to be good that way. I 
> just mention it again since that is something that is misunderstood often.
> 
> With the SRU justification in the bug report, the cover email does not 
> necessarily have to repeat that (but it can). The cover email can be 
> used to pass on helpful information for developers looking at the 
> patches. That part is usually read only by technical people.
> 
> A final note on the BugLink URLs: We prefer the short version of
> 
> https://bugs.launchpad.net/bugs/<bugnumber>
> 
> because this is the form that will always be valid, even if the 
> project/package name changes. Not that this is likely for this kernel 
> but then at least things look the same across all projects. I heard that 
> this might be a mail servier config issue. So if that is the only thing 
> that is not quite right, we would adjust that part when applying the 
> patches. But we would appreciate not to have to for sure. :)
> 
> So once the bug reports are updated, please send the individual pieces 
> again.
> Thanks!
> 
> -Stefan

great. thanks for the comments.  i'll resubmit.
is a cover letter still needed if its a single patch?
could the SRU justification comments be inside the commit msg after the
patch separator? (---) ?


>> SRU Justification:
>>
>> 1. The first 2 patches are fixing a race, potentially crashing the 
>> kernel.
>> 2. The next 2 patches are fixing a possible memory hog and aging active
>>     ct conns.
>> 3. The last patches are adding offload support for ct_state invalid and
>>     ct_state reply.
>>
>> * Explain the bug(s)
>>
>> 1. The kernel crash can happen on stress tcp traffic opening and closing
>>     the conns fast.
>>
>> 2. The memory hog and aging active ct conns can happen from any stress 
>> test
>>     as we have a single workqueue for handling the ct offload conns
>>     for add/del/stats.
>>
>> * brief explanation of fixes
>>
>> The fix for #1 is setting the offload timeout early and not relying on 
>> gc.
>>
>> The fix for #2 is splitting add/del/stats for diff workqueue and also
>> we set a limit for add work entries.
>>
>> * How to test
>>
>> Testing #1 was done with stress http traffic opening conns, short 
>> data, close conns.
>> different 5-tuple each time.
>>
>> Testing #2 was done with just stress traffic with lots of conns 
>> different 5-tuple.
>>
>> * What it could break.
>>
>> Issue #1 could potentially crash the kernel.
>>
>> Issue #2 can take a lot of memory for a long time and also causing 
>> active conns to
>> age out when not necessary.
>>
>>
>>   include/linux/skbuff.h                |  5 ++-
>>   include/net/flow_offload.h            |  1 +
>>   include/net/netfilter/nf_conntrack.h  | 12 ++++++
>>   include/net/sch_generic.h             |  1 +
>>   include/uapi/linux/pkt_cls.h          |  2 +
>>   net/core/dev.c                        |  2 +
>>   net/core/flow_dissector.c             | 13 +++++--
>>   net/netfilter/nf_conntrack_core.c     | 12 ------
>>   net/netfilter/nf_flow_table_core.c    |  2 +
>>   net/netfilter/nf_flow_table_offload.c | 56 +++++++++++++++++++++++----
>>   net/openvswitch/conntrack.c           |  8 ++--
>>   net/openvswitch/conntrack.h           |  6 ++-
>>   net/openvswitch/flow.c                |  4 +-
>>   net/sched/act_ct.c                    |  6 ++-
>>   net/sched/cls_api.c                   |  1 +
>>   net/sched/cls_flower.c                | 10 +++--
>>   16 files changed, 105 insertions(+), 36 deletions(-)
>>
> 
> 



More information about the kernel-team mailing list