NACK: [SRU X/C] UBUNTU: SAUCE: ipv6: frags: fix skb extraction in ip6_expire_frag_queue()
Stefan Bader
stefan.bader at canonical.com
Thu Jun 6 10:12:29 UTC 2019
On 06.06.19 12:02, Andrea Righi wrote:
> On Thu, Jun 06, 2019 at 11:31:51AM +0200, Stefan Bader wrote:
>> BugLink: https://bugs.launchpad.net/bugs/1824687
>>
>> The backport of 05c0b86b9696802fd0ce5676a92a63f1b455bdf3 upstream
>
> To be extremely picky here I'd use the standard git commit description
> style, something like:
>
> The backport of commit 05c0b86b9696 ("ipv6: frags: rewrite
> ip6_expire_frag_queue()") upstream in linux-4.4.y ...
Meh ok, since I have to re-do for below...
>
>> in linux-4.4.y stable changed ip6_expire_frag_queue() to be similar
>> to ip_expire(). However, using skb_get() leads to a crash while
>> sending the ICMP message due to a check for shared SKBs.
>>
>> kernel BUG at linux-4.4.0/net/core/skbuff.c:1207!
>> RIP: 0010:[<ffffffff81740953>]
>> [<ffffffff81740953>] pskb_expand_head+0x243/0x250
>> [<ffffffff81740e50>] __pskb_pull_tail+0x50/0x350
>> [<ffffffff8183939a>] _decode_session6+0x26a/0x400
>> [<ffffffff817ec719>] __xfrm_decode_session+0x39/0x50
>> [<ffffffff818239d0>] icmpv6_route_lookup+0xf0/0x1c0
>> [<ffffffff81824421>] icmp6_send+0x5e1/0x940
>> [<ffffffff8183d431>] icmpv6_send+0x21/0x30
>> [<ffffffff8182b500>] ip6_expire_frag_queue+0xe0/0x120
>>
>> For IPv4 the ip_expire() function however did change considerably
>> since then. In
>>
>> fa0f527358 "ip: use rb trees for IP frag queue."
>>
>> the SKB might be taken from a rbtree (use of rbtrees for IPv4 was
>> backported to 4.4.y upstream).
>> Along with those obvious changes, the code also is modified to
>> actually de-queue the SKB from whichever source it was taken.
>> This also got rid of the skb_get() which causes problems in
>> icmpv6_send(). And latest upstream code uses inet_frag_pull_head()
>> which does the same.
>>
>> To fix the crash in IPv6, we use the same modifications added
>> to ip_expire() by fa0f527358. This might be too much change for
>> now because IPv6 only starts using rbtrees for frags with
>>
>> 997dd96471 "net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c"
>>
>> which has not been backported to 4.4.y. Testing by a reporter was
>> showing good results. Likely the else part never gets used until
>> 997dd96471 is backported, too. And that needs more changes.
>> Some upstream (stable) discussion was started but has not yet
>> resulted in any usable results. So adding this as SAUCE for now
>> to get the kernel stable (based on testing).
>>
>> Fixes: bf8187348f "ipv6: frags: rewrite ip6_expire_frag_queue()"
>> in the linux-4.4.y stable tree.
>> (based-on: f78a3f45e7 "ip: use rb trees for IP frag queue." 4.4.y)
>> Signed-off-by: Stefan Bader <stefan.bader at canonical.com>
>> ---
>> net/ipv6/reassembly.c | 21 ++++++++++++++++++---
>> 1 file changed, 18 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
>> index ec917f58d105..b1949b37ac8c 100644
>> --- a/net/ipv6/reassembly.c
>> +++ b/net/ipv6/reassembly.c
>> @@ -110,16 +110,31 @@ void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq)
>> IP6_INC_STATS_BH(net, __in6_dev_get(dev), IPSTATS_MIB_REASMTIMEOUT);
>>
>> /* Don't send error if the first segment did not arrive. */
>> - head = fq->q.fragments;
>> - if (!(fq->q.flags & INET_FRAG_FIRST_IN) || !head)
>> + if (!(fq->q.flags & INET_FRAG_FIRST_IN))
>> goto out;
>>
>> + if (fq->q.fragments) {
>> + head = fq->q.fragments;
>> + fq->q.fragments = head->next;
>> + } else {
>> + head = skb_rb_first(&fq->q.rb_fragments);
>> + if (!head)
>> + goto out;
>> + rb_erase(&head->rbnode, &fq->q.rb_fragments);
>> + memset(&head->rbnode, 0, sizeof(head->rbnode));
>> + barrier();
>> + }
>> +
>> + if (head == fq->q.fragments_tail)
>> + fq->q.fragments_tail = NULL;
>> +
>> + sub_frag_mem_limit(fq->q.net, head->truesize);
>> +
>> /* But use as source device on which LAST ARRIVED
>> * segment was received. And do not use fq->dev
>> * pointer directly, device might already disappeared.
>> */
>> head->dev = dev;
>> - skb_get(head);
>> spin_unlock(&fq->q.lock);
>>
>> icmpv6_send(head, ICMPV6_TIME_EXCEED, ICMPV6_EXC_FRAGTIME, 0);
>
> We can't see from the patch, but, looking at the code, fa0f527358 is
> also moving a kfree_skb(head) after rcu_read_unlock() in ip_expire(). I
> was wondering if we should do the same here... in general it looks more
> safe to kfree() outside the RCU section, but I don't know if it's
> actually needed here.
>
> Something like the following (on top of your patch). What do you think?
I did not consider the rcu section. But makes sense, though I then also have to
init head to NULL.
>
> net/ipv6/reassembly.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
> index b1949b37ac8c..c0e2ff764e3f 100644
> --- a/net/ipv6/reassembly.c
> +++ b/net/ipv6/reassembly.c
> @@ -138,13 +138,14 @@ void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq)
> spin_unlock(&fq->q.lock);
>
> icmpv6_send(head, ICMPV6_TIME_EXCEED, ICMPV6_EXC_FRAGTIME, 0);
> - kfree_skb(head);
> goto out_rcu_unlock;
>
> out:
> spin_unlock(&fq->q.lock);
> out_rcu_unlock:
> rcu_read_unlock();
> + if (head)
> + kfree_skb(head);
> inet_frag_put(&fq->q);
> }
> EXPORT_SYMBOL(ip6_expire_frag_queue);
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://lists.ubuntu.com/archives/kernel-team/attachments/20190606/d1f294dd/attachment-0001.sig>
More information about the kernel-team
mailing list