ACK: [SRU V2 X/C] UBUNTU: SAUCE: ipv6: frags: fix skb extraction in ip6_expire_frag_queue()

Colin Ian King colin.king at canonical.com
Thu Jun 6 10:35:29 UTC 2019


On 06/06/2019 11:24, Stefan Bader wrote:
> BugLink: https://bugs.launchpad.net/bugs/1824687
> 
> The backport of
> 
>   05c0b86b96 ("ipv6: frags: rewrite ip6_expire_frag_queue()")
> 
> to linux-4.4.y stable changed ip6_expire_frag_queue() to be similar
> to ip_expire(). However, using skb_get() leads to a crash while
> sending the ICMP message due to a check for shared SKBs.
> 
>    kernel BUG at linux-4.4.0/net/core/skbuff.c:1207!
>    RIP: 0010:[<ffffffff81740953>]
>     [<ffffffff81740953>] pskb_expand_head+0x243/0x250
>     [<ffffffff81740e50>] __pskb_pull_tail+0x50/0x350
>     [<ffffffff8183939a>] _decode_session6+0x26a/0x400
>     [<ffffffff817ec719>] __xfrm_decode_session+0x39/0x50
>     [<ffffffff818239d0>] icmpv6_route_lookup+0xf0/0x1c0
>     [<ffffffff81824421>] icmp6_send+0x5e1/0x940
>     [<ffffffff8183d431>] icmpv6_send+0x21/0x30
>     [<ffffffff8182b500>] ip6_expire_frag_queue+0xe0/0x120
> 
> For IPv4 the ip_expire() function however did change considerably
> since then. In
> 
>   fa0f527358 ("ip: use rb trees for IP frag queue.")
> 
> the SKB might be taken from a rbtree (use of rbtrees for IPv4 was
> backported to 4.4.y upstream).
> Along with those obvious changes, the code also is modified to
> actually de-queue the SKB from whichever source it was taken.
> This also got rid of the skb_get() which causes problems in
> icmpv6_send(). And latest upstream code uses inet_frag_pull_head()
> which does the same.
> 
> To fix the crash in IPv6, we use the same modifications added
> to ip_expire() by fa0f527358. This might be too much change for
> now because IPv6 only starts using rbtrees for frags with
> 
>   997dd96471 ("net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c")
> 
> which has not been backported to 4.4.y. Testing by a reporter was
> showing good results. Likely the else part never gets used until
> 997dd96471 is backported, too. And that needs more changes.
> Some upstream (stable) discussion was started but has not yet
> resulted in any usable results. So adding this as SAUCE for now
> to get the kernel stable (based on testing).
> 
> Fixes: bf8187348f ("ipv6: frags: rewrite ip6_expire_frag_queue()")
>        in the linux-4.4.y stable tree.
> (based-on: f78a3f45e7 ("ip: use rb trees for IP frag queue." 4.4.y))
> Signed-off-by: Stefan Bader <stefan.bader at canonical.com>
> ---
>  net/ipv6/reassembly.c | 26 +++++++++++++++++++++-----
>  1 file changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
> index ec917f58d105..5b09ce54c476 100644
> --- a/net/ipv6/reassembly.c
> +++ b/net/ipv6/reassembly.c
> @@ -92,7 +92,7 @@ EXPORT_SYMBOL(ip6_frag_init);
>  void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq)
>  {
>  	struct net_device *dev = NULL;
> -	struct sk_buff *head;
> +	struct sk_buff *head = NULL;
>  
>  	rcu_read_lock();
>  	spin_lock(&fq->q.lock);
> @@ -110,26 +110,42 @@ void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq)
>  	IP6_INC_STATS_BH(net, __in6_dev_get(dev), IPSTATS_MIB_REASMTIMEOUT);
>  
>  	/* Don't send error if the first segment did not arrive. */
> -	head = fq->q.fragments;
> -	if (!(fq->q.flags & INET_FRAG_FIRST_IN) || !head)
> +	if (!(fq->q.flags & INET_FRAG_FIRST_IN))
>  		goto out;
>  
> +	if (fq->q.fragments) {
> +		head = fq->q.fragments;
> +		fq->q.fragments = head->next;
> +	} else {
> +		head = skb_rb_first(&fq->q.rb_fragments);
> +		if (!head)
> +			goto out;
> +		rb_erase(&head->rbnode, &fq->q.rb_fragments);
> +		memset(&head->rbnode, 0, sizeof(head->rbnode));
> +		barrier();
> +	}
> +
> +	if (head == fq->q.fragments_tail)
> +		fq->q.fragments_tail = NULL;
> +
> +	sub_frag_mem_limit(fq->q.net, head->truesize);
> +
>  	/* But use as source device on which LAST ARRIVED
>  	 * segment was received. And do not use fq->dev
>  	 * pointer directly, device might already disappeared.
>  	 */
>  	head->dev = dev;
> -	skb_get(head);
>  	spin_unlock(&fq->q.lock);
>  
>  	icmpv6_send(head, ICMPV6_TIME_EXCEED, ICMPV6_EXC_FRAGTIME, 0);
> -	kfree_skb(head);
>  	goto out_rcu_unlock;
>  
>  out:
>  	spin_unlock(&fq->q.lock);
>  out_rcu_unlock:
>  	rcu_read_unlock();
> +	if (head)
> +		kfree_skb(head);
>  	inet_frag_put(&fq->q);
>  }
>  EXPORT_SYMBOL(ip6_expire_frag_queue);
> 

this now closer matches the intent from the commit fa0f527358bd900ef9
that it is based on.  The original had positive test results, so I'm
hoping this will be OK too (the delta is only minor) and addresses the
concerns in Andrea's comments.

So, based on this,

Acked-by: Colin Ian King <colin.king at canonical.com>



More information about the kernel-team mailing list