[SRU Bionic] ip: frags: fix crash in ip_do_fragment()

Mauricio Oliveira mauricio.oliveira at canonical.com
Wed Sep 4 18:10:16 UTC 2019


On Wed, Sep 4, 2019 at 2:41 PM Thadeu Lima de Souza Cascardo
<cascardo at canonical.com> wrote:
>
> From: Taehee Yoo <ap420073 at gmail.com>
>
> BugLink: https://bugs.launchpad.net/bugs/1842447
>
> commit 5d407b071dc369c26a38398326ee2be53651cfe4 upstream
>
> A kernel crash occurrs when defragmented packet is fragmented
> in ip_do_fragment().
> In defragment routine, skb_orphan() is called and
> skb->ip_defrag_offset is set. but skb->sk and
> skb->ip_defrag_offset are same union member. so that
> frag->sk is not NULL.
> Hence crash occurrs in skb->sk check routine in ip_do_fragment() when
> defragmented packet is fragmented.
>
> test commands:
>    %iptables -t nat -I POSTROUTING -j MASQUERADE
>    %hping3 192.168.4.2 -s 1000 -p 2000 -d 60000
>
> splat looks like:
> [  261.069429] kernel BUG at net/ipv4/ip_output.c:636!

Thanks for sending this one out!

This is probably the issue responsible for a late NAK to a bnx2x patch recently,
per the line/function, although with a different stack trace in the
forwarding path.

I'll ask that reporter to check this patch too, but response may not
happen in a timely manner.

cheers,
Mauricio

> [  261.075753] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
> [  261.083854] CPU: 1 PID: 1349 Comm: hping3 Not tainted 4.19.0-rc2+ #3
> [  261.100977] RIP: 0010:ip_do_fragment+0x1613/0x2600
> [  261.106945] Code: e8 e2 38 e3 fe 4c 8b 44 24 18 48 8b 74 24 08 e9 92 f6 ff ff 80 3c 02 00 0f 85 da 07 00 00 48 8b b5 d0 00 00 00 e9 25 f6 ff ff <0f> 0b 0f 0b 44 8b 54 24 58 4c 8b 4c 24 18 4c 8b 5c 24 60 4c 8b 6c
> [  261.127015] RSP: 0018:ffff8801031cf2c0 EFLAGS: 00010202
> [  261.134156] RAX: 1ffff1002297537b RBX: ffffed0020639e6e RCX: 0000000000000004
> [  261.142156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880114ba9bd8
> [  261.150157] RBP: ffff880114ba8a40 R08: ffffed0022975395 R09: ffffed0022975395
> [  261.158157] R10: 0000000000000001 R11: ffffed0022975394 R12: ffff880114ba9ca4
> [  261.166159] R13: 0000000000000010 R14: ffff880114ba9bc0 R15: dffffc0000000000
> [  261.174169] FS:  00007fbae2199700(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000
> [  261.183012] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  261.189013] CR2: 00005579244fe000 CR3: 0000000119bf4000 CR4: 00000000001006e0
> [  261.198158] Call Trace:
> [  261.199018]  ? dst_output+0x180/0x180
> [  261.205011]  ? save_trace+0x300/0x300
> [  261.209018]  ? ip_copy_metadata+0xb00/0xb00
> [  261.213034]  ? sched_clock_local+0xd4/0x140
> [  261.218158]  ? kill_l4proto+0x120/0x120 [nf_conntrack]
> [  261.223014]  ? rt_cpu_seq_stop+0x10/0x10
> [  261.227014]  ? find_held_lock+0x39/0x1c0
> [  261.233008]  ip_finish_output+0x51d/0xb50
> [  261.237006]  ? ip_fragment.constprop.56+0x220/0x220
> [  261.243011]  ? nf_ct_l4proto_register_one+0x5b0/0x5b0 [nf_conntrack]
> [  261.250152]  ? rcu_is_watching+0x77/0x120
> [  261.255010]  ? nf_nat_ipv4_out+0x1e/0x2b0 [nf_nat_ipv4]
> [  261.261033]  ? nf_hook_slow+0xb1/0x160
> [  261.265007]  ip_output+0x1c7/0x710
> [  261.269005]  ? ip_mc_output+0x13f0/0x13f0
> [  261.273002]  ? __local_bh_enable_ip+0xe9/0x1b0
> [  261.278152]  ? ip_fragment.constprop.56+0x220/0x220
> [  261.282996]  ? nf_hook_slow+0xb1/0x160
> [  261.287007]  raw_sendmsg+0x21f9/0x4420
> [  261.291008]  ? dst_output+0x180/0x180
> [  261.297003]  ? sched_clock_cpu+0x126/0x170
> [  261.301003]  ? find_held_lock+0x39/0x1c0
> [  261.306155]  ? stop_critical_timings+0x420/0x420
> [  261.311004]  ? check_flags.part.36+0x450/0x450
> [  261.315005]  ? _raw_spin_unlock_irq+0x29/0x40
> [  261.320995]  ? _raw_spin_unlock_irq+0x29/0x40
> [  261.326142]  ? cyc2ns_read_end+0x10/0x10
> [  261.330139]  ? raw_bind+0x280/0x280
> [  261.334138]  ? sched_clock_cpu+0x126/0x170
> [  261.338995]  ? check_flags.part.36+0x450/0x450
> [  261.342991]  ? __lock_acquire+0x4500/0x4500
> [  261.348994]  ? inet_sendmsg+0x11c/0x500
> [  261.352989]  ? dst_output+0x180/0x180
> [  261.357012]  inet_sendmsg+0x11c/0x500
> [ ... ]
>
> v2:
>  - clear skb->sk at reassembly routine.(Eric Dumarzet)
>
> Fixes: fa0f527358bd ("ip: use rb trees for IP frag queue.")
> Suggested-by: Eric Dumazet <edumazet at google.com>
> Signed-off-by: Taehee Yoo <ap420073 at gmail.com>
> Reviewed-by: Eric Dumazet <edumazet at google.com>
> Signed-off-by: David S. Miller <davem at davemloft.net>
> Signed-off-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
> (backported from commit 08fb833b40e361ce927c64d40e348af96996d9eb)
>
> [cascardo:
> The backport here misses the hunk at net/ipv6/netfilter/nf_conntrack_reasm.c.
> This one has been changed by our commit (net: IP6 defrag: use rbtrees in
> nf_conntrack_reasm.c), and calls inet_frag_reasm_finish, which resets sk to
> NULL.
> ]
>
> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo at canonical.com>
> ---
>  net/ipv4/ip_fragment.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
> index 73ca84d880b9..098c6dcff305 100644
> --- a/net/ipv4/ip_fragment.c
> +++ b/net/ipv4/ip_fragment.c
> @@ -606,6 +606,7 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb,
>                         nextp = &fp->next;
>                         fp->prev = NULL;
>                         memset(&fp->rbnode, 0, sizeof(fp->rbnode));
> +                       fp->sk = NULL;
>                         head->data_len += fp->len;
>                         head->len += fp->len;
>                         if (head->ip_summed != fp->ip_summed)
> --
> 2.20.1
>
>
> --
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team



-- 
Mauricio Faria de Oliveira



More information about the kernel-team mailing list