[SRU][Trusty/Utopic/Vivid/Wily][PATCH 0/3] Fixes for LP:#1527643

Gavin Guo gavin.guo at canonical.com
Sat Mar 12 14:09:08 UTC 2016


BugLink: https://bugs.launchpad.net/bugs/1527643

[Impact]

This is a use-after-free invalid read that occurs only in a narrow race:
NUMA balancing uses numa_faults data that has already been freed when it
decides whether to migrate the exiting process.

The bug was found with an Ubuntu-3.13.0-65 kernel that has KASan backported.
binary package:
http://kernel.ubuntu.com/~gavinguo/kasan/Ubuntu-3.13.0-65.105/

source code:
http://kernel.ubuntu.com/git/gavinguo/ubuntu-trusty-amd64.git/log/?h=Ubuntu-3.13.0-65-kasan

==================================================================
BUG: KASan: use after free in task_numa_find_cpu+0x64c/0x890 at addr ffff880dd393ecd8
Read of size 8 by task qemu-system-x86/3998900
=============================================================================
BUG kmalloc-128 (Tainted: G B ): kasan: bad access detected
-----------------------------------------------------------------------------

INFO: Allocated in task_numa_fault+0xc1b/0xed0 age=41980 cpu=18 pid=3998890
        __slab_alloc+0x4f8/0x560
        __kmalloc+0x1eb/0x280
        task_numa_fault+0xc1b/0xed0
        do_numa_page+0x192/0x200
        handle_mm_fault+0x808/0x1160
        __do_page_fault+0x218/0x750
        do_page_fault+0x1a/0x70
        page_fault+0x28/0x30
        SyS_poll+0x66/0x1a0
        system_call_fastpath+0x1a/0x1f
INFO: Freed in task_numa_free+0x1d2/0x200 age=62 cpu=18 pid=0
        __slab_free+0x2ab/0x3f0
        kfree+0x161/0x170
        task_numa_free+0x1d2/0x200
        finish_task_switch+0x1d2/0x210
        __schedule+0x5d4/0xc60
        schedule_preempt_disabled+0x40/0xc0
        cpu_startup_entry+0x2da/0x340
        start_secondary+0x28f/0x360
INFO: Slab 0xffffea00374e4f00 objects=37 used=17 fp=0xffff880dd393ecb0 flags=0x6ffff0000004080
INFO: Object 0xffff880dd393ecb0 @offset=11440 fp=0xffff880dd393f700

Bytes b4 ffff880dd393eca0: 0c 00 00 00 18 00 00 00 af 63 3a 04 01 00 00 00 .........c:.....
Object ffff880dd393ecb0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff880dd393ecc0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff880dd393ecd0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff880dd393ece0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff880dd393ecf0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff880dd393ed00: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff880dd393ed10: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff880dd393ed20: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
CPU: 61 PID: 3998900 Comm: qemu-system-x86 Tainted: G B 3.13.0-65-generic #105
Hardware name: Supermicro X8QB6/X8QB6, BIOS 2.0c 06/11/2
 ffffea00374e4f00 ffff8816c572b420 ffffffff81a6ce35 ffff88045f00f500
 ffff8816c572b450 ffffffff81244aed ffff88045f00f500 ffffea00374e4f00
 ffff880dd393ecb0 0000000000000012 ffff8816c572b478 ffffffff8124ac36
Call Trace:
 [<ffffffff81a6ce35>] dump_stack+0x45/0x56
 [<ffffffff81244aed>] print_trailer+0xfd/0x170
 [<ffffffff8124ac36>] object_err+0x36/0x40
 [<ffffffff8124cbf9>] kasan_report_error+0x1e9/0x3a0
 [<ffffffff8124d260>] kasan_report+0x40/0x50
 [<ffffffff810dda7c>] ? task_numa_find_cpu+0x64c/0x890
 [<ffffffff8124bee9>] __asan_load8+0x69/0xa0
 [<ffffffff814f5c38>] ? find_next_bit+0xd8/0x120
 [<ffffffff810dda7c>] task_numa_find_cpu+0x64c/0x890
 [<ffffffff810de16c>] task_numa_migrate+0x4ac/0x7b0
 [<ffffffff810de523>] numa_migrate_preferred+0xb3/0xc0
 [<ffffffff810e0b88>] task_numa_fault+0xb88/0xed0
 [<ffffffff8120ef02>] do_numa_page+0x192/0x200
 [<ffffffff81211038>] handle_mm_fault+0x808/0x1160
 [<ffffffff810d7dbd>] ? sched_clock_cpu+0x10d/0x160
 [<ffffffff81068c52>] ? native_load_tls+0x82/0xa0
 [<ffffffff81a7bd68>] __do_page_fault+0x218/0x750
 [<ffffffff810c2186>] ? hrtimer_try_to_cancel+0x76/0x160
 [<ffffffff81a6f5e7>] ? schedule_hrtimeout_range_clock.part.24+0xf7/0x1c0
 [<ffffffff81a7c2ba>] do_page_fault+0x1a/0x70
 [<ffffffff81a772e8>] page_fault+0x28/0x30
 [<ffffffff8128cbd4>] ? do_sys_poll+0x1c4/0x6d0
 [<ffffffff810e64f6>] ? enqueue_task_fair+0x4b6/0xaa0
 [<ffffffff810233c9>] ? sched_clock+0x9/0x10
 [<ffffffff810cf70a>] ? resched_task+0x7a/0xc0
 [<ffffffff810d0663>] ? check_preempt_curr+0xb3/0x130
 [<ffffffff8128b5c0>] ? poll_select_copy_remaining+0x170/0x170
 [<ffffffff810d3bc0>] ? wake_up_state+0x10/0x20
 [<ffffffff8112a28f>] ? drop_futex_key_refs.isra.14+0x1f/0x90
 [<ffffffff8112d40e>] ? futex_requeue+0x3de/0xba0
 [<ffffffff8112e49e>] ? do_futex+0xbe/0x8f0
 [<ffffffff81022c89>] ? read_tsc+0x9/0x20
 [<ffffffff8111bd9d>] ? ktime_get_ts+0x12d/0x170
 [<ffffffff8108f699>] ? timespec_add_safe+0x59/0xe0
 [<ffffffff8128d1f6>] SyS_poll+0x66/0x1a0
 [<ffffffff81a830dd>] system_call_fastpath+0x1a/0x1f
Memory state around the buggy address:
 ffff880dd393eb80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff880dd393ec00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff880dd393ec80: fc fc fc fc fc fc fb fb fb fb fb fb fb fb fb fb
                                                    ^
 ffff880dd393ed00: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
 ffff880dd393ed80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================

--------------------------8<--------------------------
$ addr2line 0xffffffff810dda7c -e usr/lib/debug/boot/vmlinux-3.13.0-65-generic -f -i
task_numa_compare
/home/gavin/os/ubuntu-trusty-amd64/kernel/sched/fair.c:1084
task_numa_find_cpu
/home/gavin/os/ubuntu-trusty-amd64/kernel/sched/fair.c:1170

1083 if (cur->numa_group == env->p->numa_group) {
1084 imp = taskimp + task_weight(cur, env->src_nid) -
1085 task_weight(cur, env->dst_nid);

In short, this is a use-after-free of task_struct->numa_faults. The data is
freed by task_numa_free(), which finish_task_switch() calls when the process
is exiting. Meanwhile, the NUMA balancing mechanism is handling a
do_numa_page() fault and needs to read task_struct->numa_faults to decide
whether the exiting process should be migrated to another CPU, where memory
on the other node is closer and therefore faster to access.

[Fix]

There are 3 patches (referred to as A, B, and C) related to the backport.
However, not every release needs all of them, as some are already included
in the newer kernel versions.

A: 156654f491dd ("sched/numa: Move task_numa_free() to
	__put_task_struct()"): included in v3.15-rc1~180^2~5.

Reason: task_numa_free() should be called from __put_task_struct(), because
	Fix C relies on get_task_struct() to prevent task_numa_free() from
	being called while the task is still being read.

B: 1effd9f19324 ("sched/numa: Fix unsafe get_task_struct() in
	task_numa_assign()"): included in v3.18-rc3~21^2~5.

Reason:	Adds a check of the PF_EXITING flag to ensure the task has not
	been freed.

C: 1dff76b92f69 ("sched/numa: Fix use-after-free bug in the
	task_numa_compare"): included in v4.5-rc2~8^2~1.

Reason: As the commit message of B says, "rcu_read_lock() can't save us
	from the final put_task_struct() in finish_task_switch()". That
	remaining case is what patch C solves.
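
The lifetime rule that A, B, and C establish together can be sketched in
user space. This is only a single-threaded model, not the kernel code from
the patches: every model_* name and MODEL_PF_EXITING are hypothetical
stand-ins, and the locking the kernel needs around rq->curr is shown only
as comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PF_EXITING 0x4u /* hypothetical stand-in for PF_EXITING */

struct model_task {
	int refcount;               /* models the task_struct usage count */
	unsigned int flags;
	bool is_idle;
	unsigned long *numa_faults; /* freed only on the final put */
};

struct model_rq {
	struct model_task *curr;    /* task currently running on this CPU */
};

struct model_task *model_task_new(size_t nr_faults)
{
	struct model_task *t = calloc(1, sizeof(*t));

	if (!t)
		return NULL;
	t->numa_faults = calloc(nr_faults, sizeof(*t->numa_faults));
	if (!t->numa_faults) {
		free(t);
		return NULL;
	}
	t->refcount = 1; /* reference held by the task's owner */
	return t;
}

void model_get_task(struct model_task *t)
{
	assert(t->refcount > 0);
	t->refcount++;
}

/* Returns true when this call dropped the final reference. */
bool model_put_task(struct model_task *t)
{
	assert(t->refcount > 0);
	if (--t->refcount > 0)
		return false;
	/* Idea of patch A: numa_faults goes away with the last reference. */
	free(t->numa_faults);
	free(t);
	return true;
}

/* Returns rq->curr with an extra reference held, or NULL if unusable. */
struct model_task *model_pick_cur(struct model_rq *rq)
{
	struct model_task *cur;

	/* Assumption: the kernel holds a lock protecting rq->curr here. */
	cur = rq->curr;
	if (!cur || cur->is_idle || (cur->flags & MODEL_PF_EXITING))
		cur = NULL;          /* idea of patch B: skip exiting tasks */
	else
		model_get_task(cur); /* idea of patch C: pin before reading */
	/* Assumption: the lock would be released here. */
	return cur;
}
```

A caller would read cur->numa_faults only between model_pick_cur() and the
matching model_put_task(), so the buffer cannot be freed underneath it.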

For v3.13 Trusty there are 3 patches needed:
  - A, B, and C.
For v3.16 Utopic there are 2 patches needed:
  - B and C.
For v3.19 Vivid/v4.2 Wily there is 1 patch needed:
  - C. <-- clean cherry-pick.
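
The per-release lists above follow from the upstream versions quoted for
each patch (A in v3.15, B in v3.18, C in v4.5): a base kernel needs a patch
only if it is older than the release that first included it. A minimal
sketch of that check, where patches_needed() and the PATCH_* flags are
hypothetical names used only for illustration:

```c
#include <assert.h>

/* Bit flags for the patches called A, B, and C in this cover letter. */
enum { PATCH_A = 1 << 0, PATCH_B = 1 << 1, PATCH_C = 1 << 2 };

/* Returns nonzero when version maj.min is older than ref_maj.ref_min. */
int kver_before(int maj, int min, int ref_maj, int ref_min)
{
	if (maj != ref_maj)
		return maj < ref_maj;
	return min < ref_min;
}

/* Returns the set of patches a kernel based on maj.min still lacks. */
int patches_needed(int maj, int min)
{
	int mask = 0;

	if (kver_before(maj, min, 3, 15)) /* A: included in v3.15-rc1 */
		mask |= PATCH_A;
	if (kver_before(maj, min, 3, 18)) /* B: included in v3.18-rc3 */
		mask |= PATCH_B;
	if (kver_before(maj, min, 4, 5))  /* C: included in v4.5-rc2 */
		mask |= PATCH_C;
	return mask;
}
```

For example, patches_needed(3, 16) yields PATCH_B | PATCH_C, matching the
Utopic entry. The sketch looks only at the base version and does not
account for fixes a distribution kernel may already carry.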

[Test Case]

After running the reproducer for about 4 weeks on the backported Trusty
kernel, no KASan error messages appeared in dmesg.

Reproducer:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1527643/+attachment/4595998/+files/kernel_panic_test.sh

Gavin Guo (1):
  sched/numa: Fix use-after-free bug in the task_numa_compare

Kirill Tkhai (1):
  sched/numa: Fix unsafe get_task_struct() in task_numa_assign()

Mike Galbraith (1):
  sched/numa: Move task_numa_free() to __put_task_struct()

 kernel/fork.c       |  1 +
 kernel/sched/core.c |  1 -
 kernel/sched/fair.c | 33 +++++++++++++++++++++++++++++----
 3 files changed, 30 insertions(+), 5 deletions(-)

-- 
2.0.0




