[Bug 330806] Re: (Apparently) spurious, intermittent OOM killer intervention
Przemysław Kulczycki
przemekkulczycki at gmail.com
Thu Apr 8 11:30:18 BST 2010
Also see Bug #304660
--
(Apparently) spurious, intermittent OOM killer intervention
https://bugs.launchpad.net/bugs/330806
You received this bug notification because you are a member of Ubuntu
Sponsors Team, which is a direct subscriber.
Status in “linux” package in Ubuntu: Confirmed
Status in “linux” package in CentOS: Unknown
Bug description:
Description: Ubuntu 8.04.2
Release: 8.04
Recently, the OOM killer has begun to be triggered periodically for no obvious reason. It generally kills one or two processes and then decides it is satisfied. As far as I can tell, there is no actual out-of-memory situation when these events happen, and I cannot correlate them with any particular activity on my part. System memory usage should be relatively stable at the times they occur. According to the OOM reports, I have plenty of swap remaining, at least; I'm not sure how to extract the amount of free non-swap memory from the reports. Here is the most recent:
Feb 17 18:59:22 charm kernel: [120648.492789] ssh invoked oom-killer: gfp_mask=0x44d0, order=2, oomkilladj=0
Feb 17 18:59:22 charm kernel: [120648.492797] Pid: 8640, comm: ssh Tainted: P 2.6.24-23-generic #1
Feb 17 18:59:22 charm kernel: [120648.492825] [oom_kill_process+0x10a/0x120] oom_kill_process+0x10a/0x120
Feb 17 18:59:22 charm kernel: [120648.492914] [out_of_memory+0x167/0x1a0] out_of_memory+0x167/0x1a0
Feb 17 18:59:22 charm kernel: [120648.492957] [agpgart:__alloc_pages+0x36c/0x3a0] __alloc_pages+0x36c/0x3a0
Feb 17 18:59:22 charm kernel: [120648.492970] [__rmqueue+0x138/0x1f0] __rmqueue+0x138/0x1f0
Feb 17 18:59:22 charm kernel: [120648.493020] [ext3:__get_free_pages+0x37/0x820] __get_free_pages+0x37/0x50
Feb 17 18:59:22 charm kernel: [120648.493028] [forcedeth:__alloc_skb+0x55/0x3ae0] __alloc_skb+0x55/0x120
Feb 17 18:59:22 charm kernel: [120648.493052] [sock_alloc_send_skb+0x162/0x1b0] sock_alloc_send_skb+0x162/0x1b0
Feb 17 18:59:22 charm kernel: [120648.493098] [unix_stream_sendmsg+0x29e/0x3a0] unix_stream_sendmsg+0x29e/0x3a0
Feb 17 18:59:22 charm kernel: [120648.493111] [unix_stream_sendmsg+0x194/0x3a0] unix_stream_sendmsg+0x194/0x3a0
Feb 17 18:59:22 charm kernel: [120648.493137] [core_sys_select+0x213/0x2f0] core_sys_select+0x213/0x2f0
Feb 17 18:59:22 charm kernel: [120648.493183] [sock_aio_write+0x109/0x120] sock_aio_write+0x109/0x120
Feb 17 18:59:22 charm kernel: [120648.493205] [ip_local_deliver_finish+0xf9/0x210] ip_local_deliver_finish+0xf9/0x210
Feb 17 18:59:22 charm kernel: [120648.493271] [ext3:do_sync_write+0xd5/0x120] do_sync_write+0xd5/0x120
Feb 17 18:59:22 charm kernel: [120648.493284] [ip_rcv+0x0/0x290] ip_rcv+0x0/0x290
Feb 17 18:59:22 charm kernel: [120648.493337] [<c0140c00>] autoremove_wake_function+0x0/0x40
Feb 17 18:59:22 charm kernel: [120648.493364] [net_rx_action+0x12d/0x210] net_rx_action+0x12d/0x210
Feb 17 18:59:22 charm kernel: [120648.493428] [vfs_write+0x161/0x170] vfs_write+0x161/0x170
Feb 17 18:59:22 charm kernel: [120648.493453] [sys_write+0x41/0x70] sys_write+0x41/0x70
Feb 17 18:59:22 charm kernel: [120648.493479] [sysenter_past_esp+0x6b/0xa9] sysenter_past_esp+0x6b/0xa9
Feb 17 18:59:22 charm kernel: [120648.493547] =======================
Feb 17 18:59:22 charm kernel: [120648.493550] Mem-info:
Feb 17 18:59:22 charm kernel: [120648.493553] DMA per-cpu:
Feb 17 18:59:22 charm kernel: [120648.493556] CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Feb 17 18:59:22 charm kernel: [120648.493560] CPU 1: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Feb 17 18:59:22 charm kernel: [120648.493562] Normal per-cpu:
Feb 17 18:59:22 charm kernel: [120648.493565] CPU 0: Hot: hi: 186, btch: 31 usd: 0 Cold: hi: 62, btch: 15 usd: 0
Feb 17 18:59:22 charm kernel: [120648.493569] CPU 1: Hot: hi: 186, btch: 31 usd: 0 Cold: hi: 62, btch: 15 usd: 0
Feb 17 18:59:22 charm kernel: [120648.493572] HighMem per-cpu:
Feb 17 18:59:22 charm kernel: [120648.493575] CPU 0: Hot: hi: 186, btch: 31 usd: 0 Cold: hi: 62, btch: 15 usd: 0
Feb 17 18:59:22 charm kernel: [120648.493578] CPU 1: Hot: hi: 186, btch: 31 usd: 0 Cold: hi: 62, btch: 15 usd: 0
Feb 17 18:59:22 charm kernel: [120648.493582] Active:177754 inactive:36764 dirty:70 writeback:0 unstable:0
Feb 17 18:59:22 charm kernel: [120648.493584] free:96867 slab:194280 mapped:22909 pagetables:855 bounce:0
Feb 17 18:59:22 charm kernel: [120648.493588] DMA free:3552kB min:68kB low:84kB high:100kB active:0kB inactive:0kB present:16256kB pages_scanned:0 all_unreclaimable? no
Feb 17 18:59:22 charm kernel: [120648.493591] lowmem_reserve[]: 0 873 2016 2016
Feb 17 18:59:22 charm kernel: [120648.493597] Normal free:100832kB min:3744kB low:4680kB high:5616kB active:108kB inactive:0kB present:894080kB pages_scanned:0 all_unreclaimable? no
Feb 17 18:59:22 charm kernel: [120648.493601] lowmem_reserve[]: 0 0 9143 9143
Feb 17 18:59:22 charm kernel: [120648.493608] HighMem free:283084kB min:512kB low:1736kB high:2964kB active:710908kB inactive:147056kB present:1170372kB pages_scanned:0 all_unreclaimable? no
Feb 17 18:59:22 charm kernel: [120648.493611] lowmem_reserve[]: 0 0 0 0
Feb 17 18:59:22 charm kernel: [120648.493615] DMA: 6*4kB 4*8kB 1*16kB 5*32kB 0*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3560kB
Feb 17 18:59:22 charm kernel: [120648.493629] Normal: 24663*4kB 95*8kB 3*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 100836kB
Feb 17 18:59:22 charm kernel: [120648.493643] HighMem: 9667*4kB 10630*8kB 4910*16kB 858*32kB 232*64kB 188*128kB 51*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 283228kB
Feb 17 18:59:22 charm kernel: [120648.493657] Swap cache: add 90828, delete 83833, find 21986/26825, race 0+0
Feb 17 18:59:22 charm kernel: [120648.493660] Free swap = 6005904kB
Feb 17 18:59:22 charm kernel: [120648.493662] Total swap = 6080560kB
Feb 17 18:59:22 charm kernel: [120648.493665] Free swap: 6005904kB
Feb 17 18:59:22 charm kernel: [120648.502363] 524272 pages of RAM
Feb 17 18:59:22 charm kernel: [120648.502369] 294896 pages of HIGHMEM
Feb 17 18:59:22 charm kernel: [120648.502372] 10317 reserved pages
Feb 17 18:59:22 charm kernel: [120648.502373] 135455 pages shared
Feb 17 18:59:22 charm kernel: [120648.502376] 6995 pages swap cached
Feb 17 18:59:22 charm kernel: [120648.502378] 70 pages dirty
Feb 17 18:59:22 charm kernel: [120648.502380] 0 pages writeback
Feb 17 18:59:22 charm kernel: [120648.502383] 22909 pages mapped
Feb 17 18:59:22 charm kernel: [120648.502385] 194280 pages slab
Feb 17 18:59:22 charm kernel: [120648.502387] 855 pages pagetables
Feb 17 18:59:22 charm kernel: [120648.502391] Out of memory: kill process 9218 (VBoxSVC) score 148884 or a child
Feb 17 18:59:22 charm kernel: [120648.502421] Killed process 4236 (VirtualBox)
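Regarding the question above about free non-swap memory: the per-zone free: figures in the Mem-info block can simply be summed (DMA + Normal + HighMem), which for the report above comes to roughly 380 MB free. A minimal sketch of doing this automatically, assuming the report has been saved to a plain text file (the file name below is only an example):

    #!/usr/bin/env python
    # Rough sketch: sum the per-zone "free:<N>kB" figures from a saved OOM
    # report (e.g. grepped out of /var/log/kern.log) to estimate free RAM
    # at the time of the kill.
    import re
    import sys

    free_kb = 0
    with open(sys.argv[1]) as report:      # e.g. oom-report.txt (example name)
        for line in report:
            # Matches zone lines such as "Normal free:100832kB min:3744kB ..."
            m = re.search(r'\bfree:(\d+)kB', line)
            if m:
                free_kb += int(m.group(1))

    print("free (non-swap) memory at OOM time: %d kB" % free_kb)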
As this has been happening frequently, I contrived to record a log of the processes running shortly before the OOM killer was invoked. This revealed nothing of particular interest, other than confirming that there were apparently no processes with runaway memory usage.
As indicated by the above OOM report, that event took place at 18:59:22. Attached is a tarball of ps fauxwww output, generated approximately once a second, for the entire minute of 18:59. VirtualBox (4236) is still running in ps-log.1234915161, but by ps-log.1234915163 (two seconds later) it has fallen prey to the OOM killer. As the logs show, as of ps-log.1234915161 (Tue Feb 17 18:59:21, one second before the OOM report above), no process was using more than a few percent of memory.
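For reference, logs like the attached ones can be produced with a loop along the lines of the sketch below (not necessarily the exact script used for the tarball); it writes ps fauxwww output to a ps-log.<epoch-seconds> file once a second:

    #!/usr/bin/env python
    # Sketch of the logging approach described above: dump "ps fauxwww" to a
    # ps-log.<epoch-seconds> file once a second. Illustrative only; not
    # necessarily the exact script used to produce the attached tarball.
    import subprocess
    import time

    while True:
        stamp = int(time.time())
        with open('ps-log.%d' % stamp, 'w') as out:
            subprocess.call(['ps', 'fauxwww'], stdout=out)
        time.sleep(1)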
When these events occur, there is a visible indication from the user's perspective: gnome-system-monitor shows one CPU going to 100% I/O wait and the system becomes generally unresponsive for anywhere from a few seconds to a minute. It's not clear how closely the OOM kill correlates with the machine becoming responsive again. Since I can't look at the logs to see the OOM report until the system is responsive, the OOM kill always appears to happen before responsiveness returns, but that doesn't establish that the kill is what actually restores responsiveness.
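If it helps with lining the stalls up against the OOM reports in kern.log, I/O wait can also be sampled directly from /proc/stat rather than eyeballed in gnome-system-monitor. A minimal sketch, using the documented field order of the aggregate "cpu" line:

    #!/usr/bin/env python
    # Minimal sketch: print a timestamped iowait percentage once a second so
    # the unresponsive periods can be matched against OOM reports in kern.log.
    # Reads the aggregate "cpu" line of /proc/stat (5th value is iowait).
    import time

    def cpu_times():
        with open('/proc/stat') as f:
            fields = f.readline().split()   # "cpu user nice system idle iowait ..."
        return [int(x) for x in fields[1:]]

    prev = cpu_times()
    while True:
        time.sleep(1)
        cur = cpu_times()
        delta = [c - p for c, p in zip(cur, prev)]
        total = sum(delta) or 1
        print('%s iowait %.1f%%' % (time.strftime('%H:%M:%S'),
                                    100.0 * delta[4] / total))
        prev = cur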
Perhaps the summary for this bug would be better as "System intermittently becomes unusable", since the OOM kills may just be a side effect of some other problem.
This behavior may be new since the upgrade to linux-image-2.6.24-23-generic version 2.6.24-23.48; as far as I recall, I didn't observe it before the upgrade. I believe the first OOM kill came several days after the upgrade, though, and the events now seem to be more frequent than that (twice today, once yesterday).