I am running an Amazon AWS Micro Linux Instance and it crashed during some "light" usage a few days ago. I am running an app that uploads photos to the server. We had maybe 10 users uploading multiple photos during a 1 hour period.
At some point, the server stopped responding. I logged into the AWS Console and found that the "Intstance Reachability Check" had failed. I rebooted the server, restarted PHP and MySQL, and then had to repair a few MySQL tables that had been corrupted.
I had the monitoring tools turned on and the CPU Usage indicates that we topped out at 28% CPU Usage - After reading some more documentation about Micro Instances, I do not think we maxed out on the CPU, but I could be wrong.
I don't know a enough to understand what the logs mean. I have found what I believe to be the logs from the server from the time where the problem occurred, I am hoping that someone can help me decipher what happened:
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157082] httpd
invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157100] httpd
cpuset=/ mems_allowed=0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157108] Pid: 1824,
comm: httpd Not tainted 2.6.35.14-97.44.amzn1.x86_64 #1
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157117] Call Trace:
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157131]
[] ? cpuset_print_task_mems_allowed+0x98/0xa0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157143]
[] dump_header.clone.1+0x77/0x1a0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157153]
[] ? _raw_spin_unlock_irqrestore+0x19/0x20
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157164]
[] ? ___ratelimit+0x9f/0x120
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157172]
[] oom_kill_process.clone.0+0x76/0x140
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157185]
[] __out_of_memory+0x118/0x190
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157196]
[] out_of_memory+0x82/0x1c0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157204]
[] __alloc_pages_nodemask+0x689/0x6a0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157215]
[] alloc_pages_current+0x94/0xf0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157223]
[] __page_cache_alloc+0x7f/0x90
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157232]
[] __do_page_cache_readahead+0xc0/0x200
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157240]
[] ra_submit+0x1c/0x20
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157248]
[] filemap_fault+0x3e3/0x430
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157257]
[] __do_fault+0x4f/0x4b0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157265]
[] handle_mm_fault+0x1b4/0xb40
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157275]
[] ? xen_clocksource_read+0x21/0x30
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157283]
[] ? xen_clocksource_get_cycles+0x9/0x10
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157293]
[] ? ktime_get_ts+0xa3/0xd0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157302]
[] do_page_fault+0x112/0x310
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157310]
[] page_fault+0x25/0x30
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157316] Mem-Info:
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157320] Node 0 DMA
per-cpu:
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157327] CPU 0:
hi: 0, btch: 1 usd: 0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157332] Node 0
DMA32 per-cpu:
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157339] CPU 0:
hi: 186, btch: 31 usd: 30Jul 23 00:19:07 ip-10-117-66-219 kernel:
[19699669.157347] active_anon:70901 inactive_anon:71000
isolated_anon:0Jul 23 00:19:07 ip-10-117-66-219 kernel:
[19699669.157348] active_file:13 inactive_file:28 isolated_file:0Jul
23 00:19:07 ip-10-117-66-219 kernel: [19699669.157349] unevictable:0
dirty:4 writeback:0 unstable:0Jul 23 00:19:07 ip-10-117-66-219 kernel:
[19699669.157350] free:1374 slab_reclaimable:613
slab_unreclaimable:1843
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157351] mapped:25
shmem:34 pagetables:4842 bounce:0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157430] Node 0 DMA
free:2436kB min:76kB low:92kB high:112kB active_anon:6104kB
inactive_anon:6272kB active_file:0kB inactive_file:0kB unevictable:0kB
isolated(anon):0kB isolated(file):0kB present:15772kB mlocked:0kB
dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB
slab_unreclaimable:0kB kernel_stack:0kB pagetables:20kB unstable:0kB
bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157459]
lowmem_reserve[]: 0 590 590 590
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157475] Node 0
DMA32 free:3060kB min:3068kB low:3832kB high:4600kB
active_anon:277500kB inactive_anon:277728kB active_file:52kB
inactive_file:112kB unevictable:0kB isolated(anon):0k B
isolated(file):0kB present:604988kB mlocked:0kB dirty:16kB
writeback:0kB mapped:100kB shmem:136kB slab_reclaimable:2452kB
slab_unreclaimable:7372kB kernel_stack:752kB pagetables:19348kB
unstable:0kB bounce:0kB writeback_tmp:0kB pa ges_scanned:256
all_unreclaimable? yes
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157503]
lowmem_reserve[]: 0 0 0 0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157518] Node 0 DMA:
1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 1*128kB 1*256kB 0*512kB 0*1024kB
1*2048kB 0*4096kB = 2436kB
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157553] Node 0
DMA32: 729*4kB 4*8kB 5*16kB 1*32kB 0*64kB 0*128kB 0*256kB 0*512kB
0*1024kB 0*2048kB 0*4096kB = 3060kB
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157587] 80 total
pagecache pages
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157592] 0 pages in
swap cache
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157597] Swap cache
stats: add 0, delete 0, find 0/0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157602] Free swap
= 0kB
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.157607] Total swap
= 0kB
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.158050] 157439
pages RAM
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.158050] 4626 pages
reserved
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.158050] 38855 pages
shared
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.158050] 149663
pages non-shared
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.158050] Out of
memory: kill process 2135 (httpd) score 87500 or a child
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.158050] Killed
process 2135 (httpd) vsz:350000kB, anon-rss:49376kB, file-rss:0kB
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.164960] httpd
invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.164990] httpd
cpuset=/ mems_allowed=0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.164998] Pid: 1824,
comm: httpd Not tainted 2.6.35.14-97.44.amzn1.x86_64 #1
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165006] Call Trace:
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165020]
[] ? cpuset_print_task_mems_allowed+0x98/0xa0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165032]
[] dump_header.clone.1+0x77/0x1a0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165042]
[] ? _raw_spin_unlock_irqrestore+0x19/0x20
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165053]
[] ? ___ratelimit+0x9f/0x120
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165061]
[] oom_kill_process.clone.0+0x76/0x140
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165069]
[] __out_of_memory+0x118/0x190
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165077]
[] out_of_memory+0x82/0x1c0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165086]
[] __alloc_pages_nodemask+0x689/0x6a0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165096]
[] alloc_pages_current+0x94/0xf0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165105]
[] __page_cache_alloc+0x7f/0x90
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165114]
[] __do_page_cache_readahead+0xc0/0x200
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165122]
[] ra_submit+0x1c/0x20
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165129]
[] filemap_fault+0x3e3/0x430
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165139]
[] __do_fault+0x4f/0x4b0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165147]
[] handle_mm_fault+0x1b4/0xb40
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165156]
[] ? xen_clocksource_read+0x21/0x30
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165164]
[] ? xen_clocksource_get_cycles+0x9/0x10
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165178]
[] ? ktime_get_ts+0xa3/0xd0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165187]
[] do_page_fault+0x112/0x310
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165195]
[] page_fault+0x25/0x30
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165201] Mem-Info:
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165205] Node 0 DMA
per-cpu:
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165212] CPU 0:
hi: 0, btch: 1 usd: 0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165217] Node 0
DMA32 per-cpu:
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165224] CPU 0:
hi: 186, btch: 31 usd: 130
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165232]
active_anon:70836 inactive_anon:71000 isolated_anon:0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165233]
active_file:13 inactive_file:28 isolated_file:0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165234]
unevictable:0 dirty:4 writeback:0 unstable:0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165235] free:1343
slab_reclaimable:613 slab_unreclaimable:1843
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165236] mapped:25
shmem:34 pagetables:4842 bounce:0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165262] Node 0 DMA
free:2436kB min:76kB low:92kB high:112kB active_anon:6104kB
inactive_anon:6272kB active_file:0kB inactive_file:0kB unevictable:0kB
isolated(anon):0kB isolated(file):0kB present:15772kB mlocked:0kB
dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB
slab_unreclaimable:0kB kernel_stack:0kB pagetables:20kB unstable:0kB
bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165290]
lowmem_reserve[]: 0 590 590 590
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165305] Node 0
DMA32 free:2936kB min:3068kB low:3832kB high:4600kB
active_anon:277240kB inactive_anon:277728kB active_file:52kB
inactive_file:112kB unevictable:0kB isolated(anon):0kB
isolated(file):0kB present:604988kB mlocked:0kB dirty:16kB
writeback:0kB mapped:100kB shmem:136kB slab_reclaimable:2452kB
slab_unreclaimable:7372kB kernel_stack:752kB pagetables:19348kB
unstable:0kB bounce:0kB writeback_tmp:0kB pa ges_scanned:256
all_unreclaimable? yes
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165334]
lowmem_reserve[]: 0 0 0 0Jul 23 00:19:07 ip-10-117-66-219 kernel:
[19699669.165348] Node 0 DMA: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 1*128kB
1*25 6kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 2436kBJul 23 00:19:07
ip-10-117-66-219 kernel: [19699669.165420] Node 0 DMA32: 710*4kB 4*8kB
4*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB
0*4096kB = 2936kB
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165648] 80 total
pagecache pages
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165648] 0 pages in
swap cache
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165648] Swap cache
stats: add 0, delete 0, find 0/0
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165648] Free swap
= 0kB
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165648] Total swap
= 0kB
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165648] 157439
pages RAM
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165648] 4626 pages
reserved
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165648] 37922 pages
shared
Jul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165648] 149594
pages non-sharedJul 23 00:19:07 ip-10-117-66-219 kernel:
[19699669.165648] Out of memory: kill process 1859 (httpd) score 87366
or a childJul 23 00:19:07 ip-10-117-66-219 kernel: [19699669.165648]
Killed process 1859 (httpd) vsz:349464kB, anon-rss:49312k B,
file-rss:0kB
Please let me know if there is anything additional that I can provide.
Answer
The Linux out-of-memory killer felt obliged to free some RAM. Which it then attempted to do (httpd being the logged target). However, it looks like it was not successful enough and ended up locking up hard, which would not get logged since the server was out of RAM and thus didn't have memory for log-buffers.
Comments
Post a Comment