
linux - Growing OS cache in RAM causing high system CPU usage

I'm having a weird issue with a server that I've never seen before. The machine has ~30G of RAM and runs an application that takes ~10G (spread across hundreds of processes). Over time the OS fills the spare RAM with cache and buffers (totally normal for Linux). I've seen this happen before without any problems, but on this machine, once free RAM drops to about the 256M mark, the system CPU goes crazy (100% across 8 CPUs for ~3 minutes). I'm guessing the OS is using all that CPU to shuffle memory around to get some free space back.



From what I understand about Linux memory management, it's supposed to use as much free RAM as it can for OS-level caching and then hand that memory over to any application that asks for it. In my past experience this hasn't been traumatic for the CPU; it happens all the time. So why would it be different here?



I'm attaching a small portion of the vmstat output for the related metrics (captured every 2 secs). You can see where the system CPU (14th column, 3rd from the right) starts getting busy when the free memory hits ~256M and then gets really crazy about 30 secs later.



 r b swpd   free    buff    cache si so  bi   bo    in    cs us sy id wa
 1 0    0 293876 5022848 18797528  0  0 206 1712 20924 12845 29  9 61  1
 6 0    0 285324 5022848 18797656  0  0   0    0 18795 11382 23  9 68  0
 2 0    0 292320 5022848 18797916  0  0  26 2022 19933 12068 27 10 62  1
 3 0    0 264492 5022848 18798196  0  0  14    0 20705 15412 30  9 61  0
 3 0    0 254880 5022848 18798804  0  0 190  532 16207  9723 31  8 60  0
17 0    0 255588 5021292 18783092  0  0  24    2 13521  7471 27 42 31  0
 3 0    0 288396 5020536 18771496  0  0   0    2 14277  8458 24 29 47  0
 4 0    0 299560 5020180 18761296  0  0   0  448  8778  5099 21 30 49  0
 2 0    0 290908 5019376 18753656  0  0   0    2  9027  5115 27 19 54  0
 7 0    0 306060 5018544 18746740  0  0  38  442  8398  5134 20 17 63  0
 1 0    0 317140 5018244 18744252  0  0  46    0  9707  5822 22 17 61  0
 4 0    0 282268 5017748 18741836  0  0  12    2 10203  6165 26 12 62  0
 1 0    0 322548 5017500 18738024  0  0   2  444 10593  6277 23 16 61  0
 4 0    0 314936 5017280 18734564  0  0   6    8  9473  5680 25 15 61  0
13 0    0 316976 5017044 18731128  0  0   0  622 12481  7353 33 17 49  0
 5 0    0 324952 5016908 18728552  0  0  10  222 11071  6965 22 13 65  0
 2 0    0 324692 5016908 18728344  0  0   0  526 10612  6602 24 10 66  0
 3 0    0 312312 5017136 18727644  0  0 156 1050 12316  7472 26 10 63  1
 2 1    0 323392 5017260 18726848  0  0  66   26 11643  7152 23 13 64  0
 8 1    0 318956 5017124 18723772  0  0  20  518 17042  9543 31 22 46  1
 1 0    0 317816 5017124 18725428  0  0   0 2854 11704  6951 21  9 67  3
18 0    0 325136 5014492 18707212  0  0   0   32  7619  3845 16 58 27  0
46 0    0 323508 5012980 18692036  0  0   0  562  3939   917  3 92  5  0
71 0    0 299164 5009680 18675476  0  0   0    6  4696  1304  8 90  1  0
75 0    0 205364 5007744 18657228  0  0  36  340  6699  2556 18 82  0  0
75 0    0 221660 5005956 18636480  0  0  68    0  3942   943  4 95  0  0
84 0    0 223788 5004624 18618380  0  0   0    0  2843   335  3 97  1  0
44 0    0 214956 5002464 18599872  0  0   0    0  4696  1301  5 92  3  0
37 0    0 223804 4999964 18577076  0  0   0    0  3281   521  1 98  0  0
82 0    0 266888 4995768 18557264  0  0   0 1760  4595   766  4 96  1  0
91 0    0 260148 4993964 18541192  0  0   0    0  3780   866  6 94  0  0
74 0    0 279796 4990464 18524980  0  0   0    4  4096   926  4 96  0  0
44 0    0 274796 4984268 18503492  0  0   0    0  6316  2142  3 95  3  0
48 0    0 295616 4981824 18482616  0  0   0    0  2561   227  1 99  1  0
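
For anyone who wants to find the same pattern in a longer capture, here is a minimal Python sketch that parses vmstat rows and reports the first sample where system CPU crosses a threshold. The names `parse_vmstat`, `first_spike` and `Sample` are hypothetical helpers made up for this illustration, and the 40% threshold is an arbitrary assumption, not something vmstat defines. The embedded excerpt is four rows copied from the output above.

```python
from typing import List, NamedTuple, Optional

# Column order of the vmstat output shown above.
COLUMNS = ("r", "b", "swpd", "free", "buff", "cache", "si", "so",
           "bi", "bo", "in", "cs", "us", "sy", "id", "wa")


class Sample(NamedTuple):
    runnable: int   # "r": processes waiting for run time
    free_kb: int    # "free": idle memory in KB
    cache_kb: int   # "cache": page cache in KB
    sys_cpu: int    # "sy": percent of CPU time spent in the kernel


def parse_vmstat(text: str) -> List[Sample]:
    """Parse vmstat rows, skipping headers, blanks and malformed lines."""
    samples = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != len(COLUMNS) or not all(p.isdigit() for p in parts):
            continue
        row = dict(zip(COLUMNS, (int(p) for p in parts)))
        samples.append(Sample(row["r"], row["free"], row["cache"], row["sy"]))
    return samples


def first_spike(samples: List[Sample], sys_threshold: int = 40) -> Optional[int]:
    """Return the index of the first sample with sy >= sys_threshold, or None."""
    for index, sample in enumerate(samples):
        if sample.sys_cpu >= sys_threshold:
            return index
    return None


# Four consecutive rows copied from the capture above.
EXCERPT = """\
r b swpd free buff cache si so bi bo in cs us sy id wa
3 0 0 264492 5022848 18798196 0 0 14 0 20705 15412 30 9 61 0
3 0 0 254880 5022848 18798804 0 0 190 532 16207 9723 31 8 60 0
17 0 0 255588 5021292 18783092 0 0 24 2 13521 7471 27 42 31 0
3 0 0 288396 5020536 18771496 0 0 0 2 14277 8458 24 29 47 0
"""

if __name__ == "__main__":
    samples = parse_vmstat(EXCERPT)
    hit = first_spike(samples)
    if hit is not None:
        s = samples[hit]
        print(f"sample {hit}: sy={s.sys_cpu}% free={s.free_kb / 1024:.0f}M r={s.runnable}")
```

To run it on a full capture, pass the saved vmstat text to `parse_vmstat` instead of `EXCERPT`. Since samples are taken every 2 seconds, the returned index times two gives the offset in seconds from the start of the capture.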


I'm also including a screenshot from the monitoring tool to show more visually what's happening with the memory. In this graph, the bottom (purple) line is the actual free space left in RAM, and every time it reaches 256M there is a CPU spike.



[Screenshot: memory usage graph from the monitoring tool]




BTW, swap is disabled on this machine (if you couldn't tell from the vmstat output).




  • Kernel is Linux 3.11.0, on Ubuntu 13.10

  • Not a Java application, it's PHP/Apache
