linux - Growing OS cache in RAM causing high system CPU usage

I'm having a weird issue with a server that I've never
seen before. The machine has ~30G of RAM and runs an application that takes ~10G (spread
across hundreds of processes). Over time the OS fills the spare RAM with
cache and buffers (totally normal for Linux). I've seen this happen before without any
problems, but on this machine, as the amount of free RAM decreases, system CPU
goes crazy (100% across 8 CPUs for ~3 minutes) at about the 256M mark. I'm guessing the
OS is using all that CPU to shuffle memory around to get some free space
back.



From what I understand about Linux memory
management, it's supposed to use as much free RAM as it can for OS-level caching
and then hand that memory over to any application that asks for it. In my past
experience this hasn't been traumatic for the CPU; it happens all the time.
So why could it be different here?
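
For a rough sense of how much RAM the kernel could in principle hand back, a minimal Python sketch like the following sums free memory with buffers and page cache from /proc/meminfo-style text. The function names are made up for illustration, and the sample values are simply the free/buff/cache figures from the first vmstat sample in this post. The total is only an upper bound, since not every cached page can be dropped on demand.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def memory_summary(info):
    """Return free memory, buffers + page cache, and their sum (all in kB)."""
    free_kb = info["MemFree"]
    cache_kb = info["Buffers"] + info["Cached"]
    return {
        "free_kb": free_kb,
        "cache_kb": cache_kb,
        "free_plus_cache_kb": free_kb + cache_kb,
    }


# Illustrative input only: values copied from the first vmstat sample.
SAMPLE = """\
MemFree:          293876 kB
Buffers:         5022848 kB
Cached:         18797528 kB
"""

summary = memory_summary(parse_meminfo(SAMPLE))
for key, kb in summary.items():
    print(f"{key}: {kb / 1024 / 1024:.1f} GiB")
```

On a live machine the same functions can be fed the contents of /proc/meminfo instead of SAMPLE.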



I'm attaching
a small portion of the vmstat output for the related metrics (captured every 2 secs).
You can see where the system CPU (sy, the 14th column, 3rd from the right) starts getting busy
when the free memory hits ~256M and then gets really crazy about 30 secs
later.



 r  b swpd   free    buff    cache si so  bi   bo    in    cs us sy id wa
 1  0    0 293876 5022848 18797528  0  0 206 1712 20924 12845 29  9 61  1
 6  0    0 285324 5022848 18797656  0  0   0    0 18795 11382 23  9 68  0
 2  0    0 292320 5022848 18797916  0  0  26 2022 19933 12068 27 10 62  1
 3  0    0 264492 5022848 18798196  0  0  14    0 20705 15412 30  9 61  0
 3  0    0 254880 5022848 18798804  0  0 190  532 16207  9723 31  8 60  0
17  0    0 255588 5021292 18783092  0  0  24    2 13521  7471 27 42 31  0
 3  0    0 288396 5020536 18771496  0  0   0    2 14277  8458 24 29 47  0
 4  0    0 299560 5020180 18761296  0  0   0  448  8778  5099 21 30 49  0
 2  0    0 290908 5019376 18753656  0  0   0    2  9027  5115 27 19 54  0
 7  0    0 306060 5018544 18746740  0  0  38  442  8398  5134 20 17 63  0
 1  0    0 317140 5018244 18744252  0  0  46    0  9707  5822 22 17 61  0
 4  0    0 282268 5017748 18741836  0  0  12    2 10203  6165 26 12 62  0
 1  0    0 322548 5017500 18738024  0  0   2  444 10593  6277 23 16 61  0
 4  0    0 314936 5017280 18734564  0  0   6    8  9473  5680 25 15 61  0
13  0    0 316976 5017044 18731128  0  0   0  622 12481  7353 33 17 49  0
 5  0    0 324952 5016908 18728552  0  0  10  222 11071  6965 22 13 65  0
 2  0    0 324692 5016908 18728344  0  0   0  526 10612  6602 24 10 66  0
 3  0    0 312312 5017136 18727644  0  0 156 1050 12316  7472 26 10 63  1
 2  1    0 323392 5017260 18726848  0  0  66   26 11643  7152 23 13 64  0
 8  1    0 318956 5017124 18723772  0  0  20  518 17042  9543 31 22 46  1
 1  0    0 317816 5017124 18725428  0  0   0 2854 11704  6951 21  9 67  3
18  0    0 325136 5014492 18707212  0  0   0   32  7619  3845 16 58 27  0
46  0    0 323508 5012980 18692036  0  0   0  562  3939   917  3 92  5  0
71  0    0 299164 5009680 18675476  0  0   0    6  4696  1304  8 90  1  0
75  0    0 205364 5007744 18657228  0  0  36  340  6699  2556 18 82  0  0
75  0    0 221660 5005956 18636480  0  0  68    0  3942   943  4 95  0  0
84  0    0 223788 5004624 18618380  0  0   0    0  2843   335  3 97  1  0
44  0    0 214956 5002464 18599872  0  0   0    0  4696  1301  5 92  3  0
37  0    0 223804 4999964 18577076  0  0   0    0  3281   521  1 98  0  0
82  0    0 266888 4995768 18557264  0  0   0 1760  4595   766  4 96  1  0
91  0    0 260148 4993964 18541192  0  0   0    0  3780   866  6 94  0  0
74  0    0 279796 4990464 18524980  0  0   0    4  4096   926  4 96  0  0
44  0    0 274796 4984268 18503492  0  0   0    0  6316  2142  3 95  3  0
48  0    0 295616 4981824 18482616  0  0   0    0  2561   227  1 99  1  0
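
To pick the busy samples out of a longer capture without eyeballing it, something like the following Python sketch could be used. It assumes the 16-column vmstat layout shown above; the function names and the 40% threshold are arbitrary choices for illustration, and the embedded sample is just four rows copied from the capture.

```python
# Column names in the order vmstat printed them in this capture.
COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa".split()


def parse_vmstat(text):
    """Turn vmstat output into a list of dicts, one per complete sample row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and anything that is not a full numeric row.
        if len(fields) != len(COLUMNS) or not all(f.isdigit() for f in fields):
            continue
        rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


def high_sys_samples(rows, sy_threshold=40):
    """Return (index, free_kb, sy, run_queue) for rows with sy >= threshold."""
    return [
        (i, row["free"], row["sy"], row["r"])
        for i, row in enumerate(rows)
        if row["sy"] >= sy_threshold
    ]


# Four rows copied from the capture above, header included.
SAMPLE = """\
r b swpd free buff cache si so bi bo in cs us sy id wa
3 0 0 254880 5022848 18798804 0 0 190 532 16207 9723 31 8 60 0
17 0 0 255588 5021292 18783092 0 0 24 2 13521 7471 27 42 31 0
3 0 0 288396 5020536 18771496 0 0 0 2 14277 8458 24 29 47 0
46 0 0 323508 5012980 18692036 0 0 0 562 3939 917 3 92 5 0
"""

for index, free_kb, sy, run_queue in high_sys_samples(parse_vmstat(SAMPLE)):
    print(f"sample {index}: free={free_kb / 1024:.0f}M sy={sy}% r={run_queue}")
```

With a 2-second capture interval, multiplying the sample index by 2 gives the offset in seconds from the start of the capture.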


I'm also including a
screenshot from the monitoring tool to show more visually what's happening with the
memory. In this graph, the bottom (purple) line is the actual free space left in RAM, and
every time it reaches 256M it causes a CPU
spike.



[Image: memory usage graph from the monitoring tool, https://i.stack.imgur.com/D4cVM.png]




BTW, swap is disabled on
this machine (if you couldn't tell from the
vmstat output).




  • Linux is 3.11.0, Ubuntu 13.10

  • Not a Java application, it's PHP/Apache
