I'm having a weird issue on a server, one that I've never seen before. The machine
has ~30G of RAM and runs an application that takes ~10G (spread across hundreds of
processes). Over time the OS fills up the spare RAM with cache and buffers (totally
normal for Linux). I've seen this happen before without any problems, but on this
machine, as the amount of free RAM decreases, system CPU goes crazy (100% across
8 CPUs for ~3 minutes) at about the 256M mark. I'm guessing the OS is using all that
CPU to shuffle memory around to get some free space back.

From what I understand about Linux memory management, it's supposed to use as much
free RAM as it can for OS-level caching, but then hand that memory over to any
application that asks for it. From past experience this hasn't been traumatic for
the CPU; it happens all the time. So why would it be different here?
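
To make the "cache gets handed back" idea concrete, here is a rough sketch that adds up free memory, buffers and page cache from `/proc/meminfo`-style text. The function names are my own, and the sample numbers are just borrowed from one vmstat row as illustrative values (vmstat's cache column does not map exactly onto the `Cached` field). Treat the sum as an optimistic upper bound only, since not everything counted as cache can be dropped (shared memory and dirty pages, for example).

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: kilobytes}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name:   12345 kB" style lines with a numeric value.
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def reclaimable_kb(info):
    """Optimistic upper bound: free + buffers + page cache, in kB."""
    return info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)


# Illustrative values only (assumption): copied from the free/buff/cache
# columns of a single vmstat sample, not from a real /proc/meminfo dump.
SAMPLE = """\
MemFree:          254880 kB
Buffers:         5022848 kB
Cached:         18798804 kB
"""

if __name__ == "__main__":
    # On a live box: parse_meminfo(open("/proc/meminfo").read())
    info = parse_meminfo(SAMPLE)
    print(f"free: {info['MemFree'] / 1024:.0f}M, "
          f"free+buffers+cache: {reclaimable_kb(info) / 1024 / 1024:.1f}G")
```

The point is only that "free" is tiny while the total the kernel could in principle give back is large, which is why I expected reclaim to be cheap.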

I'm attaching a small portion of the vmstat output for the related metrics (captured
every 2 secs). You can see where the system CPU (the sy column, 14th from the left,
3rd from the right) starts getting busy when the free memory hits ~256M and then gets
really crazy about 30 secs later.

 r b swpd   free    buff    cache si so  bi   bo    in    cs us sy id wa
 1 0    0 293876 5022848 18797528  0  0 206 1712 20924 12845 29  9 61  1
 6 0    0 285324 5022848 18797656  0  0   0    0 18795 11382 23  9 68  0
 2 0    0 292320 5022848 18797916  0  0  26 2022 19933 12068 27 10 62  1
 3 0    0 264492 5022848 18798196  0  0  14    0 20705 15412 30  9 61  0
 3 0    0 254880 5022848 18798804  0  0 190  532 16207  9723 31  8 60  0
17 0    0 255588 5021292 18783092  0  0  24    2 13521  7471 27 42 31  0
 3 0    0 288396 5020536 18771496  0  0   0    2 14277  8458 24 29 47  0
 4 0    0 299560 5020180 18761296  0  0   0  448  8778  5099 21 30 49  0
 2 0    0 290908 5019376 18753656  0  0   0    2  9027  5115 27 19 54  0
 7 0    0 306060 5018544 18746740  0  0  38  442  8398  5134 20 17 63  0
 1 0    0 317140 5018244 18744252  0  0  46    0  9707  5822 22 17 61  0
 4 0    0 282268 5017748 18741836  0  0  12    2 10203  6165 26 12 62  0
 1 0    0 322548 5017500 18738024  0  0   2  444 10593  6277 23 16 61  0
 4 0    0 314936 5017280 18734564  0  0   6    8  9473  5680 25 15 61  0
13 0    0 316976 5017044 18731128  0  0   0  622 12481  7353 33 17 49  0
 5 0    0 324952 5016908 18728552  0  0  10  222 11071  6965 22 13 65  0
 2 0    0 324692 5016908 18728344  0  0   0  526 10612  6602 24 10 66  0
 3 0    0 312312 5017136 18727644  0  0 156 1050 12316  7472 26 10 63  1
 2 1    0 323392 5017260 18726848  0  0  66   26 11643  7152 23 13 64  0
 8 1    0 318956 5017124 18723772  0  0  20  518 17042  9543 31 22 46  1
 1 0    0 317816 5017124 18725428  0  0   0 2854 11704  6951 21  9 67  3
18 0    0 325136 5014492 18707212  0  0   0   32  7619  3845 16 58 27  0
46 0    0 323508 5012980 18692036  0  0   0  562  3939   917  3 92  5  0
71 0    0 299164 5009680 18675476  0  0   0    6  4696  1304  8 90  1  0
75 0    0 205364 5007744 18657228  0  0  36  340  6699  2556 18 82  0  0
75 0    0 221660 5005956 18636480  0  0  68    0  3942   943  4 95  0  0
84 0    0 223788 5004624 18618380  0  0   0    0  2843   335  3 97  1  0
44 0    0 214956 5002464 18599872  0  0   0    0  4696  1301  5 92  3  0
37 0    0 223804 4999964 18577076  0  0   0    0  3281   521  1 98  0  0
82 0    0 266888 4995768 18557264  0  0   0 1760  4595   766  4 96  1  0
91 0    0 260148 4993964 18541192  0  0   0    0  3780   866  6 94  0  0
74 0    0 279796 4990464 18524980  0  0   0    4  4096   926  4 96  0  0
44 0    0 274796 4984268 18503492  0  0   0    0  6316  2142  3 95  3  0
48 0    0 295616 4981824 18482616  0  0   0    0  2561   227  1 99  1  0
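
In case it helps anyone picking through the numbers, this is roughly how the busy samples can be pulled out of vmstat output. It's a minimal sketch: the function names and the 40% threshold are arbitrary choices of mine, and the embedded sample is just four rows copied from the capture above.

```python
def parse_vmstat(text):
    """Parse vmstat output into a list of dicts keyed by column name."""
    rows = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":
            # Column-name line ("r b swpd free ..."); may repeat in long captures.
            header = fields
            continue
        if header is None or len(fields) != len(header):
            # Skips the "procs ---memory---" banner and any partial lines.
            continue
        try:
            rows.append(dict(zip(header, map(int, fields))))
        except ValueError:
            continue
    return rows


def busy_samples(rows, sy_threshold=40):
    """Return (index, free_kb, sy) for samples with system CPU % >= threshold."""
    return [(i, r["free"], r["sy"])
            for i, r in enumerate(rows) if r["sy"] >= sy_threshold]


# Four rows copied from the capture in this post.
SAMPLE = """\
 r b swpd   free    buff    cache si so  bi   bo    in    cs us sy id wa
 3 0    0 264492 5022848 18798196  0  0  14    0 20705 15412 30  9 61  0
 3 0    0 254880 5022848 18798804  0  0 190  532 16207  9723 31  8 60  0
17 0    0 255588 5021292 18783092  0  0  24    2 13521  7471 27 42 31  0
46 0    0 323508 5012980 18692036  0  0   0  562  3939   917  3 92  5  0
"""

if __name__ == "__main__":
    for i, free_kb, sy in busy_samples(parse_vmstat(SAMPLE)):
        print(f"sample {i}: free={free_kb / 1024:.0f}M sy={sy}%")
```

Because the columns are looked up by header name rather than position, the same sketch should cope with vmstat builds that print extra columns.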

I'm also including a screenshot from the monitoring tool to show more visually what's
happening with the memory. In this graph, the bottom (purple) line is the actual free
space left in RAM, and every time it reaches 256M it causes a CPU spike.

[Screenshot: https://i.stack.imgur.com/D4cVM.png]

BTW, swap is disabled on this machine (if you couldn't tell from the vmstat output).

- Linux is 3.11.0, Ubuntu 13.10
- Not a Java application, it's PHP/Apache