
ubuntu - Non-heap memory leak JVM




I have GlassFish v4.0 set up on an Ubuntu server running on the Oracle Java virtual machine, and the JVM process's resident memory size (obtained via the "top" command) grows until the JVM no longer has memory to create a new thread.



What I have:




  • VPS server with 1 GB of RAM and a 1.4 GHz processor (1 core)

  • Ubuntu Server 12.04

  • Java(TM) SE Runtime Environment (build 1.7.0_51-b13)

  • Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)

  • Glassfish v4.0 running my Java EE webapp


  • The VM runs with the following parameters:
    -XX:MaxPermSize=200m
    -XX:PermSize=100m
    -Xmx512m (I can add all of them if relevant)



What's the problem:



RAM usage (resident memory) grows all the time, by 10-100 MB per hour depending on usage, until the JVM cannot allocate native memory.
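To quantify that growth, the resident set size can be sampled straight from /proc instead of eyeballing "top". A Linux-only sketch; `$$` (this shell's own PID) stands in for the java PID:

```shell
# Print a timestamped resident-set-size sample for a process.
# Substitute the java PID for $$ and run this under cron or `watch`
# to chart the 10-100 MB/hour growth over time.
echo "$(date '+%F %T') $(grep VmRSS /proc/$$/status)"
```

Logging these samples alongside the GC log makes it easy to tell whether the growth tracks heap activity or native allocations.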




What have I tried:




  • I've lowered the max heap space, which only buys time until the JVM crashes anyway

  • I've attached Plumbr (https://portal.plumbr.eu/), which does not detect any memory leak in the heap

  • I have also set the max perm size to a lower value.



I would like my JVM to be stable. By my measurements, heap space + perm gen take only 400-600 MB, while the "top" command shows the Java process growing to about 850 MB, at which point it dies. I know the JVM needs more memory than perm space and heap, but do you think I have still given too much memory to the heap space and perm gen?
Any help or guidance will be highly appreciated.




Log output: http://pakers.lv/logs/hs_err_pid970.log
All JVM flags: http://pakers.lv/logs/jvm_flags.txt
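For context, the JVM's footprint is more than heap plus perm gen. A back-of-the-envelope budget already lands near the observed ~850 MB; the thread count, code cache, and native overhead below are illustrative assumptions, not measured values:

```shell
# Rough JVM footprint estimate in MB. Heap and perm gen come from
# the flags above; the remaining numbers are assumed typical values.
HEAP=512
PERMGEN=200
CODE_CACHE=48      # assumed JIT code cache
THREADS=50         # assumed thread count
STACK_MB=1         # common 64-bit Linux default stack per thread
NATIVE=60          # assumed GC structures, NIO buffers, JNI, etc.
TOTAL=$((HEAP + PERMGEN + CODE_CACHE + THREADS * STACK_MB + NATIVE))
echo "${TOTAL} MB"   # prints "870 MB" with these numbers
```

With these assumed values the total already exceeds what a 1 GB VPS can give a single process once the OS and page cache take their share.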



Update



What else I tried (based on suggestions and my own findings):




  • I reduced and fixed the heap space at 256m, then increased it while the system remained stable; I found that the maximum I can afford on my system is a 512m heap and 128m of perm gen space. (-Xmx512m, -Xms512m, -XX:PermSize=128m, -XX:MaxPermSize=128m)


  • Reduced the Java thread stack size with -Xss256k; I was unable to reduce it below 218k (the JVM won't start)

  • Added -d64 so that the JVM runs in 64-bit mode

  • Added -XX:+AggressiveOpts (to enable performance optimizations), -XX:+UseCompressedOops (to reduce heap memory usage), and the -server flag to launch the JVM in server mode

  • As I have a very limited heap size, I modified NewRatio to give the tenured generation a bigger share of the heap (3/4 of the heap space): -XX:NewRatio=3

  • Added diagnostic GC options so that I can inspect OOM errors:
    -XX:+PrintTenuringDistribution
    -XX:+PrintGCDetails
    -XX:+PrintGCTimeStamps
    -XX:+HeapDumpOnOutOfMemoryError
    -Xloggc:/home/myuser/garbage.log
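The saving from the -Xss256k change above can be estimated by counting the process's threads (a Linux-only sketch; `$$`, this shell's own PID, stands in for the java PID):

```shell
# Each thread's stack shrinks from the common 1 MB 64-bit default
# to 256 kB, so the saving is roughly threads * 768 kB.
THREADS=$(ls /proc/$$/task | wc -l)
SAVED_KB=$((THREADS * (1024 - 256)))
echo "$THREADS threads, ~${SAVED_KB} kB saved"
```

For a few dozen threads this is tens of megabytes at most, which matches the answer's point that stack tuning alone won't close an ~850 MB gap.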




Current status
With these changes I have finally limited the resident memory (RAM usage) of the Java process, which was my target. In my case, 512m of heap space + 128m of perm gen space results in around 750m of resident memory for the Java process, which is stable. I still have memory problems - the heap fills up from time to time and causes the web app to freeze due to continuous garbage collection - but the OS no longer kills the process. So I now need to either increase the available memory (RAM) for the system or inspect the heap usage and lower my application's footprint. As my webapp is Java EE based (with EJB), I might not be able to reduce it significantly. Anyway, thanks for the suggestions, and feel free to share any others.


Answer



There are a few possibilities given what you've shared, for example:




  • a leaky JNI library, or,

  • a thread-creation leak, or


  • leaky dynamic code proxies (perm-gen leak),



but I can only guess because you didn't provide any log output or indicate whether the JVM was throwing an OutOfMemoryError (OOM) or whether some other fault was encountered. Nor did you mention which garbage collector is in use, though if the flags shown above are the only JVM options in use, it's the CMS collector.



The first step is to make the garbage collector's actions observable by adding these flags:



-XX:+PrintTenuringDistribution
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps

-XX:+HeapDumpOnOutOfMemoryError
-Xloggc:/path/to/garbage.log


If it is indeed an OOM, one can analyze the heap dump with VisualVM or a similar tool. I also use VisualVM to monitor GC activity in situ via JMX. Visibility into JVM internals can be enabled with these JVM flags:



-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=4231
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false




Update



The log indeed helps, thank you. That particular log shows that the JVM ran out of physical memory before it could grow the heap to its configured maximum. It tried to malloc ~77 MB when only ~63 MB of physical memory was left:





Native memory allocation (malloc) failed to allocate 77873152 bytes for committing reserved memory.



..



/proc/meminfo:
MemTotal: 1018724 kB
MemFree: 63048 kB
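That shortfall can be checked before launching the JVM. MemFree alone understates what's actually available, since page cache is reclaimable, so a rough Linux-only check is:

```shell
# Approximate reclaimable memory in kB: free pages plus page cache.
# On the box above this would have shown far too little headroom for
# the ~77 MB malloc the JVM attempted.
awk '$1=="MemFree:" || $1=="Cached:" {sum += $2}
     END {print sum " kB roughly available"}' /proc/meminfo
```

If this number is smaller than the committed-but-untouched part of the heap, the JVM can die exactly as in the log: the reservation succeeded at startup but committing it later failed.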





Here's what I would do:




  1. Reduce the heap so that it "fits" on the machine. Set the min and max heap to the same value so
    you can tell immediately whether it fits - the JVM won't start up if it
    doesn't.


  2. You could reduce the Java stack size (-Xss), but this app
    doesn't seem to be making a whole lot of threads, so the savings
    won't be more than a MB or two. I think the default for 64-bit Linux is 256k. Reduce it too much and it'll start OOM-ing on stack allocations.


  3. Repeat test.



  4. When it's been running under load for a short while, produce an
    on-demand heap dump for differential diagnosis using
    jmap -dump:file=path_to_file <pid>.


  5. One of two things should happen: (a) if there is a leak, it will eventually fail again,
    but the type of OOM ought to be different; or (b) there isn't a leak, so the GC will
    just work harder and you're done. Given that you've tried that before, the former case is likely, unless your reduced max size didn't fit either.


  6. If it does OOM, compare the two dumps to see what grew using jhat or some other heap analyzer.
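Steps 4 and 6 might look like the following in practice. This is a sketch, not a tested recipe: the PID lookup and file names are examples, and jhat is the analyzer that shipped with JDK 7:

```shell
# Take a baseline dump under load, wait while the suspected leak
# accumulates, then take a second dump and compare the two.
PID=$(pgrep -f glassfish | head -n 1)
jmap -dump:live,format=b,file=/tmp/baseline.hprof "$PID"
sleep 3600   # let the leak grow
jmap -dump:live,format=b,file=/tmp/later.hprof "$PID"
# jhat's -baseline option marks objects present in both dumps as
# "old", so anything flagged "new" in the report is what grew.
jhat -baseline /tmp/baseline.hprof /tmp/later.hprof
```

jhat then serves its report on a local HTTP port; VisualVM or Eclipse MAT can do the same comparison with a friendlier UI.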




Good luck!



