HeapDumps: What are they? Why would I want one? How do I generate them? How do I view one?
In response to my last post a colleague said:
Good post…..how did you create the HeapDump, and how did you view it?
Good questions.
A HeapDump is simply a dump of the live objects in a JVM’s heap at a given point in time. HeapDumps are usually used to debug memory leaks or to find the big memory consumers within a JVM. While most JVMs provide the ability to create HeapDumps, the generation techniques vary from JRE to JRE.
IBM’s JRE has many options that control the generation and format of its HeapDumps; for all the details, take a look at the diagnostic guide. By default a HeapDump is only produced when a java.lang.OutOfMemoryError is hit. This can be seen by executing “java -Xdump:what”
gissel@uw-t60p:~/Workings/leak> java -Xdump:what
Registered dump agents
———————-
dumpFn=doSystemDump
events=gpf+abort
filter=
label=/home/gissel/Workings/leak/
core.%Y%m%d.%H%M%S.%pid.dmp
range=1..0
priority=999
request=serial
opts=
———————-
dumpFn=doSnapDump
events=gpf+abort
filter=
label=/home/gissel/Workings/leak/
Snap%seq.%Y%m%d.%H%M%S.%pid.trc
range=1..0
priority=500
request=serial
opts=
———————-
dumpFn=doSnapDump
events=uncaught
filter=java/lang/OutOfMemoryError
label=/home/gissel/Workings/leak/
Snap%seq.%Y%m%d.%H%M%S.%pid.trc
range=1..4
priority=500
request=serial
opts=
———————-
dumpFn=doHeapDump
events=uncaught
filter=java/lang/OutOfMemoryError
label=/home/gissel/Workings/leak/
heapdump.%Y%m%d.%H%M%S.%pid.phd
range=1..4
priority=40
request=exclusive+prepwalk
opts=PHD
———————-
dumpFn=doJavaDump
events=gpf+user+abort
filter=
label=/home/gissel/Workings/leak/
javacore.%Y%m%d.%H%M%S.%pid.txt
range=1..0
priority=10
request=exclusive
opts=
———————-
dumpFn=doJavaDump
events=uncaught
filter=java/lang/OutOfMemoryError
label=/home/gissel/Workings/leak/
javacore.%Y%m%d.%H%M%S.%pid.txt
range=1..4
priority=10
request=exclusive
opts=
———————-
This is a good setting when running in a production environment, but not the best for debugging. When debugging I usually like to take several HeapDumps during a test, as the heap grows, to see what type of object is growing. This is best done by sending a SIGQUIT signal to the JVM process. To enable the creation of HeapDumps via the SIGQUIT signal, set the JVM option “-Xdump:heap”. The total effect of this option can be seen by executing “java -Xdump:heap:?”
gissel@uw-t60p:~/zero-1.0.0.P20070712-1334/apps/employee.demo> java -Xdump:heap:?
JVMDUMP000E Dump Option unrecognised: -Xdump:heap:…?
Capture raw heap image:
-Xdump:heap[:defaults][:<option>=<value>, ...]
Dump options:
events=<name> Trigger dump on named events
[+<name>...] (see -Xdump:events)
filter=[*]<name>[*] Filter on class (for load,throw,catch,uncaught)
#<n>..<m> Filter on exit code (for vmstop)
file=<label> Output file
range=<n>..<m> Limit dumps
priority=<n> Highest first
request=<name> Request additional VM actions
[+<name>...] (see -Xdump:request)
opts=PHD|CLASSIC
Default -Xdump:heap settings:
events=gpf+user
filter=
file=/home/gissel/zero-1.0.0.P20070712-1334/apps/employee.demo/
heapdump.%Y%m%d.%H%M%S.%pid.phd
range=1..0
priority=40
request=exclusive+prepwalk
opts=PHD
Could not create the Java virtual machine.
An interesting option is “opts”. As seen above, the default “opts” value is PHD. PHD is actually the format of the HeapDump; the other options are CLASSIC (text file) and PHD+CLASSIC (produces two files for each requested HeapDump, one in PHD and the other in TXT format). The HeapDump file format is important because some of the tools can read only PHD or only TXT format.
The tool I use to view HeapDumps varies depending upon which tool is most up-to-date. Currently, the first tool I use to get a general feel for what is in the heap is HeapAnalyzer. I also like HeapRoots.
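Taking several HeapDumps as the heap grows, as described above, is easy to script. The following is only a minimal sketch, assuming a Unix-like system and a JVM that was started with -Xdump:heap so that SIGQUIT triggers a HeapDump. The function name `request_heap_dumps` and its parameters are hypothetical, chosen for illustration.

```python
import os
import signal
import time


def request_heap_dumps(pid, count, interval_s, send=os.kill, sleep=time.sleep):
    """Send SIGQUIT to `pid` `count` times, pausing `interval_s` seconds between signals.

    Assumption: the target JVM was started with -Xdump:heap, so each
    SIGQUIT asks it to write a HeapDump. Returns the number of signals sent.
    `send` and `sleep` are injectable so the loop can be exercised without a JVM.
    """
    sent = 0
    for i in range(count):
        send(pid, signal.SIGQUIT)
        sent += 1
        if i < count - 1:
            sleep(interval_s)  # give the heap time to grow before the next dump
    return sent
```

For example, `request_heap_dumps(jvm_pid, count=5, interval_s=30)` would request five dumps spread over two minutes; comparing them should show which type of object keeps growing.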
Am I leaking?
Try the following test. Download the attached code, extract the archive, execute ./runscript.groovy, and monitor the resident memory used by the script. In my tests the process footprint steadily grows until the script exits. This type of behavior is usually indicative of a memory leak; however, upon inspecting the script it doesn’t look like it should be leaking. What is going on?

A HeapDump taken during the script’s execution shows that one of org.codehaus.groovy.runtime.ReferenceMap’s HashMaps has grown to an enormous size, containing thousands of entries, each key being a reference to a groovy.lang.MetaClassImpl instance. It seems that every execution of shell.evaluate creates, among other things, a new MetaClassImpl object which is subsequently stored within the ReferenceMap. The longer the script runs, the larger the ReferenceMap grows.
This seems like a prototypical memory leak. MetaClass objects define behavior for a specific Java/Groovy class. Ironically, most of the classes for which the MetaClass objects exist are no longer in memory, so the MetaClass objects are pure garbage. Yet the MetaClass objects remain resident due to their existence in the ReferenceMap. A closer look at the HeapDump shows that ReferenceMap’s HashMap is a WeakHashMap whose keys are soft references. This means that the HashMap entries should eventually be collected. When the entries are collected is determined by the GC policy of the VM. By default the IBM JDK collects soft references every 32 garbage collection cycles. As it turns out, 32 cycles can take a very long time. Luckily the default soft reference policy can be overridden by supplying the -Xsoftrefthreshold parameter. Voila! Setting -Xsoftrefthreshold1 eliminates the exponential process growth, giving a maximum resident memory size of 39MB.
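Monitoring the resident memory of the script, as in the test above, can be done by hand with top, or sampled from a small script. Here is a rough sketch, assuming a Linux box where /proc/<pid>/status exposes a VmRSS line. The function names are hypothetical and this is not necessarily how the numbers in this post were collected.

```python
import time


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from the text of /proc/<pid>/status, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like: "VmRSS:     39104 kB"
            return int(line.split()[1])
    return None


def read_rss_kb(pid):
    """Read the current resident set size of `pid` in kB (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vmrss_kb(f.read())


def watch_rss(pid, samples, interval_s):
    """Collect `samples` RSS readings for `pid`, `interval_s` seconds apart."""
    readings = []
    for _ in range(samples):
        readings.append(read_rss_kb(pid))
        time.sleep(interval_s)
    return readings
```

A steadily rising list from `watch_rss` is the footprint growth described above; with a lower soft reference threshold the readings should level off instead. Note that `read_rss_kb` raises FileNotFoundError once the process has exited.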
How Fast is Fast Enough?
I mentioned that I don’t believe that Java has any true scripting languages because none of its languages are useful for System Administration. One of the specific flaws that all Java scripting languages share is general sluggishness. This raises the question: How fast is fast enough?

To answer the question “How fast is fast enough?” the question must be put into some context. In this discussion the context is system administration. Complex language tests like mandelbrot are not suitable when speaking in terms of system administration because, by their very nature, shell scripts are supposed to start up, perform a simple task, and shut down. A more appropriate test is measuring how quickly the language can perform an extremely simple task like adding 1+2. For this type of test my threshold for “fast enough” is the point at which I visually notice a lag between hitting enter on the keyboard and the script returning. For me, this equates to about 60ms. By this measure none of the Java scripting languages are fast enough.
Taking the question from the subjective to the objective, it is important to understand just how far Java’s scripting languages are from their contemporaries. So, I benchmarked start-up time for the following languages on my Thinkpad T60p with an Intel T2600 processor: BASH, python, perl, ruby, jython, jruby, and groovy. The results are listed below.
| Language | Start-up Time (ms) |
|---|---|
| BASH | 8 |
| python | 27 |
| perl | 7 |
| ruby | 10 |
| jython | 900 |
| jruby | 1900 |
| groovy | 1200 |
The numbers speak for themselves: Java’s scripting languages are at least an order of magnitude slower than their non-Java counterparts.
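For anyone who wants to repeat this kind of measurement, a start-up benchmark can be sketched roughly as follows. This is an assumption about how one might do it, not the exact method behind the table: the one-liner commands, the `startup_ms` and `benchmark` names, and the choice of the median over several runs are all illustrative. The timings also include the cost of spawning the process from Python, so absolute numbers will differ from those above.

```python
import shutil
import statistics
import subprocess
import sys
import time


def startup_ms(cmd, runs=5):
    """Return the median wall-clock time in ms to run `cmd` to completion."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=True)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


# Hypothetical one-liners that add 1+2; adjust to the interpreters installed locally.
COMMANDS = {
    "BASH": ["bash", "-c", "echo $((1+2))"],
    "python": [sys.executable, "-c", "print(1+2)"],
    "perl": ["perl", "-e", "print 1+2"],
    "ruby": ["ruby", "-e", "puts 1+2"],
    "jython": ["jython", "-c", "print(1+2)"],
    "jruby": ["jruby", "-e", "puts 1+2"],
    "groovy": ["groovy", "-e", "println 1+2"],
}


def benchmark(commands, runs=5):
    """Time every command whose interpreter is on the PATH; skip the rest."""
    results = {}
    for name, cmd in commands.items():
        if shutil.which(cmd[0]) is None:
            continue  # interpreter not installed
        results[name] = startup_ms(cmd, runs)
    return results
```

Calling `benchmark(COMMANDS)` returns a dictionary of language name to median milliseconds, which can be compared against the 60ms threshold discussed earlier.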
Are there any true scripting languages for the JVM?
Dynamic scripting is all the buzz in the programming world and the Java community seems determined to not be left behind. To keep pace with the likes of Ruby, Java is introducing dynamic languages at a rapid pace. Some of the more popular languages are: Jacl, Jython, BeanShell, Groovy, and Mozilla Rhino. There is even a JSR, 223, aimed at making scripting a first class citizen for the Java Platform.
Given the large number of dynamic languages for the Java developer to choose from, one wonders if any of them are true scripting languages. Not in my mind. For something to be a true scripting language it must be useful for System Administration. Perl, Python, Bash, C shell, Korn shell and Bourne shell (among others) all pass this test while the Java scripting languages do not. All the JVM based languages are missing two key elements to be useful for System Administration: lightning fast start-up and low level hooks to the Operating System.
Far be it from me to pose a problem without offering a solution. So, first we need to hook the JVM into the OS. Luckily there are several pre-built packages that get us most of the way there, including Jtux and Posix for Java. Now we need a lightning fast JVM. Hmm, that is a little harder to come by. Is 50% a passing grade?
Benchmarking Cold Start on SLES 10
Yesterday I began benchmarking an application’s cold start (after reboot) performance on SLES 10, and ran into an interesting problem. The results of my tests were varying wildly, between 2 and 3 seconds of start-up time and 10-14MB of memory usage. After doing some investigating I found that the “zmd” process (zmd /usr/lib/zmd/zmd.exe) was getting kicked off at init time and eating a bunch of my CPU (10-50%) during my tests. I subsequently disabled zmd via “chkconfig --level 345 novell-zmd off”, rebooted and life was good. The memory used on the box post start-up went from 107MB to 98MB and the machine was quiet after the init sequence finished. Most importantly, however, I am now able to get repeatable results.
As an aside, I realize that disabling the Zen Network Manager impairs my ability to update and add software to my system, but that raises the question: do I really need to be automatically notified of updates on my server machine? For me, the answer is no. Also, one wonders why Novell chose to write zmd in C#, an “exe” process running on my Linux box??

