In a previous blog entry, I showed you a map of CVM. If you are a VM engineer (or someone who is porting the VM) and need to do some debugging, navigating all those data structures can be pretty daunting. How do the CVM engineers do it?
History
Since the very early days, CVM has been built with a bunch of utility functions that allow us to dump information about certain commonly used VM data structures. Two very popular examples of these are:
1. CVMconsolePrintf(), and
2. CVMdumpStack()
CVMconsolePrintf() is just like printf except that it adds some nice formatting options like %O, %C, %M, and %F, which print the details of a CVMObject *, CVMClassBlock *, CVMMethodBlock *, and CVMFieldBlock * respectively. There are more, but this gives you the idea. CVMdumpStack() is used to dump the contents of the Java stack. The CVM engineers would call these utility functions from the gdb command prompt at runtime to get live information about the state of the VM and its data structures.
However, there is a problem with using these utility functions: you need to be careful how you use them. For example, if you use CVMconsolePrintf("%C", ...) with a pointer that is not actually a CVMClassBlock *, then you may inadvertently cause a segfault that will crash the VM. That would mean losing all the debugging state of the bug that you have spent hours or days reproducing.
Can't we get the VM utilities to do all the careful checks for us automatically, so that we don't make fools of ourselves by making the wrong call at the wrong time?
The VM Inspector
Why, yes we can ...well, to a certain extent at least. The title of this blog entry suggests that there is a module inside CVM called the VM Inspector. Actually, it isn't quite as grandiose as that.
The VM Inspector is actually just a collection of those utilities that already existed, plus wrappers around some of them to make them safe. It also includes some additional useful utilities that aren't normally available in a production VM build.
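To make the "safe wrapper" idea concrete, here is a minimal sketch of the validate-before-you-dump pattern. It is not CVM code: the class SafeDumpSketch, its registry of known classblock addresses, and the message formats are all made up for illustration. The real inspector wrappers are native code and may validate pointers in a completely different way.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of a "validate before you dump" wrapper.
 * Nothing here is CVM code; the registry stands in for whatever
 * bookkeeping a VM uses to recognize valid classblock addresses.
 */
final class SafeDumpSketch {
    // Known classblock addresses mapped to class names (illustrative data only).
    private final Map<Long, String> knownClassBlocks = new HashMap<>();

    /** Records an address as a valid classblock for the purposes of this sketch. */
    void register(long address, String className) {
        knownClassBlocks.put(address, className);
    }

    /** Returns a dump line, or an error message instead of touching a bad address. */
    String dumpClassBlock(long address) {
        String name = knownClassBlocks.get(address);
        if (name == null) {
            return "Error: 0x" + Long.toHexString(address) + " is not a valid classblock";
        }
        return "classblock 0x" + Long.toHexString(address) + ": " + name;
    }

    public static void main(String[] args) {
        SafeDumpSketch dumper = new SafeDumpSketch();
        dumper.register(0x2e5d54L, "java.lang.String");
        System.out.println(dumper.dumpClassBlock(0x2e5d54L)); // registered address
        System.out.println(dumper.dumpClassBlock(0xdeadL));   // unknown address, reported as an error
    }
}
```

The point of the pattern is that a bad argument costs you an error message rather than the debugging session.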
To use the VM inspector utilities, build CVM with CVM_INSPECTOR=true. CVM_INSPECTOR is set to true by default when you build with CVM_DEBUG=true, but you can also enable it in a non-debug build without having to pull in all the other debug code in the system.
After that, you can start CVM in a gdb session, and call some of the inspector functions from the gdb command prompt. Alternatively, you can call them from modified VM code. For a list of the available functions, check out src/share/javavm/include/inspector.h in the phoneME Advanced VM codebase.
However, with that said, you still don't know how to use these utility functions properly. But rather than writing a big user's manual for you here, let's talk about the cvmsh utility shell instead. You'll be able to get an idea of how to use these functions based on how they are used in cvmsh (see below).
The cvmsh shell
cvmsh is a shell program written in the Java programming language that is intended to help with diagnostics and inspection of VM internals when running applications. In one regard, it is a poor man's debugger/profiler. One use of it is to detect memory leaks when running applications. It can be used by application developers or class library developers who want to see what happens to the VM state when Java code is executed. Because cvmsh is a Java program, its purpose is not to debug bugs that crash the VM.
One advantage of using cvmsh instead of calling the inspector or other CVM utility functions directly from gdb is that it is a lot more user friendly. User friendly in the sense that it is very difficult for you to accidentally crash the VM from cvmsh. It isn't user friendly in terms of presenting you with a fancy GUI with nice graphics. It's a low tech tool that I whipped up in previous years. As I said earlier, it is a "poor man's debugger/profiler".
Here is a quick summary user's manual of cvmsh:
Building cvmsh
To build cvmsh, build CVM with CVM_INSPECTOR=true added to the make command line. CVM_INSPECTOR is true by default for CVM_DEBUG=true builds. However, it is possible to build CVM with CVM_INSPECTOR=true regardless of whether CVM_DEBUG=true or not.
The CVM_INSPECTOR=true option will add inspector code (Java and native) into the CVM binary (and possibly JAR files). One of these classes will be sun.misc.VMInspector. This class is intended for private use only.
cvmsh will be built into testclasses.zip.
How to run cvmsh?
cvmsh only works with CVM because it relies on APIs in sun.misc.VMInspector. It will not run with other VMs. At the OS command prompt, run:
> cvm -cp testclasses.zip cvmsh
How to use cvmsh?
cvmsh launches into a command prompt: '>'
At the command prompt, you can enter these commands:
help
prints a list of commands that can be used.
gc
requests a full GC cycle.
memstat
prints the current memory statistics of the VM.
enableGC
enables the GC if it was previously disabled.
Does nothing if GC is already enabled.
disableGC
disables the GC if it was previously enabled.
Does nothing if GC is already disabled.
NOTE: Disabling the GC can have the adverse effect of causing the VM to lock up. This is because GC cycles will be blocked until GC is re-enabled. Under this condition, even cvmsh may not be able to continue running for long. It depends on how much free memory remains for cvmsh's use without triggering a GC.
keepObjectsAlive true|false
forces the GC to keep all objects alive regardless of whether
they are reachable or not, or revert to normal GC behavior.
true: force GC to keep objects alive.
false: allow normal GC behavior to resume.
Dumpers:
print <object address>
invokes System.out.println() on the specified object.
NOTE: Can only be called while GC is disabled.
dumpObject <object address>
dumps the contents (class,size,fields,etc) of the specified
object.
NOTE: Can only be called while GC is disabled.
NOTE: Will report an error if the specified object is not a
valid object.
dumpClassBlock <classblock address>
dumps some info about the specified classblock.
NOTE: Can only be called while GC is disabled.
NOTE: Will report an error if the specified classblock is not a
valid classblock.
dumpObjectReferences <object address>
dumps a list of references to the specified object.
NOTE: Can only be called while GC is disabled.
NOTE: Will report an error if the specified object is not a
valid object.
NOTE: If there are no references, the list will be empty.
dumpClassReferences <classname>
dumps all references to all instances of the specified class.
NOTE: Can only be called while GC is disabled.
NOTE: If the specified class is not found, it will be reported
as not loaded.
dumpClassBlocks <classname>
dumps classblock addresses for the specified class.
NOTE: Can only be called while GC is disabled.
NOTE: If the specified class is not found, it will be reported
as not loaded.
dumpHeap [simple|verbose|stats]
dumps the heap in the specified format.
If format is not specified, the default format 'simple' will
be used.
simple: dumps the number of objects in the heap.
verbose: dumps the address of each object and their class.
stats: dumps statistics about objects in the heap.
The sizes of all objects are added up per class, and a list of each type of object (i.e. the class) is printed in order of decreasing total memory consumption. The more instances of a class, the more memory it will consume. The larger the instances, the more memory it will consume. The total consumption is a measure of bytes consumed by all instances of each class.
The statistics are organized in 3 columns:
Column 1: The total size in bytes of memory consumed by
instances of a class.
Column 2: The number of instances of that class.
Column 3: The class signature.
NOTE: Can be called with GC enabled or disabled.
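As a rough illustration of what the 'stats' format computes, the following sketch groups a made-up list of objects by class, sums their sizes, and sorts by decreasing total bytes before printing the three columns. The class HeapStatsSketch and its input arrays are hypothetical. The real dumpHeap walks the VM heap in native code, which this sketch does not attempt to reproduce.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the per-class aggregation behind a "stats" style heap report. */
final class HeapStatsSketch {
    /** One row of the report: total bytes, instance count, class name. */
    static final class Row {
        final String className;
        long totalBytes;
        int instances;

        Row(String className) {
            this.className = className;
        }
    }

    /** classNames[i] and sizes[i] describe the i-th object of a made-up heap. */
    static List<Row> summarize(String[] classNames, int[] sizes) {
        Map<String, Row> byClass = new LinkedHashMap<>();
        for (int i = 0; i < classNames.length; i++) {
            Row row = byClass.computeIfAbsent(classNames[i], Row::new);
            row.totalBytes += sizes[i];
            row.instances++;
        }
        List<Row> rows = new ArrayList<>(byClass.values());
        // Largest total consumption first, as in the description above.
        rows.sort((a, b) -> Long.compare(b.totalBytes, a.totalBytes));
        return rows;
    }

    public static void main(String[] args) {
        String[] classes = {"java.lang.String", "[C", "java.lang.String"};
        int[] sizes = {20, 48, 20};
        for (Row r : summarize(classes, sizes)) {
            // Column 1: total bytes, column 2: instance count, column 3: class.
            System.out.println(r.totalBytes + "\t" + r.instances + "\t" + r.className);
        }
    }
}
```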
Capturing and Comparing Heap States:
A heap state is a snapshot of all objects
that currently exist in the heap. The
objects are not copied into the snapshot.
Only their addresses are copied.
If a GC occurs and an object is moved,
its address in all the captured snapshots
will be updated accordingly to reflect
this movement.
If a GC occurs and an object is GCed, it will be marked as having been collected in all the snapshots.
NOTE: Hence, the contents of a snapshot
can change with each GC cycle due to object
movement or collection.
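The snapshot behavior described above (addresses only, updated on object movement, marked on collection) can be modeled with a small sketch. Everything here is an assumption made for illustration: the class HeapSnapshotSketch, and the objectMoved() and objectCollected() callbacks that a GC would have to invoke. The actual bookkeeping lives inside the VM and is not shown in this entry.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical model of a heap snapshot that stores addresses only.
 * Simplification: it ignores the case where a moved object lands on an
 * address that an already collected entry still occupies in the snapshot.
 */
final class HeapSnapshotSketch {
    // address -> TRUE if the object was collected after the snapshot was captured
    private final Map<Long, Boolean> entries = new LinkedHashMap<>();

    HeapSnapshotSketch(long[] liveAddresses) {
        for (long address : liveAddresses) {
            entries.put(address, Boolean.FALSE);
        }
    }

    /** Called (in this model) when the GC moves an object: the entry follows the object. */
    void objectMoved(long from, long to) {
        Boolean collected = entries.remove(from);
        if (collected != null) {
            entries.put(to, collected);
        }
    }

    /** Called (in this model) when the GC reclaims an object: the entry is kept but marked. */
    void objectCollected(long address) {
        if (entries.containsKey(address)) {
            entries.put(address, Boolean.TRUE);
        }
    }

    boolean contains(long address) {
        return entries.containsKey(address);
    }

    boolean isCollected(long address) {
        return Boolean.TRUE.equals(entries.get(address));
    }
}
```

This is why the contents of a snapshot are not frozen: the set of objects is fixed at capture time, but what the snapshot says about each of them keeps tracking the GC.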
Examples of how heap snapshots can be
used:
Comparing how many and what types of
objects are created between two points
of execution. To do this, make sure to
run the VM with a large young generation
so as to allow the app to run without
triggering a GC.
> gc
> disableGC
> captureHeapState Before running app
> run <your app>
> captureHeapState After running app
> listHeapStates
List of captured heap states:
hs 2: 2 After running app
hs 1: 1 Before running app
> compareHeapState 1 2
NOTE: This example takes a look at how
much memory, how many objects, and what
type of objects were created during the
running of some application.
Looking for memory leakage through
unintentional retention of objects even
after GC cycles.
> gc
> captureHeapState 1
> run <your app>
> gc
> captureHeapState 2
> compareHeapState 1 2
NOTE: This example takes a look at
how much object and memory retention
occurs across the execution of some
application.
Technically, if the VM was in a
steady state before and after the
execution of the app, the difference
should be 0 if there is no memory
leakage or unexpected object retention.
However, be aware that running the application may cause more system classes to be loaded and initialized. These system classes and objects will not be unloaded and will show up in the difference. But after running the application several times, it is unlikely that more system classes will be loaded. So, one way to mitigate this effect is to run the application several times before doing these measurements.
captureHeapState [<comment>]
captures the current heap state. The user
may provide a comment to
label the heap state. A captured heap
state is also automatically
assigned a numeric id. Heap states are
identified by their ids.
The comment is provided to help the
user remember the context under
which the heap state is captured.
Comments are optional. If a
comment is not specified, a time stamp in
milliseconds at the time
the heap state is captured will be
assigned.
releaseHeapState <id>
releases the specified heap state.
releaseAllHeapStates
releases all heap states.
listHeapStates
lists all captured heap states that have not been released. The list will show the following columns:
Column 1: heap state id number.
Column 2: comment regarding the heap state.
dumpHeapState <id> [obj|class]
dumps the specified heap state sorted in
one of the following orders:
none: this is the default if no sorting order is specified.
obj: sorts by object address in increasing order.
class: sorts by classblock address, then by object address, in increasing order.
compareHeapState <id1> <id2>
compares the specified heap states and lists the objects that appear in only one of the 2 heap states. Some statistics are also listed.
For example:
> captureHeapState 1
> captureHeapState 2
> compareHeapState 1 2
Comparing heapStates 1 and 2:
hs 2: size 20: 0x2e5d54 java.lang.String@0
hs 2: size 48: 0x2e5d68 [C@0
hs 2: size 12: 0x2e5d98 cvmsh$CmdStream@0
hs 2: size 20: 0x2e5da4 java.lang.String@0
hs 2: size 20: 0x2e5db8 java.lang.String@0
Number of mismatches in heapState 1: 0 (size 0)
Number of mismatches in heapState 2: 5 (size 120)
Total number of mismatches: 5 (size 120)
Size of heapState 1: 109908
Size of heapState 2: 110028
Size difference: 120
>
First a list of objects that appear in
one heap state but not the
other will be shown. In this example,
heap state 2 is captured
after heap state 1. Hence, it follows
that heap state 2 has more
objects than heap state 1.
NOTE: The extra objects that are
contained in heap state 2 are due
in this case to objects generated by
command line input and parsing
for cvmsh.
NOTE: There are no objects that appear in heap state 1 that aren't in heap state 2. The only objects that could fall into that category are those that have been GCed. For brevity, compareHeapState does not list objects which have been GCed.
After the list, some statistics follow:
Number of mismatches in heapState <id1>:
this indicates the number of objects that exist in the first heap state that aren't in the second. In this example, there are 0 such instances because there are no GCs between the capture of the 2 heap states.
Number of mismatches in heapState <id2>:
this indicates the number of objects
that exist in the second
heap state that aren't in the first.
In this example, there
are 5 such instances which are also
listed above.
Total number of mismatches:
this is the sum of the 2 mismatch counts for heap states <id1> and <id2>.
Size of heapState <id1>:
this indicates the total size in bytes
of all objects allocated
in the heap at the time heap state
<id1> was captured.
Size of heapState <id2>:
this indicates the total size in bytes
of all objects allocated
in the heap at the time heap state
<id2> was captured.
Size difference:
this indicates the difference in total size in bytes of all objects between heap states <id1> and <id2>.
NOTE: It is possible for the list of mismatched objects to look larger than the size difference shown in the statistics would suggest. This is because the statistics are based on total heap sizes. There may have been a lot of objects which were GCed after the first heap state and a lot more allocated before the second heap state. The size difference can come close to 0, and yet the list of mismatched objects in the 2 heap states being compared could be large. Usually, the objects that appear in this list are transient objects that will go away in a subsequent GC.
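The statistics above boil down to a set difference over object addresses plus a few sums. Here is a minimal sketch of that arithmetic, assuming each heap state is modeled as a map from object address to object size. The class HeapStateDiffSketch is hypothetical, the single large entry in heap state 1 is a stand-in for the whole 109908-byte heap, and the sign of the size difference (second minus first) is my assumption. The sketch also ignores the GCed-object marking that the real compareHeapState takes into account.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the arithmetic behind a heap state comparison. */
final class HeapStateDiffSketch {
    static final class Result {
        int onlyInFirst;        // objects in the first state that aren't in the second
        int onlyInSecond;       // objects in the second state that aren't in the first
        long bytesOnlyInFirst;
        long bytesOnlyInSecond;
        long sizeFirst;         // total bytes of all objects in the first state
        long sizeSecond;        // total bytes of all objects in the second state

        int totalMismatches() {
            return onlyInFirst + onlyInSecond;
        }

        long sizeDifference() {
            return sizeSecond - sizeFirst; // assumption: second minus first
        }
    }

    /** Each map models one heap state as object address -> object size in bytes. */
    static Result compare(Map<Long, Integer> first, Map<Long, Integer> second) {
        Result r = new Result();
        for (Map.Entry<Long, Integer> e : first.entrySet()) {
            r.sizeFirst += e.getValue();
            if (!second.containsKey(e.getKey())) {
                r.onlyInFirst++;
                r.bytesOnlyInFirst += e.getValue();
            }
        }
        for (Map.Entry<Long, Integer> e : second.entrySet()) {
            r.sizeSecond += e.getValue();
            if (!first.containsKey(e.getKey())) {
                r.onlyInSecond++;
                r.bytesOnlyInSecond += e.getValue();
            }
        }
        return r;
    }

    public static void main(String[] args) {
        Map<Long, Integer> hs1 = new LinkedHashMap<>();
        hs1.put(0x1000L, 109908); // stand-in for everything already in the heap
        Map<Long, Integer> hs2 = new LinkedHashMap<>(hs1);
        hs2.put(0x2e5d54L, 20);   // the five extra objects from the example output
        hs2.put(0x2e5d68L, 48);
        hs2.put(0x2e5d98L, 12);
        hs2.put(0x2e5da4L, 20);
        hs2.put(0x2e5db8L, 20);

        Result r = compare(hs1, hs2);
        System.out.println("Number of mismatches in heapState 1: " + r.onlyInFirst
                + " (size " + r.bytesOnlyInFirst + ")");
        System.out.println("Number of mismatches in heapState 2: " + r.onlyInSecond
                + " (size " + r.bytesOnlyInSecond + ")");
        System.out.println("Total number of mismatches: " + r.totalMismatches());
        System.out.println("Size difference: " + r.sizeDifference());
    }
}
```

Because the mismatch list is a set difference while the size difference compares two totals, objects freed and objects allocated between captures can cancel out in the total while still showing up in the list.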
Misc utilities:
time <command>
measures the time in milliseconds sampled
around the execution of the
specified cvmsh command.
run <Java app and arguments>
synchronously runs the specified application with the specified arguments. When this command returns to the prompt, the application will normally have completed. Strictly speaking, it means that the main() method of the application has returned.
bg <Java app and arguments>
asynchronously runs the specified
application with the specified
arguments. The application will be run in a
newly created thread.
The fact that this command returns to the
prompt is no indication of
whether the app has started/completed or
not. The command prompt is
independent of the execution of the app.
NOTE: This is not an MVM solution. There is no application context isolation here. The app is merely running in a separate thread. If you are running a Personal Profile app with a window, and you click on the Exit button of that app, it is very likely that the app will invoke System.exit(). This will not only cause the app to terminate, but cvmsh as well, because both share the same VM instance.
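For readers wondering what 'time', 'run', and 'bg' amount to, here is a minimal sketch using only standard reflection and threading APIs. The class MiscCommandsSketch, its method names, and the HelloApp stand-in application are all hypothetical. This is not the cvmsh source, just one plausible way a Java shell could implement the behavior described above.

```java
import java.lang.reflect.Method;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of "time", "run", and "bg" style shell commands. */
final class MiscCommandsSketch {
    /** "time": elapsed wall-clock milliseconds sampled around one command. */
    static long timeMillis(Runnable command) {
        long start = System.nanoTime();
        command.run();
        return (System.nanoTime() - start) / 1_000_000L;
    }

    /** "run": invokes className.main(args) on the calling thread and returns when main() returns. */
    static void run(String className, String[] args) throws Exception {
        Method main = Class.forName(className).getMethod("main", String[].class);
        main.invoke(null, (Object) args); // the cast passes args as one String[] argument
    }

    /** "bg": makes the same call on a newly created thread and returns without waiting. */
    static Thread bg(String className, String[] args) {
        Thread worker = new Thread(() -> {
            try {
                run(className, args);
            } catch (Exception e) {
                System.err.println("bg: " + className + " failed: " + e);
            }
        }, "bg-" + className);
        worker.start();
        return worker;
    }

    /** Stand-in application, used only for the demonstration below. */
    public static final class HelloApp {
        static final AtomicInteger RUNS = new AtomicInteger();

        public static void main(String[] args) {
            RUNS.incrementAndGet();
        }
    }

    public static void main(String[] args) throws Exception {
        String app = HelloApp.class.getName();
        run(app, new String[0]);                // synchronous: main() has returned here
        Thread worker = bg(app, new String[0]); // asynchronous: may still be running
        worker.join();                          // the demo waits; a shell prompt would not
        long elapsed = timeMillis(System::gc);
        System.out.println("HelloApp ran " + HelloApp.RUNS.get() + " times; gc took " + elapsed + " ms");
    }
}
```

Note that nothing in this sketch isolates the launched application: a System.exit() call inside it would end the whole process, which is exactly the caveat in the NOTE above.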
Last words
The VM Inspector code is just a collection of utilities that can be used to browse VM data structures and inspect the state of the VM. It is by no means exhaustive in functionality, and is not guaranteed to be bug free either. This is because it isn't an official product feature. Therefore, it has not undergone rigorous testing, and I don't get much time to work on it in my day job. However, it can still be quite useful for debugging and profiling work in the absence of more advanced tools. It was for me (which is why I put it together a few years ago).
If anyone is so inclined, please give it a try. Please also feel free to send me feedback on the tool, comments, bug fixes, and enhancements / contributions (subject to the open source governance rules of the phoneME project, of course).
At the very least, I hope it'll be of some help to you in your development efforts.