
CVM's VM Inspector

Posted by mlam on July 31, 2007 at 11:59 PM PDT

In a previous blog entry, I showed you a map of CVM. If you are a VM engineer (or someone who is doing a port of the VM) and need to do some debugging, navigating all those data structures can be pretty daunting. How do the CVM engineers do it?

History

Since the very early days, CVM has been built with a bunch of utility functions that allow us to dump information about commonly used VM data structures. Two very popular examples are:

1. CVMconsolePrintf(), and

2. CVMdumpStack()

CVMconsolePrintf() is just like printf except that it adds some nice formatting options like %O, %C, %M, and %F, which print the details of a CVMObject *, CVMClassBlock *, CVMMethodBlock *, and CVMFieldBlock * respectively. There are more, but this gives you the idea. CVMdumpStack() is used to dump the contents of the Java stack. The CVM engineers would call these utility functions from the gdb command prompt at runtime to get live information about the state of the VM and its data structures.

However, there is a problem with using these utility functions: you need to be careful how you use them. For example, if you use CVMconsolePrintf("%C", ...) with a pointer that is not a CVMClassBlock *, then you may inadvertently cause a segfault that will crash the VM. That would mean losing all the debugging state of the bug that you have spent hours or days reproducing.

Can't we get the VM utilities to do all the careful checks for us automatically, so that we don't make fools of ourselves by making the wrong call at the wrong time?

The VM Inspector

Why, yes we can ...well, to a certain extent at least. The title of this blog entry suggests that there is a module inside CVM called the VM Inspector. Actually, it isn't quite as grandiose as that.

The VM Inspector is actually just a collection of those utilities that already existed before, plus some wrappers around some of them to make them safe. It also includes some additional useful utilities that aren't normally available in a production VM build.
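To make the idea of a "safe wrapper" concrete, here is a minimal sketch in Java. It is only a model: the real wrappers are native code inside the VM, and every name below (SafeDumper, registerClassBlock, dumpClassBlock) is hypothetical and not part of CVM. It also assumes a modern Java SE compiler rather than a CDC class library. The only point it illustrates is the ordering: check the address against what is known to be valid first, and only then touch it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a "safe" dump wrapper. This is not CVM code.
final class SafeDumper {
    // Addresses the modelled VM knows to be valid classblocks, mapped to class names.
    private final Map<Long, String> knownClassBlocks = new HashMap<>();

    void registerClassBlock(long address, String className) {
        knownClassBlocks.put(address, className);
    }

    // Validates the address first and reports an error instead of
    // dereferencing something that is not a classblock.
    String dumpClassBlock(long address) {
        String name = knownClassBlocks.get(address);
        if (name == null) {
            return "Error: 0x" + Long.toHexString(address) + " is not a valid classblock";
        }
        return "classblock 0x" + Long.toHexString(address) + ": " + name;
    }
}
```

With an unchecked utility, a bad pointer costs you the whole debugging session. With a wrapper shaped like this, it costs you one error message.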

To use the VM inspector utilities, build CVM with CVM_INSPECTOR=true. CVM_INSPECTOR is set to true by default when you build with CVM_DEBUG=true, but you can also enable it in a non-debug build without having to pull in all the other debug code in the system.

After that, you can start CVM in a gdb session, and call some of the inspector functions from the gdb command prompt. Alternatively, you can call them from modified VM code. For a list of the available functions, check out src/share/javavm/include/inspector.h in the phoneME Advanced VM codebase.

However, with that said, you still don't know how to use these utility functions properly. But rather than writing a big user's manual for you here, let's talk about the cvmsh utility shell instead. You'll be able to get an idea of how to use these functions based on how they are used in cvmsh (see below).

The cvmsh shell

cvmsh is a shell program written in the Java programming language that is intended to help with diagnostics and inspection of VM internals when running applications. In one regard, it is a poor man's debugger/profiler. One use of it is to detect memory leaks when running applications. It can be used by application developers or class library developers who want to see what happens to the VM state when Java code is executed. Because cvmsh is a Java program, its purpose is not to debug bugs that crash the VM.

One advantage of using cvmsh instead of calling the inspector or other CVM utility functions directly from gdb is that it is a lot more user friendly. It is user friendly in the sense that it is very difficult to accidentally crash the VM from cvmsh, not in the sense of presenting you with a fancy GUI with nice graphics. It's a low tech tool that I whipped up some years ago. As I said earlier, it is a "poor man's debugger/profiler".

Here is a quick summary user's manual for cvmsh:

  1. Building cvmsh

        To build cvmsh, build CVM with CVM_INSPECTOR=true added to the
        make command line. CVM_INSPECTOR is true by default for
        CVM_DEBUG=true builds. However, it is possible to build CVM with
        CVM_INSPECTOR=true regardless of whether CVM_DEBUG=true is set.

        The CVM_INSPECTOR=true option will add inspector code (Java and
        native) into the CVM binary (and possibly JAR files). One of
        these classes is sun.misc.VMInspector. This class is intended
        for private use only.

        cvmsh will be built into testclasses.zip.

  • How to run cvmsh?

    cvmsh only works with CVM because it relies on APIs
    in sun.misc.VMInspector. It will not run with other
    VMs. At the OS command prompt, run:

    > cvm -cp testclasses.zip cvmsh

  • How to use cvmsh?

    cvmsh launches into a command prompt: '>'
    At the command prompt, you can enter these commands:

    • help

      prints a list of commands that can be used.

    • gc

      requests a full GC cycle.

    • memstat

      prints the current memory statistics of the VM.

    • enableGC

      enables the GC if it was previously disabled.
      Does nothing if GC is already enabled.

    • disableGC

      disables the GC if it was previously enabled.
      Does nothing if GC is already disabled.

      NOTE: Disabling the GC can have an adverse effect of
      causing the VM to lock up. This is because GC cycles will
      be blocked until GC is re-enabled. Under this condition,
      even cvmsh may not be able to continue to run
      for a long time. It depends on how much free memory remains
      for cvmsh's use without triggering a GC.

    • keepObjectsAlive true|false

      forces the GC to keep all objects alive regardless of whether
      they are reachable or not, or revert to normal GC behavior.

      true: force GC to keep objects alive.

      false: allow normal GC behavior to resume.

      Dumpers:

    • print <object address>

      invokes System.out.println() on the specified object.

      NOTE: Can only be called while GC is disabled.

    • dumpObject <object address>

      dumps the contents (class,size,fields,etc) of the specified
      object.

      NOTE: Can only be called while GC is disabled.

      NOTE: Will report an error if the specified object is not a
      valid object.

    • dumpClassBlock <classblock address>

      dumps some info about the specified classblock.

      NOTE: Can only be called while GC is disabled.

      NOTE: Will report an error if the specified classblock is not a
      valid classblock.

    • dumpObjectReferences <object address>

      dumps a list of references to the specified object.

      NOTE: Can only be called while GC is disabled.

      NOTE: Will report an error if the specified object is not a
      valid object.

      NOTE: If there are no references, the list will be empty.

    • dumpClassReferences <classname>

      dumps all references to all instances of the specified class.

      NOTE: Can only be called while GC is disabled.

      NOTE: If the specified class is not found, it will be reported
      as not loaded.

    • dumpClassBlocks <classname>

      dumps classblock addresses for the specified class.

      NOTE: Can only be called while GC is disabled.

      NOTE: If the specified class is not found, it will be reported
      as not loaded.

    • dumpHeap [simple|verbose|stats]

      dumps the heap in the specified format.

      If format is not specified, the default format 'simple' will
      be used.

      simple: dumps the number of objects in the heap.

      verbose: dumps the address of each object and their class.

      stats: dumps statistics about objects in the heap.
      The sizes of the objects are added up and a list of each type of
      object (i.e. the class) is printed in order of decreasing total
      consumption of memory. The more instances of a class, the more
      memory it will consume. The larger the instances, the more
      memory it will consume. The total consumption is a measure of
      bytes consumed by all instances of each class.

      The statistics are organized in 3 columns:

      Column 1: The total size in bytes of memory consumed by
      instances of a class.

      Column 2: The number of instances of that class.

      Column 3: The class signature.

      NOTE: Can be called with GC enabled or disabled.
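The three-column report described for 'dumpHeap stats' can be modelled in a few lines of Java. This is a sketch under stated assumptions, not the cvmsh implementation: HeapStats, Row, and summarize are hypothetical names, and the heap is represented by two parallel arrays rather than by anything obtained from the VM.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the per-class statistics report. Not cvmsh code.
final class HeapStats {
    // One report row: total bytes, instance count, class name.
    static final class Row {
        final String className;
        long totalBytes;
        int instances;

        Row(String className) { this.className = className; }

        @Override public String toString() {
            return totalBytes + "\t" + instances + "\t" + className;
        }
    }

    // classNames[i] and sizes[i] describe the i-th object in the modelled heap.
    static List<Row> summarize(String[] classNames, int[] sizes) {
        Map<String, Row> byClass = new HashMap<>();
        for (int i = 0; i < classNames.length; i++) {
            Row row = byClass.computeIfAbsent(classNames[i], Row::new);
            row.totalBytes += sizes[i];
            row.instances++;
        }
        List<Row> rows = new ArrayList<>(byClass.values());
        // Largest total consumption first; ties are broken by class name
        // (the tie-break is a choice made for this sketch).
        rows.sort(Comparator.comparingLong((Row r) -> r.totalBytes).reversed()
                .thenComparing(r -> r.className));
        return rows;
    }
}
```

Note that a class can reach the top of the list either by having many small instances or a few large ones, which is why the instance count is printed next to the total.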

      Capturing and comparing heap states:

      A heap state is a snapshot of all objects
      that currently exist in the heap. The
      objects are not copied into the snapshot.
      Only their addresses are copied.

      If a GC occurs and an object is moved,
      its address in all the captured snapshots
      will be updated accordingly to reflect
      this movement.

      If a GC occurs and an object is GCed, it
      will be marked as having been collected in
      all the snapshots.

      NOTE: Hence, the contents of a snapshot
      can change with each GC cycle due to object
      movement or collection.
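The snapshot behaviour described above (addresses only, updated on object movement, marked on collection) can be sketched as a small Java model. Everything here is an assumption made for illustration: HeapSnapshot, objectMoved, objectCollected, and the COLLECTED marker value are hypothetical, and in the real VM these updates would be driven by the GC itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a heap-state snapshot. Not CVM code.
final class HeapSnapshot {
    // Marker stored in place of an address once the object has been GCed.
    static final long COLLECTED = -1L;

    private final List<Long> addresses = new ArrayList<>();

    HeapSnapshot(long[] liveObjectAddresses) {
        // Only the addresses are recorded; the objects are not copied.
        for (long a : liveObjectAddresses) {
            addresses.add(a);
        }
    }

    // In this model, called when the GC moves an object.
    void objectMoved(long from, long to) {
        for (int i = 0; i < addresses.size(); i++) {
            if (addresses.get(i) == from) {
                addresses.set(i, to);
            }
        }
    }

    // In this model, called when the GC collects an object.
    void objectCollected(long address) {
        objectMoved(address, COLLECTED);
    }

    boolean contains(long address) {
        return addresses.contains(address);
    }

    int liveCount() {
        int n = 0;
        for (long a : addresses) {
            if (a != COLLECTED) n++;
        }
        return n;
    }
}
```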

      Examples of how heap snapshots can be
      used:

      1. Comparing how many and what types of
        objects are created between two points
        of execution. To do this, make sure to
        run the VM with a large young generation
        so as to allow the app to run without
        triggering a GC.

                  > gc

                  > disableGC

                  > captureHeapState Before running app

                  > run <your app>

                  > captureHeapState After running app

                  > listHeapStates

                  List of captured heap states:

                    hs 2:  2    After running app

                    hs 1:  1    Before running app

                  > compareHeapState 1 2

                 

        NOTE: This example takes a look at how
        much memory, how many objects, and what
        type of objects were created during the
        running of some application.

      2. Looking for memory leakage through
        unintentional retention of objects even
        after GC cycles.

                  > gc

                  > captureHeapState 1

                  > run <your app>

                  > gc

                  > captureHeapState 2

                  > compareHeapState 1 2

                 

        NOTE: This example takes a look at
        how much object and memory retention
        occurs across the execution of some
        application.

        Technically, if the VM was in a
        steady state before and after the
        execution of the app, the difference
        should be 0 if there is no memory
        leakage or unexpected object retention.

        However, be aware that running the
        application may cause more system
        classes to be loaded and initialized.
        These system classes and their objects
        will not be unloaded, and so they will
        show up in the difference. After running
        the application several times, it is
        unlikely that more system classes will
        be loaded. So, one way to mitigate this
        effect is to run the application several
        times before doing these measurements.

    • captureHeapState [<comment>]

      captures the current heap state. The user
      may provide a comment to
      label the heap state. A captured heap
      state is also automatically
      assigned a numeric id. Heap states are
      identified by their ids.

      The comment is provided to help the
      user remember the context under
      which the heap state is captured.
      Comments are optional. If a
      comment is not specified, a time stamp in
      milliseconds at the time
      the heap state is captured will be
      assigned.

    • releaseHeapState <id>

      release the specified heap state.

    • releaseAllHeapStates

      release all heap states.

    • listHeapStates

      list all captured heap states that have
      not been released. The list
      will show the following columns:

      Column 1: heap state id number.

      Column 2: comment regarding the heap state.
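The bookkeeping behind captureHeapState, releaseHeapState, releaseAllHeapStates, and listHeapStates can be sketched as follows. This is a model only: HeapStateRegistry and its methods are hypothetical names, the output format is not cvmsh's, and two details are assumptions rather than documented behaviour (ids are never reused after a release, and the list is printed newest first, as in the sample session above).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of heap-state bookkeeping. Not cvmsh code.
final class HeapStateRegistry {
    private final Map<Integer, String> comments = new LinkedHashMap<>();
    private int nextId = 1; // assumption: ids are never reused

    // Records a new heap state and returns its id. A missing comment
    // defaults to the capture time in milliseconds.
    int capture(String comment) {
        int id = nextId++;
        if (comment == null || comment.isEmpty()) {
            comment = Long.toString(System.currentTimeMillis());
        }
        comments.put(id, comment);
        return id;
    }

    // Returns false if there was no such heap state.
    boolean release(int id) {
        return comments.remove(id) != null;
    }

    void releaseAll() {
        comments.clear();
    }

    // One line per unreleased state: id, tab, comment. Newest first.
    List<String> list() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<Integer, String> e : comments.entrySet()) {
            lines.add(0, e.getKey() + "\t" + e.getValue());
        }
        return lines;
    }
}
```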

    • dumpHeapState <id> [obj|class]

      dumps the specified heap state sorted in
      one of the following orders:

      none: this is the default if no sorting order is specified.

      obj: sorts by object address in increasing order.

      class: sorts by classblock address, then by object address,
      in increasing order.
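The sort orders above map naturally onto two comparators. The sketch below is a model under assumptions: HeapStateSorter and Entry are hypothetical names, addresses are held as Java long values (so they compare as signed numbers, which is fine for small example addresses), and any unrecognized order simply keeps the original order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of the dumpHeapState sort orders. Not cvmsh code.
final class HeapStateSorter {
    // One snapshot entry: the object's address and its classblock's address.
    static final class Entry {
        final long objectAddress;
        final long classBlockAddress;

        Entry(long objectAddress, long classBlockAddress) {
            this.objectAddress = objectAddress;
            this.classBlockAddress = classBlockAddress;
        }
    }

    // "obj": increasing object address.
    static final Comparator<Entry> BY_OBJ =
            Comparator.comparingLong((Entry e) -> e.objectAddress);

    // "class": increasing classblock address, then increasing object address.
    static final Comparator<Entry> BY_CLASS =
            Comparator.comparingLong((Entry e) -> e.classBlockAddress)
                    .thenComparingLong(e -> e.objectAddress);

    // Returns a sorted copy; the input list is left untouched.
    static List<Entry> sorted(List<Entry> entries, String order) {
        List<Entry> copy = new ArrayList<>(entries);
        if ("obj".equals(order)) {
            copy.sort(BY_OBJ);
        } else if ("class".equals(order)) {
            copy.sort(BY_CLASS);
        }
        // Any other value (including null) keeps the original order.
        return copy;
    }
}
```

Sorting by class groups all instances of a class together, which makes it easier to eyeball how many objects of one type a snapshot holds.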

    • compareHeapState <id1> <id2>

      compares the specified heap states and
      lists the objects that appear in only one
      of the 2 heap states. Some
      statistics are also listed.

      For example:

              > captureHeapState 1

              > captureHeapState 2

              > compareHeapState 1 2

              Comparing heapStates 1 and 2:

                 hs 2: size 20: 0x2e5d54 java.lang.String@0

                 hs 2: size 48: 0x2e5d68 [C@0

                 hs 2: size 12: 0x2e5d98 cvmsh$CmdStream@0

                 hs 2: size 20: 0x2e5da4 java.lang.String@0

                 hs 2: size 20: 0x2e5db8 java.lang.String@0

              Number of mismatches in heapState 1: 0 (size 0)

              Number of mismatches in heapState 2: 5 (size 120)

              Total number of mismatches: 5 (size 120)

              Size of heapState 1: 109908

              Size of heapState 2: 110028

              Size difference: 120

              >

             

      First a list of objects that appear in
      one heap state but not the
      other will be shown. In this example,
      heap state 2 is captured
      after heap state 1. Hence, it follows
      that heap state 2 has more
      objects than heap state 1.

      NOTE: The extra objects that are
      contained in heap state 2 are due
      in this case to objects generated by
      command line input and parsing
      for cvmsh.

      NOTE: There are no objects that appear
      in heap state 1 that aren't
      in heap state 2. The only objects that
      could fall into this category are those
      that have been GCed. For brevity,
      compareHeapState does not list
      objects which have been GCed.

      After the list, some statistics follow:

      Number of mismatches in heapState <id1>:

      this indicates the number of objects
      that exist in the first
      heap state that aren't in the second.
      In this example, there
      are 0 such instances because there
      are no GCs between the
      capture of the 2 heap states.

      Number of mismatches in heapState <id2>:

      this indicates the number of objects
      that exist in the second
      heap state that aren't in the first.
      In this example, there
      are 5 such instances which are also
      listed above.

      Total number of mismatches:

      this is the sum of the 2 mismatch
      counts for heap states <id1> and <id2>.

      Size of heapState <id1>:

      this indicates the total size in bytes
      of all objects allocated
      in the heap at the time heap state <id1>
      was captured.

      Size of heapState <id2>:

      this indicates the total size in bytes
      of all objects allocated
      in the heap at the time heap state <id2>
      was captured.

      Size difference:

      this indicates the difference in total
      size in bytes of all objects
      between heap states <id1> and <id2>.

      NOTE: It is possible for the list of
      mismatched objects to add up to
      more than the size difference shown in
      the statistics. This is
      because the statistics are based on total
      heap sizes. There may
      have been a lot of objects which were GCed
      after the first heap
      state and a lot more allocated before the
      second heap state. The
      size difference can come close to 0, and
      yet the list of mismatched
      objects in the 2 heap states being
      compared could be large.
      Usually, the objects that appear in this
      list are transient objects
      that will go away in a subsequent GC.
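The statistics above can be reproduced with a small model. This is a sketch, not the compareHeapState implementation: HeapStateDiff is a hypothetical name, each heap state is represented as a map from object address to object size plus the total heap size recorded at capture time, and the size difference is assumed to be the second total minus the first.

```java
import java.util.Map;

// Hypothetical model of the compareHeapState statistics. Not cvmsh code.
final class HeapStateDiff {
    final int onlyInFirst;        // objects in the first state but not the second
    final int onlyInSecond;       // objects in the second state but not the first
    final long bytesOnlyInFirst;
    final long bytesOnlyInSecond;
    final long sizeFirst;         // total heap size when the first state was captured
    final long sizeSecond;        // total heap size when the second state was captured

    // Each map holds address -> size in bytes for the objects still tracked
    // by that state (objects already GCed are assumed to be left out).
    HeapStateDiff(Map<Long, Integer> first, long sizeFirst,
                  Map<Long, Integer> second, long sizeSecond) {
        int n1 = 0, n2 = 0;
        long b1 = 0, b2 = 0;
        for (Map.Entry<Long, Integer> e : first.entrySet()) {
            if (!second.containsKey(e.getKey())) { n1++; b1 += e.getValue(); }
        }
        for (Map.Entry<Long, Integer> e : second.entrySet()) {
            if (!first.containsKey(e.getKey())) { n2++; b2 += e.getValue(); }
        }
        this.onlyInFirst = n1;
        this.onlyInSecond = n2;
        this.bytesOnlyInFirst = b1;
        this.bytesOnlyInSecond = b2;
        this.sizeFirst = sizeFirst;
        this.sizeSecond = sizeSecond;
    }

    int totalMismatches() { return onlyInFirst + onlyInSecond; }

    long totalMismatchBytes() { return bytesOnlyInFirst + bytesOnlyInSecond; }

    // Assumption: reported as second minus first.
    long sizeDifference() { return sizeSecond - sizeFirst; }
}
```

Because the mismatch counts come from the tracked objects while the sizes come from the totals recorded at capture time, the two can diverge, which is the situation the note above describes.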

      Misc utilities:

    • time <command>

      measures the time in milliseconds sampled
      around the execution of the
      specified cvmsh command.
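Sampling the time around a command amounts to reading the clock before and after it runs. A minimal sketch, assuming a hypothetical CommandTimer helper and a command represented as a Runnable (cvmsh's own command representation may well differ):

```java
// Hypothetical model of the "time" command. Not cvmsh code.
final class CommandTimer {
    // Returns the wall-clock milliseconds that elapsed while the command ran.
    static long timeMillis(Runnable command) {
        long start = System.currentTimeMillis();
        command.run();
        return System.currentTimeMillis() - start;
    }
}
```

This measures elapsed wall-clock time, so it includes any GC pauses or thread scheduling delays that happen while the command runs.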

    • run <Java app and arguments>

      synchronously runs the specified application
      with the specified
      arguments. When this command returns to the
      prompt, the application
      will normally have completed. Strictly
      speaking, it means that the
      main() method of the
      application has returned.

    • bg <Java app and arguments>

      asynchronously runs the specified
      application with the specified
      arguments. The application will be run in a
      newly created thread.
      The fact that this command returns to the
      prompt is no indication of
      whether the app has started/completed or
      not. The command prompt is
      independent of the execution of the app.

      NOTE: This is not an MVM solution. There
      is no application context isolation here.
      The app is merely running in a separate
      thread. If you are running a Personal
      Profile app with a window, and you click
      on the Exit button on that app, it is very
      likely that the app will invoke
      System.exit(). This will not only
      cause the app to terminate, but cvmsh
      as well. This is because both share
      the same VM instance.
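The difference between run and bg can be sketched with standard reflection and threads. This is a model only: AppLauncher and DemoApp are hypothetical names, and how cvmsh actually locates and invokes the application is an assumption here.

```java
import java.lang.reflect.Method;

// Hypothetical model of the "run" and "bg" commands. Not cvmsh code.
final class AppLauncher {
    // Synchronous: returns only after the app's main() has returned.
    static void run(String className, String... args) {
        try {
            Method main = Class.forName(className).getMethod("main", String[].class);
            main.invoke(null, (Object) args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not run " + className, e);
        }
    }

    // Asynchronous: starts main() on a new thread and returns immediately.
    // The caller learns nothing about whether the app has started or finished.
    static Thread bg(String className, String... args) {
        Thread t = new Thread(() -> run(className, args), "bg-" + className);
        t.start();
        return t;
    }
}

// Tiny stand-in application used to exercise the launcher.
final class DemoApp {
    static volatile int argsSeen = 0;

    public static void main(String[] args) {
        argsSeen += args.length;
    }
}
```

In both cases the app shares the launcher's VM, so a System.exit() call in the app ends the launcher too. A thread gives concurrency, not isolation.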

      Last words

      The VM Inspector code is just a collection of utilities that can be used to browse VM data structures and inspect the state of the VM. It is by no means exhaustive in functionality, and is not guaranteed to be bug free either. This is because it isn't an official product feature. Therefore, it has not undergone rigorous testing, and I don't get much time to work on it in my day job. However, it can still be quite useful for debugging and profiling work in the absence of more advanced tools. It was for me (which is why I put it together a few years ago).

      If anyone is so inclined, please give it a try. Please also feel free to send me feedback on the tool, comments, bug fixes, and enhancements / contributions (subject to the open source governance rules of the phoneME project, of course).

      At the very least, I hope it'll be of some help to you in your development efforts.


      Tags: CVM, Java, J2ME, JavaME, phoneME, phoneME Advanced
