CVM Bootstrap and Initialization Process
You probably already know that CVM is written in C. So what happens between the moment the VM is launched (when the C main() function is invoked) and the moment the first line of Java code in your main() method is executed? A lot goes on during that period, which we usually refer to as the VM bootstrap. This article explains the details of the CVM bootstrapping and initialization process.
The CVM bootstrapping code is implemented in ansiJavaMain() (ANSI compliant), which is called by the C main(). The first thing ansiJavaMain() does is convert and prepare the arguments for the VM. Then it calls JNI_CreateJavaVM() (defined in the JNI Invocation API) to create and initialize CVM. During that process, the VM initializes all of its global state, including system mutexes, the Java heap, threads, classloaders, etc. After the VM is created successfully, it loads the main class of the Java application and invokes the main() Java method.
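The ordering matters here: each initialization step depends on the ones before it, and a failure anywhere aborts VM creation. A minimal sketch of that idea is a table of steps run in order. The names init_step, run_init_steps, step_ok, and step_fail are hypothetical and used only for illustration; they are not CVM's actual functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical step signature: returns 0 on success, nonzero on failure. */
typedef int (*init_fn)(void);

typedef struct {
    const char *name;  /* label used for error reporting */
    init_fn     run;   /* the step itself */
} init_step;

/*
 * Runs the steps in order and stops at the first failure.
 * Returns the number of steps that completed successfully.
 * On failure, *failed points at the failing step's name; otherwise it is NULL.
 */
size_t run_init_steps(const init_step *steps, size_t count, const char **failed)
{
    assert(steps != NULL && failed != NULL);
    *failed = NULL;
    for (size_t i = 0; i < count; i++) {
        if (steps[i].run() != 0) {
            *failed = steps[i].name;
            return i;
        }
    }
    return count;
}

/* Stub steps standing in for real initialization work. */
int step_ok(void)   { return 0; }
int step_fail(void) { return 1; }
```

A caller would fill the table with entries such as {"preloader", ...}, {"system mutexes", ...}, {"heap", ...} and refuse to start the application unless every step completed.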
In the rest of the article, I am going to discuss the details of the CVM initialization process step by step.
Preloader Initialization
CVM supports class ROMization (at build time), which loads and links class data with the VM. We sometimes also call it class preloading. One of the first things the VM needs to do is initialize the ROMized classes. For each ROMized class block (CVMClassBlock is the main data structure for storing metadata about a Java class), it finds the corresponding Java instance and initializes it by filling in its class block pointer, classloader pointer, etc. It also iterates over the compressed string data and constructs the matching Java instances by filling in the string value, offset, length, etc. Note that these ROMized classes do not live in the Java heap.
In CVM, the method data structure (CVMMethodBlock) consists of an immutable part and a mutable part. The immutable part includes the method name and type ID, the method table index, etc. Once initialized by JCC (the ROMizer) or the classloader, these fields are never written to again. The mutable part includes the invoker pointer, the compiled code start PC, etc. As part of the preloader initialization, the VM initializes the method data for each ROMized class by copying the immutable part and filling in the mutable part.
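To make the immutable/mutable split concrete, here is a minimal sketch. The struct layouts and field names (method_immutable, method_mutable, method_block, interpreter_invoker) are assumptions made for illustration; the real CVMMethodBlock has more fields and a different layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical immutable part: written once by the ROMizer or classloader. */
typedef struct {
    uint32_t name_and_type_id;
    uint16_t method_table_index;
} method_immutable;

/* Hypothetical mutable part: may change at runtime, e.g. after compilation. */
typedef struct {
    void      (*invoker)(void);
    const void *compiled_start_pc;
} method_mutable;

typedef struct {
    method_immutable imm;
    method_mutable   mut;
} method_block;

/* Placeholder for the default (interpreted) invoker. */
void interpreter_invoker(void) {}

/*
 * Copies the immutable part from the read-only ROM image into the writable
 * method block, then fills in the mutable part with startup defaults.
 */
void init_method_block(method_block *dst, const method_immutable *rom)
{
    assert(dst != NULL && rom != NULL);
    dst->imm = *rom;
    dst->mut.invoker = interpreter_invoker;
    dst->mut.compiled_start_pc = NULL;  /* not compiled yet */
}
```

Keeping the two parts separate means the immutable data can stay in read-only memory, while only the small mutable part has to be writable.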
System Mutexes Initialization
As described in one of Mark Lam's blog posts, The Big Picture (a Map of CVM), CVM has global mutexes that control synchronization throughout the VM subsystems. These include the jit lock, heap lock, thread-list lock, class table lock, loader cache lock, global roots lock, weak global roots lock, typeid lock, etc. During the bootstrapping process, CVM creates and initializes these global locks.
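As a rough illustration of creating a fixed set of global locks at startup, the sketch below uses POSIX threads as a stand-in. CVM goes through its own porting layer, so the pthread calls, the LOCK_* identifiers, and the function names here are all assumptions, not CVM's code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* One identifier per global lock named in the text (names are hypothetical). */
enum {
    LOCK_JIT, LOCK_HEAP, LOCK_THREAD_LIST, LOCK_CLASS_TABLE,
    LOCK_LOADER_CACHE, LOCK_GLOBAL_ROOTS, LOCK_WEAK_GLOBAL_ROOTS,
    LOCK_TYPEID, LOCK_COUNT
};

static pthread_mutex_t sys_locks[LOCK_COUNT];

/*
 * Initializes every global lock. If one fails, the locks created so far
 * are destroyed and -1 is returned, so startup can fail cleanly.
 */
int init_system_locks(void)
{
    for (int i = 0; i < LOCK_COUNT; i++) {
        if (pthread_mutex_init(&sys_locks[i], NULL) != 0) {
            while (i-- > 0) {
                pthread_mutex_destroy(&sys_locks[i]);
            }
            return -1;
        }
    }
    return 0;
}

/* Returns the lock for a given identifier. */
pthread_mutex_t *system_lock(int id)
{
    assert(id >= 0 && id < LOCK_COUNT);
    return &sys_locks[id];
}
```

Creating all locks up front, before any other subsystem starts, means later code can take a lock without checking whether it exists yet.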
Initializing GC Global Root Stacks
Then CVM initializes the GC root stacks, including the global roots stack (for allocating JNI global references and CVM global roots), the weak global roots stack (for allocating JNI weak global references), the class global roots stack, the classloader global roots stack, the protection domain global roots stack, and the class table root stack (for all dynamically loaded classes). The global root stacks are used by GC for scanning live objects.
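The idea of a root stack can be sketched as a simple container of reference slots that the VM allocates from and the GC walks. This is a simplified model with a fixed capacity and hypothetical names (root_stack, root_stack_alloc, root_stack_scan); CVM's real root stacks are more elaborate.

```c
#include <assert.h>
#include <stddef.h>

#define ROOT_STACK_CAPACITY 64

typedef struct {
    void  *slots[ROOT_STACK_CAPACITY];
    size_t top;  /* number of slots handed out so far */
} root_stack;

void root_stack_init(root_stack *rs)
{
    assert(rs != NULL);
    rs->top = 0;
}

/*
 * Allocates a slot holding a reference to obj.
 * Returns the slot's address, or NULL if the stack is full.
 */
void **root_stack_alloc(root_stack *rs, void *obj)
{
    if (rs->top == ROOT_STACK_CAPACITY) {
        return NULL;
    }
    rs->slots[rs->top] = obj;
    return &rs->slots[rs->top++];
}

/*
 * GC side: calls visit() on every non-NULL root.
 * Returns the number of roots visited.
 */
size_t root_stack_scan(const root_stack *rs,
                       void (*visit)(void *obj, void *ctx), void *ctx)
{
    size_t visited = 0;
    for (size_t i = 0; i < rs->top; i++) {
        if (rs->slots[i] != NULL) {
            visit(rs->slots[i], ctx);
            visited++;
        }
    }
    return visited;
}

/* Example visitor: counts the roots it is shown. */
void count_visit(void *obj, void *ctx)
{
    (void)obj;
    ++*(size_t *)ctx;
}
```

Clearing a slot (setting it to NULL) models releasing a global reference: the next scan no longer treats that object as live.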
Typeid System Initialization
The next thing the VM does is initialize the typeid system and register some commonly used typeids, such as finalize, in the CVMglobals. That avoids having to repeatedly look up these typeids later at runtime.
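The benefit of registering typeids up front can be shown with a small interning table. This is a deliberately simplified model: real CVM typeids encode names and signatures, and the names typeid_table, typeid_intern, vm_globals, and finalize_tid are assumptions for illustration only.

```c
#include <assert.h>
#include <string.h>

#define MAX_TYPEIDS 128

typedef int type_id;

typedef struct {
    const char *names[MAX_TYPEIDS];  /* caller keeps the strings alive */
    int         count;
} typeid_table;

/* Hypothetical stand-in for the cached IDs kept in the VM globals. */
typedef struct {
    type_id finalize_tid;
} vm_globals;

/*
 * Returns the ID already assigned to name, or registers it and returns
 * the new ID. Returns -1 if the table is full.
 */
type_id typeid_intern(typeid_table *t, const char *name)
{
    assert(t != NULL && name != NULL);
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->names[i], name) == 0) {
            return i;
        }
    }
    if (t->count == MAX_TYPEIDS) {
        return -1;
    }
    t->names[t->count] = name;
    return t->count++;
}

/* Startup: register well-known names once and cache their IDs. */
void typeid_system_init(typeid_table *t, vm_globals *g)
{
    assert(t != NULL && g != NULL);
    t->count = 0;
    g->finalize_tid = typeid_intern(t, "finalize");
}
```

After startup, code that needs the finalize typeid reads the cached field directly instead of paying for a string lookup each time.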
Classes System Initialization
Next is the class loader cache initialization. The loader cache is used to cache all <Class, ClassLoader> pairs that are loaded in the system.
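A loader cache keyed on both the class name and the classloader can be sketched as follows. The linear array, the capacity, and the names (loader_cache, loader_cache_add, loader_cache_lookup) are assumptions chosen to keep the example short; a real VM would use a hash table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOADER_CACHE_CAPACITY 256

typedef struct {
    const char *class_name;
    const void *loader;       /* NULL stands for the bootstrap loader here */
    void       *class_block;  /* the loaded class's metadata */
} loader_cache_entry;

typedef struct {
    loader_cache_entry entries[LOADER_CACHE_CAPACITY];
    size_t             count;
} loader_cache;

void loader_cache_init(loader_cache *c)
{
    assert(c != NULL);
    c->count = 0;
}

/* Returns the class cached for (name, loader), or NULL if there is none. */
void *loader_cache_lookup(const loader_cache *c, const char *name,
                          const void *loader)
{
    for (size_t i = 0; i < c->count; i++) {
        const loader_cache_entry *e = &c->entries[i];
        if (e->loader == loader && strcmp(e->class_name, name) == 0) {
            return e->class_block;
        }
    }
    return NULL;
}

/* Adds a pair; returns -1 if the cache is full or the pair already exists. */
int loader_cache_add(loader_cache *c, const char *name, const void *loader,
                     void *class_block)
{
    assert(class_block != NULL);
    if (c->count == LOADER_CACHE_CAPACITY ||
        loader_cache_lookup(c, name, loader) != NULL) {
        return -1;
    }
    c->entries[c->count].class_name  = name;
    c->entries[c->count].loader      = loader;
    c->entries[c->count].class_block = class_block;
    c->count++;
    return 0;
}
```

The key point is that the same class name may map to different classes under different loaders, so the loader must be part of the lookup key.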
Initializing the Java Heap
CVM accepts command-line options such as -Xmx to specify the start, minimum, and maximum heap size. It extracts the size information from the VM arguments and allocates a heap of the start size. The heap can grow up to the maximum size at runtime.
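Extracting a heap size from the VM arguments might look like the sketch below. It assumes the conventional form -Xmx<number> with an optional k or m suffix; the function names parse_size and max_heap_from_args are hypothetical, and overflow checking is omitted for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parses "<number>[k|K|m|M]" into a byte count.
 * Returns 0 for malformed input (so a size of 0 cannot be expressed).
 */
size_t parse_size(const char *s)
{
    char *end;
    unsigned long n;

    assert(s != NULL);
    if (*s < '0' || *s > '9') {
        return 0;  /* must start with a digit */
    }
    n = strtoul(s, &end, 10);
    if (*end == 'k' || *end == 'K') {
        n *= 1024UL;
        end++;
    } else if (*end == 'm' || *end == 'M') {
        n *= 1024UL * 1024UL;
        end++;
    }
    return (*end == '\0') ? (size_t)n : 0;  /* reject trailing junk */
}

/*
 * Scans the arguments for "-Xmx<size>". The last valid occurrence wins;
 * if none is found, dflt is returned.
 */
size_t max_heap_from_args(int argc, const char *const *argv, size_t dflt)
{
    size_t result = dflt;
    for (int i = 0; i < argc; i++) {
        if (strncmp(argv[i], "-Xmx", 4) == 0) {
            size_t v = parse_size(argv[i] + 4);
            if (v != 0) {
                result = v;
            }
        }
    }
    return result;
}
```

The start and minimum sizes would be read the same way from their own options, after which the VM can allocate the start size and record the maximum as the growth limit.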
Preallocating Object Monitors and Desperation Exception Objects
The VM creates an initial number of object monitors. This is the minimum number that the VM knows it needs. The desperation exception objects include OutOfMemoryError, StackOverflowError, etc. They come in handy in circumstances where there is no memory available to create the exception object.
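The reason for preallocating desperation exceptions is easiest to see in code: when allocation itself fails, the VM still needs an object to throw. The sketch below models that with a hypothetical exception_obj type and a pluggable allocator; none of these names come from CVM.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a Java exception object. */
typedef struct {
    const char *class_name;
} exception_obj;

static exception_obj *preallocated_oome;

/* Startup: create the fallback object while memory is still available. */
int preallocate_exceptions(void)
{
    preallocated_oome = malloc(sizeof *preallocated_oome);
    if (preallocated_oome == NULL) {
        return -1;
    }
    preallocated_oome->class_name = "java.lang.OutOfMemoryError";
    return 0;
}

/*
 * Tries to allocate a fresh exception with alloc(). If that fails,
 * falls back to the object preallocated at startup.
 */
exception_obj *new_out_of_memory_error(void *(*alloc)(size_t))
{
    exception_obj *e;

    assert(preallocated_oome != NULL);
    e = alloc(sizeof *e);
    if (e == NULL) {
        return preallocated_oome;  /* desperation path */
    }
    e->class_name = "java.lang.OutOfMemoryError";
    return e;
}

/* Allocator that always fails, to exercise the fallback path. */
void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}
```

The trade-off is that the shared fallback object cannot carry per-throw details, which is acceptable when the alternative is being unable to report the error at all.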
If a debugger or profiler is supported, CVM also does JVMTI-related initialization.
If the runtime compiler (JIT) is supported, the VM needs to do JIT-related initialization, including initializing the compilation policy, compiler back end, code cache, etc.
Initializing Some System Classes
Some system classes that are ROMized need to be initialized explicitly by the VM. First, it initializes those system classes that don't require thread support by executing their static initializers. These include java.lang.ClassLoader, etc. Then it creates the system ThreadGroup and main Thread objects. After the thread initialization, the VM calls System.initializeSystemClass() to set up system properties, stderr, etc. During that call, it also creates:
- A Reference handler thread, which enqueues pending References.
- A Finalizer thread, which runs finalizers.
The VM calls sun.misc.CVM.parseCommandLineOptions() to parse the rest of the arguments. During this step, it adds user-defined properties and finds the main class name.
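The logic of this step can be sketched in a few lines. Note that the real parsing is done in Java by sun.misc.CVM; the C version below only illustrates the idea, assumes the conventional -Dkey=value syntax for user-defined properties, and ignores options that take a separate argument. All names (launch_options, parse_launch_options) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PROPS 32

typedef struct {
    const char *props[MAX_PROPS];  /* "key=value" strings, pointing into argv */
    size_t      prop_count;
    const char *main_class;        /* NULL if none was found */
    int         first_app_arg;     /* index of the first argument after the main class */
} launch_options;

/*
 * Collects -Dkey=value properties and treats the first argument that does
 * not start with '-' as the main class name. Everything after it belongs
 * to the application. Returns 0 on success, -1 if there is no main class
 * or too many properties.
 */
int parse_launch_options(int argc, const char *const *argv, launch_options *out)
{
    assert(argv != NULL && out != NULL);
    out->prop_count = 0;
    out->main_class = NULL;
    out->first_app_arg = argc;

    for (int i = 0; i < argc; i++) {
        if (strncmp(argv[i], "-D", 2) == 0) {
            if (out->prop_count == MAX_PROPS) {
                return -1;
            }
            out->props[out->prop_count++] = argv[i] + 2;
        } else if (argv[i][0] != '-') {
            out->main_class = argv[i];
            out->first_app_arg = i + 1;
            return 0;
        }
        /* Other '-' options are ignored in this sketch. */
    }
    return -1;
}
```

Stopping at the main class name matters: arguments that follow it are passed to the application's main() method untouched, even if they happen to look like VM options.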
Now CVM is ready. To execute the main() method, it needs to find the main Java class first. To do that, it creates the system classloader (sun.misc.Launcher$AppClassLoader) and uses it to search for and load the main class. After the class is loaded successfully, it invokes the main() method and starts executing its Java code.