Java Secrets Revealed #1
I know, I know, it's been far too long since I've made an entry. My younger son is ten months old now, so I suppose I should probably stop using "new baby" as an excuse for my laziness...
Before I joined Sun, I thought I knew a lot about Java. I had been using it for a decade and had dug into its innards more times than I could count. Anytime I ran into inexplicable Swing weirdness or whatnot I wouldn't hesitate to dive into the JRE's source code and study it, or even recompile the classes with my own diagnostic code added. I wrote my own classloaders, I manipulated bytecode on the fly, I even wrote my own compiler for a JVM-targeted language. I had earned the right to call myself a guru.
Or so I thought.
Joining Sun nearly two years ago was a humbling experience. You see, it turns out that knowing a lot about how Java works as a third-party developer is very different from, say, having to figure out how to rip the JRE apart and reassemble it on the fly without running programs noticing (Java Kernel, for the uninitiated). I have had to learn more about Java's inner workings than I ever really wanted to know, and maybe you'll find some of it interesting. Towards that end I'm going to pick a couple of random topics to blather about here, with the intent of hopefully making this a semi-regular feature.
Why can Java Web Start specify JRE versions, but the Java Plug-In can't?
If you have worked with both JNLP programs and applets, you are no doubt aware of the incongruities. JNLP programs can specify which JRE version they need to run with, their memory settings, command-line arguments, and so forth. Applets, on the other hand, are stuck with whichever JRE is registered with the web browser, and have no control over any JRE settings. (JRE settings can be changed via the Java Control Panel, but cannot be specified by or for individual applets.)
The limitation arises because the JRE which handles applets runs inside the web browser. It lives within the browser's process and address space, and as far as the OS is concerned it is merely another chunk of the browser's code, just as with any other plug-in. And you can't simply load more than one JRE into the same OS process, because they would have conflicting symbol definitions, entry points, and so forth. It would be like trying to boot two different operating systems on the same computer, without the benefit of (very sophisticated) tools like VMware.
To fix this, you've got to run the JRE in a separate process, but have the applets appear within the web browser window. This, of course, introduces all sorts of challenges and requires some clever engineering, but fortunately people smarter than me were assigned to the task. A group led by Ken Russell has done just that, resulting in what is officially (and wordily) named Next-Generation Java™ Plug-In Technology.
The new plug-in behaves much more like Web Start, in that you can use JNLP files to specify JRE versions, memory settings, and command line arguments. It's smart enough to consolidate multiple applets into the same JRE if their settings are compatible, or spawn additional JREs as needed to make everyone happy. It also has some extremely cool tricks up its metaphorical sleeve which we will be revealing at JavaOne.
What is Class Data Sharing?
Prior to joining Sun, I had read a paragraph about Class Data Sharing somewhere, but didn't know much about it. Since then I have found that pretty much nobody outside of Sun seems to know anything about it either. That's a shame, because it's actually quite neat.
One of the JRE's biggest jobs when booting up is classloading. Hundreds and hundreds of classes are needed just to get the JRE up and running, and not just the obvious ones like Class, Object, and String. You're also going to need URL and its entourage (for URLClassLoader), PrintStream and related I/O classes (for System.out and System.err), lots of different collection and utility classes, reflection support, charset support, and hundreds more.
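If you'd like to see that number for yourself, here is a minimal sketch that asks the running JVM how many classes it has loaded by the time main() starts. The class name StartupClassCount is made up for illustration; the count you see will vary by JRE version and platform, so treat it as a rough indicator rather than a precise measurement.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical demo class: reports how many classes the JVM has loaded so far.
public class StartupClassCount {

    /** Returns the number of classes currently loaded in this JVM. */
    static int loadedClassCount() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return bean.getLoadedClassCount();
    }

    public static void main(String[] args) {
        // Nothing application-specific has run yet, so this is mostly
        // the JRE's own bootstrap classes.
        System.out.println("Classes loaded so far: " + loadedClassCount());
    }
}
```

Even for a do-nothing program like this one, the figure should be well above the handful of classes you actually mention in your source.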
There are two huge drawbacks to this: first, the JVM has to parse the Java class file for each of these classes, as well as resolve and link the symbols, and (for commonly-used methods) compile the methods using HotSpot. And, of course, all of this work happens every time the JRE starts up. Second, because each individual JRE is parsing and possibly compiling the code independently, they all end up with their own independent copies of the resulting memory structures.
To combat this problem, Java 5 introduced a new feature called Class Data Sharing. The idea is that the JRE does all of the basic classloading and parsing just once, and stores the resulting memory structures in a file (bin/<jvm>/classes.jsa, with jsa standing for Java Shared Archive). The next time the JRE boots, it simply maps this file into memory, and can skip all of the messy classloading. In addition to performance, another benefit is the fact that a big chunk of the mapped bytes can be shared by all running JREs, so they do not each need an independent copy of all of the code.
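One rough way to tell whether a given JVM is actually using the shared archive is to look at the java.vm.info system property. On the HotSpot builds I'm aware of, it reads something like "mixed mode, sharing" when the archive has been mapped, but that wording is an observed convention and not a documented contract, so consider the sketch below a heuristic. The class and method names are made up for illustration.

```java
// Hypothetical demo class: guesses whether Class Data Sharing is active
// by inspecting the java.vm.info system property.
public class SharingCheck {

    /**
     * Returns true if the given VM info string mentions sharing.
     * Assumption: HotSpot includes the word "sharing" when the archive is in use.
     */
    static boolean reportsSharing(String vmInfo) {
        return vmInfo != null && vmInfo.contains("sharing");
    }

    public static void main(String[] args) {
        String vmInfo = System.getProperty("java.vm.info");
        System.out.println("java.vm.info = " + vmInfo);
        System.out.println("Sharing reported: " + reportsSharing(vmInfo));
    }
}
```

Running it once normally and once with -Xshare:off on the command line is a quick way to compare the two cases.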
Of course, as with everything, the devil is in the details. Some of the classes in the archive perform initialization which isn't guaranteed to always be the same (the AWT classes, for example, will do different things depending upon your display configuration), and I'm told that there are enough such cases that the feature was a lot trickier to implement than it might sound. Plus you've got to detect the cases where the rt.jar file has been modified, or the boot class path has been overridden, or something else has changed which makes the inherent assumptions burned into the classes.jsa file incorrect, so that class data sharing can be disabled for that particular JRE invocation.
If you use diff or a similar tool to compare JRE directories from various machines running the same JRE version, you'll most likely find that the classes.jsa files, and only those files, are different. That's because classes.jsa is actually generated on your machine, instead of packaged with the installer. One of the last things the JRE installer does is run the magic incantation java -Xshare:dump, which causes the shared archive to be generated. That way we don't have to increase the size of the installer further. I don't know for sure, but I suspect that some aspects of the file may be machine-dependent, which would necessitate this approach anyway.
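If you want to find the archive on your own machine without remembering the exact subdirectory, a small sketch like the following walks the directory named by the java.home system property and lists any file called classes.jsa. The class name FindSharedArchive is made up for illustration, and the exact location of the archive differs between platforms and JRE versions, which is why this searches rather than hard-coding a path.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class: locates classes.jsa files under a directory tree.
public class FindSharedArchive {

    /** Returns every regular file named classes.jsa beneath the given root. */
    static List<Path> findArchives(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().equals("classes.jsa"))
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // java.home points at the JRE this program is running on.
        Path javaHome = Paths.get(System.getProperty("java.home"));
        for (Path archive : findArchives(javaHome)) {
            System.out.println(archive + " (" + Files.size(archive) + " bytes)");
        }
    }
}
```

If nothing is printed, the archive has probably not been generated for that JRE.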
Until next time...
Hopefully that little look inside wasn't too boring. Provided anyone is interested, I'll continue to share tidbits about the inner workings of Java in future installments. Unless of course I forget, or get sidetracked...