


Peter Kessler

Peter Kessler's Blog

Why are there two of everything?

Posted by peterkessler on November 17, 2004 at 05:16 PM | Comments (4)

You might have noticed that in addition to the Tiger source snapshots, we have just posted a Mustang source snapshot under the Java Research license.

So now we have two code lines being actively worked on: we continue to find bugs in Tiger and fix them in update releases, and we have ongoing development in Mustang. That's bound to be confusing. I'm a HotSpot virtual machine engineer, so I'm used to having two (or more) code lines in progress. But if you are going to rummage around in the HotSpot virtual machine sources (under hotspot/src), I'd like to explain why we seem to have two of a lot of things. I'll gradually work my way down through the layers of this particular onion.

  • We have to ship releases. We have Tiger and Mustang (and all the other releases) because we want to get stuff out into the hands of our users. But we're never really "done", so development is continuous while releases are periodic. What you see by looking at the current Tiger and Mustang source snapshots is that early in a release (Mustang) there isn't really that much difference from the previous release (Tiger). But new development stopped on Tiger months ago. At any given time, we have (at least) two releases in progress, one in active development, the other(s) for bug fix updates.
  • Backward compatibility. Once you get into the sources for the virtual machine, you'll find that we often have two implementations of things. One reason is that we think backward compatibility is really important. While we are working on some new thing, we have to keep the old thing working, and the easiest way to do that is to keep the old thing around. While browsing through the sources, you'll find a lot of code guarded by command-line switches you probably didn't know about. (Look at all those command-line switches in hotspot/src/share/runtime/globals.hpp!) Those are there so we can do A-B comparisons for functionality, conformance, performance, footprint, etc. Only when a new implementation shows itself to be compatible and substantially better than the old one do we throw the switch to use the new one. And we usually leave the switch around for a release or two in case someone wants to revert to the old behavior.
  • Dessert topping or floor wax. One of the problems with being a successful Java virtual machine is that people want to use you for everything, even things you didn't exactly anticipate. While the Java platform might have burst on the scene as a way of executing content for small applets in web browsers (and people still use it for that), people now also use it for running gigantic high-throughput applications on big multiprocessors. Of course, we want to make everyone happy, but that often means having alternate implementations inside the virtual machine. You can see this in the choice of the client versus server runtime compiler: the client runtime compiler gives good startup and modest performance, while the server runtime compiler is not as fast to start up, but the code it generates runs significantly faster. You can't use them both at the same time (yet), but if you are looking around the code base, you'll find both runtime compilers in there.
  • One size does not fit all. In that same style of offering different qualities of service, we offer something like 3 different garbage collection algorithms. The concurrent mark sweep algorithm provides lower pause times at some cost in performance, while the parallel collector offers better performance with occasional longer pauses. We're not going to make that choice for our users. If you go looking for "the garbage collector", you'll be disappointed (or maybe pleasantly surprised) to find at least three of them in there.
  • We learn, but slowly. The HotSpot Java virtual machine is a work in progress. Ideas that we had a while ago might have been appropriate for then, but things change. So the source base changes too. We learn things about the interactions of the different parts of the virtual machine (basically: compiler(s), garbage collector(s), and runtime system) and we try to clean up the code. In some parts of the virtual machine, that means changing the interfaces, but our desire not to be disruptive means we often leave the old interface and implementation in place for a release or two. For example, the older garbage collectors use a "generation framework" that is extremely flexible, but has some overhead. The newest collector uses a less flexible interface that is more efficient. We won't gratuitously convert the older collectors (we'd risk breaking things, for no benefit to you, our customers), so you'll find both programming styles in there if you look.
  • It depends on your point of view. Sometimes you'll be prowling through the source and come across things that look to be two of the same thing. For example, hotspot/src/share/oops/oopsHierarchy.hpp shows what appear to be similar hierarchies for oops and klasses. But those are not alternate implementations, or us evolving the interface, or anything like that. They are two faces of the virtual machine's view of the data structures used to represent Java objects (and a few VM internal data structures). Simplifying somewhat: an oop is the Java reference to an object, whereas the klass is the way we manipulate that object from the C++ code inside the virtual machine. That's an example of where you have to be able to hold both ideas in your head at the same time, instead of looking at only the one you think you are interested in.
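
To make the switch-guarded pattern from the backward compatibility item a little more concrete, here is a minimal sketch, written in Java rather than the virtual machine's C++. Everything in it is hypothetical: the class SwitchGuardedSort, the demo.useNewSort property, and the two sort routines are stand-ins picked for illustration, not anything taken from the HotSpot sources or from globals.hpp. The shape is the point: the old implementation stays, a switch picks between old and new, and an A-B check compares the two.

```java
import java.util.Arrays;

// Hypothetical illustration only: none of these names come from the HotSpot sources.
final class SwitchGuardedSort {

    // Stand-in for a VM command-line switch. Here it is a system property
    // (an assumption for this sketch) that defaults to the old behavior.
    static boolean useNewSort() {
        return Boolean.parseBoolean(System.getProperty("demo.useNewSort", "false"));
    }

    // The "old thing": kept around so it keeps working and can be compared against.
    static int[] oldSort(int[] input) {
        int[] a = input.clone();
        for (int i = 1; i < a.length; i++) {   // plain insertion sort
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
        return a;
    }

    // The "new thing": a different implementation of the same contract.
    static int[] newSort(int[] input) {
        int[] a = input.clone();
        Arrays.sort(a);
        return a;
    }

    // Callers go through the guarded entry point and never pick directly.
    static int[] sort(int[] input) {
        return useNewSort() ? newSort(input) : oldSort(input);
    }

    // A-B comparison: the two implementations should agree before the default flips.
    static boolean implementationsAgree(int[] input) {
        return Arrays.equals(oldSort(input), newSort(input));
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 9, 1, 3};
        System.out.println("agree: " + implementationsAgree(data));
        System.out.println(Arrays.toString(sort(data)));
    }
}
```

Running the sketch with -Ddemo.useNewSort=true takes the new path, and leaving the property off keeps the old one. That is the "leave the switch around for a release or two" idea in miniature.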

The HotSpot virtual machine is a collection of engineering tradeoffs and compromises. As such you will often find more than one way of doing things when you look through the sources. I hope I've clarified some of the reasons for that. If not, ask questions and I'll try to come up with answers.
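
If you would rather ask a running virtual machine which of these alternatives it is using than go looking in the sources, the following small sketch may help. It relies only on the standard java.vm.name system property and the java.lang.management API. The class name WhichVm is made up for this example, and the strings printed depend entirely on the VM, release, and options in use, so no particular output is shown.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; only the JDK calls it makes are standard API.
final class WhichVm {

    // Name the running VM reports for itself (implementation-specific text).
    static String vmName() {
        return System.getProperty("java.vm.name");
    }

    // Names of the garbage collectors the running VM exposes for management.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("VM: " + vmName());
        System.out.println("Collectors: " + collectorNames());
    }
}
```

Launching it with different options (for example -client or -server, on a VM that honors them) and comparing the output is one way to see that there really is more than one of most things in there.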


Comments

  • Thanks for the hotspot insider information.
    Now we have one big monolithic JRE that can be started as -client and -server. Will it make sense to create two separate JRE distributions for client and server?
    Client JRE -> client JVM optimized for client-side + runtime classes needed for client
    Server JRE -> server JVM optimized for server-side + runtime classes needed for server only (no ui, media stuff)

    Posted by: sutanu on November 18, 2004 at 07:38 AM

  • If only it were that easy! We find lots of "client" customers who want the higher performance of the "server" runtime compiler, and lots of server applications want "client" features like faster startup, shorter garbage collection pauses, etc. The Java platform includes the full set of API's, so we're not about to subset those.

    The direction we are headed is rather away from what you are suggesting. We are integrating the different qualities of service into one virtual machine, and then using command line flags, heuristics, ergonomics, or dynamic monitoring and management to choose and adjust the virtual machine at runtime. The idea is that the user shouldn't have to choose, say, -client or -server. They should say what properties they want (short pauses, small footprint, etc.) and the virtual machine should choose the compiler, collector, and other options to maintain that quality over the execution of the application. We've started on that, but we still have a lot of improvements to make.

    What's the real problem you are trying to solve by suggesting breaking up the JRE? Is it the download bundle size? The runtime footprint of the JRE? Class loading speed?

    Posted by: peterkessler on November 18, 2004 at 10:27 AM

  • >> Is it the download bundle size? The runtime footprint of the JRE? Class loading speed?

    All of them. This has been raised several times before via RFE or Mustang forum. I am just thinking one way to achieve this is to really break-up the JRE into two profiles, client and server. One analogy will be the various J2ME profiles that exist now.


    When I think more about it, I see there are a few separate things to consider.

    JVM/HotSpot: The server HotSpot engine will be more sophisticated than the client one, and thus is expected to have a bigger footprint. On the other hand, do we really need so many things in a client JVM? Besides a few specific use cases, can we realistically utilize so many GC schemes on the client? My point is, the client HotSpot can be leaner and optimized to run a client app just well enough.

    Runtime classes: The server JRE will have a smaller set of runtime classes. Why do we need awt/swing, sound, imageio etc in server? The client JRE will have a bigger set of runtime classes for UI needs.

    rt.jar: There are many ways rt.jar can be broken up without compromising the “one java platform”. Think in terms of modules which can be loaded into runtime as and only when needed. That will allow us to do incremental JRE upgrades in a more efficient manner.


    I understand that from distribution or maintenance perspective, it is more logical to maintain a single JRE for all types of deployments. But every release adds more and more new stuff, so very soon footprint and class-loading speed will become a real bottleneck. So if we are looking for more performance, and efficiency in JRE installation and upgrade, it’s time to break-up the JRE.

    Posted by: sutanu on November 18, 2004 at 11:54 AM

  • Hi Peter.

    Usually there is very little info from the HotSpot team about directions and possible future optimizations of the HotSpot compilers. I am particularly interested in two things:

    1.) Tiered compilation, which elegantly solves that "faster startup or better running speed" problem, and could further simplify the ergonomics of starting up a VM (no need to specify -client or -server anymore)

    2.) Escape analysis - either performed by the JVM (probably difficult because of the dynamic nature of Java) or enabled through some programming practices or even through programmer hints via the new metadata facility

    Any plans for those two in Mustang/Dolphin timeframe?

    Posted by: selendic1 on November 19, 2004 at 03:33 AM




