
Why are there two of everything?

Posted by peterkessler on November 17, 2004 at 5:16 PM PST

You might have noticed that in addition to the Tiger source snapshots, we have just posted a Mustang source snapshot under the Java Research license.

So now we have two code lines being actively worked on: we continue to find bugs in Tiger and fix them in update releases, and we have ongoing development in Mustang. That's bound to be confusing. I'm a HotSpot virtual machine engineer, so I'm used to having two (or more) code lines in progress. But if you are going to rummage around in the HotSpot virtual machine sources (under hotspot/src), I'd like to explain why we seem to have two of a lot of things. I'll gradually work my way down through the layers of this particular onion.

  • We have to ship releases. We have Tiger and Mustang (and all the other releases) because we want to get stuff out into the hands of our users. But we're never really "done", so development is continuous while releases are periodic. What you see by looking at the current Tiger and Mustang source snapshots is that early in a release (Mustang) there isn't really that much difference from the previous release (Tiger). But new development stopped on Tiger months ago. At any given time, we have (at least) two releases in progress: one in active development, the other(s) for bug fix updates.
  • Backward compatibility. Once you get into the sources for the virtual machine, you'll often find that we have two implementations of things. One reason is that we think backward compatibility is really important. While we are working on some new thing, we have to keep the old thing working, and the easiest way to do that is to keep the old thing around. While browsing through the sources, you'll find a lot of code guarded by command-line switches you probably didn't know about. (Look at all those command-line switches in hotspot/src/share/runtime/globals.hpp!) Those are there so we can do A-B comparisons for functionality, conformance, performance, footprint, etc. Only when a new implementation shows itself to be compatible and substantially better than the old one do we throw the switch to use the new one. And we usually leave the switch around for a release or two in case someone wants to revert to the old behavior.
  • Dessert topping or floor wax. One of the problems with being a successful Java virtual machine is that people want to use you for everything, even things you didn't exactly anticipate. While the Java platform might have burst on the scene as a way of executing content for small applets in web browsers (and people still use it for that), people now also use it for running gigantic high-throughput applications on big multiprocessors. Of course, we want to make everyone happy, but that often means having alternate implementations inside the virtual machine. You can see this in the choice of the client versus server runtime compiler: the client runtime compiler gives good startup and modest performance, while the server runtime compiler is not as fast to start up, but the code it generates runs significantly faster. You can't use them both at the same time (yet), but if you are looking around the code base, you'll find both runtime compilers in there.
  • One size does not fit all. In that same style of offering different qualities of service, we offer something like three different garbage collection algorithms. The concurrent mark sweep algorithm provides lower pause times at some cost in performance, while the parallel collector offers better performance with occasional longer pauses. We're not going to make that choice for our users. If you go looking for "the garbage collector", you'll be disappointed (or maybe pleasantly surprised) to find at least three of them in there.
  • We learn, but slowly. The HotSpot Java virtual machine is a work in progress. Ideas that we had a while ago might have been appropriate for then, but things change. So the source base changes too. We learn things about the interactions of the different parts of the virtual machine (basically: compiler(s), garbage collector(s), and runtime system) and we try to clean up the code. In some parts of the virtual machine, that means changing the interfaces, but our desire not to be disruptive means we often leave the old interface and implementation in place for a release or two. For example, the older garbage collectors use a "generation framework" that is extremely flexible, but has some overhead. The newest collector uses a less flexible interface that is more efficient. We won't gratuitously convert the older collectors (we'd risk breaking things, for no benefit to you, our customers), so you'll find both programming styles in there if you look.
  • It depends on your point of view. Sometimes you'll be prowling through the source and come across things that look to be two of the same thing. For example, hotspot/src/share/oops/oopsHierarchy.hpp shows what appear to be similar hierarchies for oops and klasses. But those are not alternate implementations, or us evolving the interface, or anything like that. They are two faces of the virtual machine's view of the data structures used to represent Java objects (and a few VM internal data structures). Simplifying somewhat: an oop is the Java reference to an object, whereas the klass is the way we manipulate that object from the C++ code inside the virtual machine. That's an example of where you have to be able to hold both ideas in your head at the same time, instead of looking at only the one you think you are interested in.
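
To see which of these alternatives a particular running VM actually picked, the standard java.lang.management API can report the runtime compiler, the registered garbage collectors, and the command-line switches the VM was started with. The following is a minimal sketch, not part of the HotSpot sources: the class name VmAlternatives and its helper methods are made up for illustration, and the names printed depend entirely on the VM and the options in use.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class: reports which alternatives the running VM selected.
final class VmAlternatives {

    /** Names of the garbage collectors the running VM has registered. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Name of the runtime compiler, or a placeholder if the VM has none. */
    static String compilerName() {
        // getCompilationMXBean() returns null when there is no compilation system.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        return jit == null ? "(no runtime compiler)" : jit.getName();
    }

    /** Switches passed to the VM at startup (excludes arguments to main). */
    static List<String> vmSwitches() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("VM:         " + System.getProperty("java.vm.name"));
        System.out.println("Compiler:   " + compilerName());
        System.out.println("Collectors: " + collectorNames());
        System.out.println("Switches:   " + vmSwitches());
    }
}
```

Running it twice with different collector or compiler switches on the command line gives a simple A-B view of what changed inside the VM.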

The HotSpot virtual machine is a collection of engineering tradeoffs and compromises. As such you will often find more than one way of doing things when you look through the sources. I hope I've clarified some of the reasons for that. If not, ask questions and I'll try to come up with answers.
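
The pattern of keeping an old and a new implementation side by side behind a switch can be sketched in miniature. This is an illustration of the idea only, written in Java rather than the C++ of the virtual machine; the class SwitchGuardedSort, the useNewSort property, and the two sort routines are all hypothetical and do not correspond to anything in globals.hpp.

```java
import java.util.Arrays;

// Hypothetical example: two implementations kept side by side behind a switch.
final class SwitchGuardedSort {

    /** Hypothetical switch; run with -DuseNewSort=false to revert to the old code. */
    static boolean useNewSort() {
        return Boolean.parseBoolean(System.getProperty("useNewSort", "true"));
    }

    /** Old implementation (insertion sort), kept so existing behavior stays available. */
    static int[] oldSort(int[] in) {
        int[] a = in.clone();
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
        return a;
    }

    /** New implementation, here simply delegating to the library sort. */
    static int[] newSort(int[] in) {
        int[] a = in.clone();
        Arrays.sort(a);
        return a;
    }

    /** Callers use this entry point; the switch decides which code runs. */
    static int[] sort(int[] in) {
        return useNewSort() ? newSort(in) : oldSort(in);
    }

    /** A-B comparison: the new code is only trustworthy if it agrees with the old. */
    static boolean sameResult(int[] in) {
        return Arrays.equals(oldSort(in), newSort(in));
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 9, 1, 3};
        System.out.println("new sort in use: " + useNewSort());
        System.out.println("sorted:          " + Arrays.toString(sort(data)));
        System.out.println("A-B agree:       " + sameResult(data));
    }
}
```

The cost of this style is exactly what the sources show: two of everything until the old path is finally retired.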
