Scripting languages and Java
Mustang, a.k.a. Java SE 6, is getting ready to bolt out of its corral. We've done a lot of good things in Mustang. There's a recent article published on devx.com giving an overview of changes in Mustang, plus we've published the official docs which you can browse.
There's one new feature that I want to talk about today: the support for scripting languages in Java. I have a personal interest in language interpreters, having worked on several projects involving them. In my college years I wrote a BASIC interpreter (it's in the comp.sources.??? archives from around 1986 as "BASIC interpreter, needs work"). In the early '90s I spent a lot of time working with TCL. I also did a little work porting Microsoft's VBScript/JScript interpreter to Unix (I used to work for Mainsoft). And then, of course, I've been with the Java team for nearly 8 years (though, to be exact, Java's not exactly interpreted).
One attractive meme in all these different languages is the ability to take the same program and execute it on multiple operating systems, given the right software. I see this as being about freedom. Vendor lock-in occurs when your software is compiled to machine instructions for a given operating system. But if your software is written in a language/runtime which offers portability, you have the freedom to choose from a wider set of computers and operating systems.
How is this portability implemented? There's an architectural pattern that's followed in many "interpreted" languages. The language is implemented with two major modules. The first is a compiler that translates the high-level language to byte-codes for a virtual machine. The second is the virtual machine that executes those byte-codes. Of course there might be other modules, such as one that maps operating system facilities onto a higher-level abstraction.
In a few cases a language might be interpreted directly from the program text. This was the case for early versions of TCL, where the program text was reparsed each time a statement was executed. Last time I looked, the TCL runtime converted TCL expressions to a string of byte codes for execution.
Obviously there's a huge efficiency gain when a language is executed from byte codes, rather than having the program text reparsed each time a statement is executed.
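To make the two-module pattern concrete, here's a rough sketch of a toy language that understands only integers, + and *. Everything in it (the TinyLanguage class, the three opcodes, the grammar) is made up for illustration and isn't taken from any real interpreter. The compiler parses the text once into byte-codes, and the little stack machine can then run those byte-codes as often as you like without looking at the text again.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class TinyLanguage {
    // Opcodes for a hypothetical stack machine (invented for this sketch).
    static final int PUSH = 0, ADD = 1, MUL = 2;

    // Module 1: the "compiler". Translates text such as "2 + 3 * 4"
    // into a flat array of byte-codes using recursive descent.
    public static final class Compiler {
        private final String src;
        private int pos = 0;
        private final List<Integer> code = new ArrayList<>();

        private Compiler(String src) { this.src = src.replace(" ", ""); }

        public static int[] compile(String src) {
            Compiler c = new Compiler(src);
            c.sum();
            if (c.pos != c.src.length()) {
                throw new IllegalArgumentException("unexpected input at " + c.pos);
            }
            return c.code.stream().mapToInt(Integer::intValue).toArray();
        }

        // sum := product ('+' product)*
        private void sum() {
            product();
            while (pos < src.length() && src.charAt(pos) == '+') {
                pos++;
                product();
                code.add(ADD);
            }
        }

        // product := number ('*' number)*
        private void product() {
            number();
            while (pos < src.length() && src.charAt(pos) == '*') {
                pos++;
                number();
                code.add(MUL);
            }
        }

        // number := digit+ ; emits PUSH followed by the literal value
        private void number() {
            int start = pos;
            while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
            if (start == pos) throw new IllegalArgumentException("number expected at " + pos);
            code.add(PUSH);
            code.add(Integer.parseInt(src.substring(start, pos)));
        }
    }

    // Module 2: the "virtual machine". Executes byte-codes on an operand
    // stack and never touches the original program text.
    public static int execute(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (int pc = 0; pc < code.length; pc++) {
            switch (code[pc]) {
                case PUSH: stack.push(code[++pc]); break;
                case ADD:  stack.push(stack.pop() + stack.pop()); break;
                case MUL:  stack.push(stack.pop() * stack.pop()); break;
                default:   throw new IllegalStateException("bad opcode " + code[pc]);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        int[] code = Compiler.compile("2 + 3 * 4"); // parsed exactly once
        System.out.println(execute(code));          // prints 14
    }
}
```

A real language needs far more than this (variables, control flow, error reporting), but the split is the same: parsing cost is paid once in compile, and execute is a tight loop over integers.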
In the case of Java, the intermediate language is the well-documented class file format. Sun's Java implementation loads the class files, and HotSpot dynamically decides which parts are executed most often and compiles those parts to machine code for the CPU and OS it is running on. This is why I said earlier that Java's not exactly interpreted: some parts of your application will be interpreted, and other parts will be machine code instructions running at native execution speed.
Mustang offers an exciting possibility for language authors.
Think about the architectural pattern I described. Anybody who has a hankering to write a new language has two problems. One is the language design, and the other is the most efficient way to execute programs written in that language. What Mustang offers is a way for a budding language designer to skip that second step, and just focus on their language design.
I'm not going to go into much depth on this feature, if only because I haven't used it yet. The language author embeds their language by implementing the ScriptEngine interface, and uses the service provider mechanism to tell Java about their ScriptEngine. The system allows multiple scripting languages to run inside the same Java virtual machine.
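From the application side, the javax.script API looks roughly like the sketch below. I'm hedging here: the ScriptingDemo class and its evalIfAvailable helper are names I made up, and which engines you actually get depends on your JDK and classpath (some JDKs bundle a JavaScript engine, others ship none), so the code treats a missing engine as a normal case rather than an error.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class ScriptingDemo {
    // Hypothetical helper: evaluates the script with the named engine, or
    // returns null when no engine is registered under that name.
    public static Object evalIfAvailable(String engineName, String script) {
        ScriptEngineManager manager = new ScriptEngineManager();
        ScriptEngine engine = manager.getEngineByName(engineName);
        if (engine == null) {
            return null; // no such language installed in this JVM
        }
        try {
            return engine.eval(script);
        } catch (ScriptException e) {
            throw new IllegalArgumentException("script failed: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        // List every engine the manager discovered via the service
        // provider mechanism; the list may well be empty.
        ScriptEngineManager manager = new ScriptEngineManager();
        for (ScriptEngineFactory factory : manager.getEngineFactories()) {
            System.out.println(factory.getEngineName()
                    + " runs " + factory.getLanguageName()
                    + ", known as " + factory.getNames());
        }

        // Assumes an engine registered as "JavaScript"; falls back politely.
        Object result = evalIfAvailable("JavaScript", "6 * 7");
        System.out.println(result == null
                ? "no JavaScript engine installed"
                : "6 * 7 = " + result);
    }
}
```

The nice part is that the calling code never names a concrete interpreter class. Drop another language's jar on the classpath and it should show up in getEngineFactories() without the application changing.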
Apparently it's still a little too soon for many languages to be implemented with the javax.script interface. Doing a web search, I do see some activity. For example, BeanShell can hook up with javax.script. An exciting possibility is a PHP/javax.script implementation (see php-java-bridge).
UPDATE: Stevey's JVM Language Soko-Shootout is the beginning of some interesting research into the JVM's support for other languages. Unfortunately he was hired by Google, which cut short his research.