|
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Volker Simonis's BlogNovember 2007 ArchivesTemplate- vs. C++-Interpreter shootoutPosted by simonis on November 16, 2007 at 03:22 AM | Permalink | Comments (1)The Template-Interpreter
The default interpreter that comes with the Hotspot VM is the so called "Template Interpreter". It is called template interpreter, because it is basically created at runtime (every time the Hotspot starts) from a kind of assembler templates which are translated into real machine code. Notice, that although this is code generation at runtime it should not be confused with the ability of the Hotspot to do Just In Time (JIT) compilation of computationally expensive program parts. While a JIT compiler compiles a whole method (or even more methods together if we consider inlining) into executable machine code, the template interpreter, although generated at runtime, is still just an interpreter. It interprets a Java program bytecode by bytecode. The advantage of the template interpreter approach is the fact hat most of the code that gets executed for every single bytecode is pure machine code as well as the dispatching from one bytecode to the next, which can also be done in native machine code. Moreover this technique allows a very tight adaption of the interpreter to the actual processor architecture so the same binary will still run on an old 80486 while it may well use the latest and greatest features of the newest processor generation if available. Beside the slightly increased startup time, the second drawback of the template interpreter approach is the fact that the interpreter itself is quite complicated. It requires for example a kind of builtin runtime assembler, which translates the code templates into machine code. Therfore porting the template interpreter to a new processor architecture is not an easy task and requires quite a profound knowledge of the underlying architecture. The C++-Interpreter
In the earlier Java days (around JDK 1.4) a second interpreter existed beside the template interpreter - the so called C++ Interpreter. It was probably named that way, because the main interpreter loop was implemented as a huge switch statement in C++. Despite its name however, even the C++ Interpreter isn't completely implemented in C++. It still contains large parts like for example the frame manager which are written in assembler. It doesn't rely on recursive C++ method invocations to realize function calls in Java but instead uses the frame manager just mentioned before, which controls the stack manually. But despite these issues, the C++ interpreter is probably still easier to port to a new architecture than the template interpreter. In Java 1.4 the C++ interpreter has been used for the Itanium port of the Hotspot. But after SUN abandoned the support for the Itanium architecture, it got quite silent around the C++ Interpreter although it was still present in the Hotspot sources. With the advent of OpenJDK, the demand from the developer community to get a working example of the C++ interpreter grew (see BugID: 6571248) and so the C++ interpreter was finally reactivated in build 20 of OpenJDK, (at least for the i486 and the SPARC architecture). The C++ interpreter was basically working out of the box for the 32-bit x86 debug build and for the 32-bit opt and debug build on SPARC. If you would like to try the opt build on a 32-bit x86 platform, you'll currently have to apply this small patch: bytecodeInterpreter.patch. To make the C++ interpreter 64-bit clean on SPARC, a few more changes have to made, but I succeeded to get it running (at least for the JVM98 and the DaCapo benchmark suits) by applying these patches: bytecodeInterpreter_sparc.hpp.patch, cppInterpreter_sparc.cpp.patch, parseHelper.cpp.patch. After applying the patches you can build the Hotspot VM with the C++ interpreter instead of the usual template interpreter by setting the environment variable CC_INTERP in the shell where the build is started. Template- vs. C++-Interpreter shootout
Beside the expected porting effort, performance will be probably one of the other main reasons for the decision for or against one of the two interpreters. I have therfore run the DaCapo performance test suite with both interpreters in interpreter only mode (-Xint) and in mixed mode (-Xmixed) together with the C2 server JIT compiler. The tests have been executed with a 32-bit VM on Linux/x86 and with a 32- and a 64-bit VM on Solaris/SPARC. The results can be seen in the following tables.
Although the numbers should be treated with some caution because of some possible measurements inaccuracies, all in all the results could be interpreted as follows. In interpreter mode (-Xint) the performance of the C++ interpreter varies between 35 and 50 percent of the performance of the template interpreter. In mixed mode (-Xmixed) a VM that runs with the C++ interpreter reaches from 45 up to 90 percent of the performance of a VM that runs with the template interpreter. The still sometimes huge differences between a VM with template versus one with C++ interpreter in mixed mode, where most of the "hot" code should be compiled anyway, may be in part explained by the lack of interpreter profiling in the C++ interpreter (the C++ interpreter runs with -XX:-ProfileInterpreter). This may lead to less optimal code generation but more details have to be further evaluated. If you want to get more information about the current status of the C++ interpreter, you should probably follow the C++ Interpreter threads on the OpenJDK Hotspot mailing list. You can also read Gary Bensons online diary. There he writes about his experience of porting the OpenJDK to PowerPC using the C++ interpreter. Testus Interruptus or how to avoid InterruptedExceptions in JCK testsPosted by simonis on November 13, 2007 at 12:12 PM | Permalink | Comments (5)Although JCK tests are not a regression test suite it is probably not uncommon that they (or at least a significant subset of them) are used as automated tests. To do this successfully, a number of JCK tests like for example interactive tests or tests which require a special setup have to be excluded from the test suite. This can be easily achieved with the help of exclude lists. However, even with such a setup, one still encounters infrequent, spurious test failures in random tests. Usually, these failures are not reproducible but still need a lot of attention in order to ensure that they do not signal a real problem with the tested VM. After having analyzed such failed tests over a longer period of time I realized that most of them have been caused by an InterruptedException that was not supposed to be catched within that test. Notice that while some tests explicitely report that the reason of a failure is a catched InterruptedException, other tests fail silently. In such cases, only a closer look at the test sources revealed the fact, that most of them use to call one of the Object.wait(), Thread.join() or Thread.sleep() methods which can all be interrupted and throw an InterruptedException. The only strange thing was that I couldn't identify any single call to Thread.interrupt() during the execution of these failed tests. This observation led to the assumption, that the interrupts must originate from the test agent or from the harness. Because the failures where so seldom and occurred in random tests, I assumed that they could be provoked by timeouts caused by a high machine load. But decreasing the JCK timeout factor just led to more tests that where aborted. Such test are flagged with an "Error" status in contrast to failed test which return in time but don't return the expected result (they are flagged as "Failed"). So that obviously wasn't the solution. Finally, I decided to try it the hard way and instrumented Thread.interrupt() and the two constructors of the InterruptedException class to print a timestamp and a stack trace every time they were called. After running the JCK tests with the modified VM I could identify some interesting calls to Thread.interrupt():
java.lang.Throwable: Thread.interrupt() called for Thread[Agent0,3,main]
at java.lang.Thread.interrupt(Thread.java:821)
at com.sun.javatest.agent.SocketConnection$1.timeout(SocketConnection.java:120)
at com.sun.javatest.util.Timer$1.run(Timer.java:84)
They only occurred in JCK runs with failed tests, but interestingly enough, they sometimes happened long (up to 30 minutes) before the failing test. After understanding the semantics of Thread.interrupt(), which only sets the status of a thread to "interrupted" but doesn't actively interrupt the thread, it was clear why - the corresponding thread gets interrupted for a yet still unknown reason, but the interrupted status of the thread is not cleared. Afterwards, the first test which calls one of the interruptible Object.wait(), Thread.join() or Thread.sleep() methods, will instantly raise an InterruptedException and fail badly. So I just had to find out, why somebody would want to interrupt the test thread, why this interruption happens so randomly and probably most interesting: why isn't this interrupt handled by the test that provokes the interrupt. After downloading the JTHarness sources from https://jtharness.dev.java.net/ and digging into the code, I came up with the following explanation: Below you can see a simplified view of how an agent handles request from the harness (for the full story see com/sun/javatest/agent/Agent.java). For every test, the agent establishes a new connection (an object of type SocketConnection) to the test harness. It receives the test name and parameters from the connection, executes the test and returns the result back through the connection. Listing 1: Agent in pseudocode
Agent.handleRequestsUntilClosed() {
while(!closing) {
Connection connection = nextConnection();
Task t = new Task(connection);
t.handleRequest();
// Task in pseudocode
connection.in.readRequest();
status = execute();
connection.out.sendStatus(status);
connection.out.flush();
connection.waitUntilClosed(5000/*ms*/);
close();
// Task.close in pseudocode
if (!connection.isClosed()) {
connection.close();
}
}
}
The interesting part here is the call to connection.waitUntilClosed(). The method waitUntilClosed() is declared in the Connection interface in com/sun/javatest/agent/Connection.java to potentially throw an InterruptedException. However, the implementation of the interface in com/sun/javatest/agent/SocketConnection.java doesn't throw any exceptions. It works as follows: Listing 2: waitUntilClosed() in pseudocode
SocketConnection.waitUntilClosed(int timeout) {
waitThread = Thread.currentThread();
Timer.Timeable cb = new Timer.Timeable() {
public void timeout() {
waitThread.interrupt();
socketInput.close();
socketOutput.close();
}
};
Timer.Entry e = timer.requestDelayedCallback(cb, timeout);
try {
while (true) {
int i = socketInput.read();
if (i == -1) break;
}
}
finally {
timer.cancel(e);
}
}
It first creates a Timeable object that will interrupt the thread and close the in- and output stream associated with the current socket after a given amount of timeout. Thereafter it reads from the socket input stream until it encounters an EOF condition. If the EOF is catched within the timeout period, the Timeable object is removed from the waiting timer thread and the method returns. Otherwise, if the stream was not closed by the harness within the timeout period, the timer thread will call the timeout() method of the Timeable object which in turn will call interrupt() on the waiting thread and explicitely close the socket streams. At this point, the thread waiting for the EOF condition on the socket stream, will finally get it and will return. Notice that in the latter case, the interrupt status of the thread is neither queried and transformed into an InterruptedException nor is it cleared. At the end, handleRequest(), the caller of waitUntilClosed() will call the Task.close() method that will finally close the socket. Notice that handleRequest() doesn't handle the interrupt status of the thread either, so the thread will effectively stay marked as interrupted after a timeout has happened in waitUntilClosed(). This will ultimately lead to a failure in the next JCK test that calls one of the interruptible Object.wait(), Thread.join() or Thread.sleep() methods. The solution of this problem is easy. We could either query and clear the interrupted status of the agent thread after the call to waitUntilClosed() in handleRequest()like so: Solution 1: clear thread interrupt status in handleRequest()
connection.waitUntilClosed(5000);
if (Thread.interrupted() && tracing) {
traceOut.println("Thread was interrupted - clearing interrupted status!");
}
Notice, that the call to Thread.interrupted() queries and clears the interrupted status of a thread simultaneously. A slightly more elegant solution would probably be to query and clear the interrupted status already in the finally block of the waitUntilClosed() method of SocketConnection. In the case the thread was interrupted by the timer thread because of a timeout, the interrupted status of the thread should be cleared and an InterruptedException should be thrown. This exception will then be handled correctly in Task.handleRequest(). Solution 2: clear interrupt status and throw InterruptedException in
waitUntilClosed()
finally {
timer.cancel(e);
if (Thread.interrupted() {
throw new InterruptedException();
}
...
}
With either of these two changes, the JCK tests will run more stable on machines with a high load an don't produce any spurious test failures because of InterruptedExceptions any more. Notice that there's one last caveat: for some (to me yet unknown) reason, the JTHarness 3.2.2 executes the JCK tests in a folder in a different order than the harness that comes with the JCK test suit (and which is version 3.2_2). For example the tests in /api/java_lang/management/ThreadMXBean are executed in the following order (and succeed) with the original JCK harness: api/java_lang/management/ThreadMXBean/index_TrdMBean_MB.jtr api/java_lang/management/ThreadMXBean/index_TrdMBean.jtr api/java_lang/management/ThreadMXBean/index_setTrdMBean_MB.jtr api/java_lang/management/ThreadMXBean/index_setTrdMBean.jtr api/java_lang/management/ThreadMXBean/index_infoTrdMBean_MB.jtr api/java_lang/management/ThreadMXBean/index_infoTrdMBean.jtr With the JTHarness 3.2.2, there is one test - namely TrdMBean - that will fail because the ThreadMXBean tests are executed in a different order: api/java_lang/management/ThreadMXBean/index_infoTrdMBean.jtr api/java_lang/management/ThreadMXBean/index_infoTrdMBean_MB.jtr api/java_lang/management/ThreadMXBean/index_setTrdMBean.jtr api/java_lang/management/ThreadMXBean/index_setTrdMBean_MB.jtr api/java_lang/management/ThreadMXBean/index_TrdMBean.jtr api/java_lang/management/ThreadMXBean/index_TrdMBean_MB.jtr To avoid this problem, the test can be placed in the exclude file. This can of course only be done if the JCK run is not intended for certification! If anyone of the experts knows the reason why these tests get executed in a different order by the two versions of the test harness, please comment. Although I'm referring to version 3.2.2 of JTHarness in this blog, the latest version 4.1.1 suffers from the same problem. I've therefore opened a bug report for this issue ("Bug in test agent causes random test failures (with InterruptedException)") which got the internal review ID 1109902. If you're interested in resolving this problem, you can vote for the bug once it appears in the bug database. Update (2008-07-15): thanks to Brian Kurotsuchi this problem has been resolved in the latest version 4.1.4 of JTHarness (see jtharness issue 35). | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|