
Benchmarks and (less) surprises (?)

Posted by fabriziogiudici on April 26, 2008 at 5:46 PM PDT

As a follow-up to my previous post
(http://weblogs.java.net/blog/fabriziogiudici/), I've cleaned up my
benchmark code: the log files have been reduced from 40MB to 800kB,
some bugs have been fixed, and UUIDs are now used for generating
primary keys in the database, thus reducing the chance of contention
with multiple threads.
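The point of the UUID strategy is that keys are generated client-side, so worker threads never synchronize on a shared database sequence. As a minimal sketch (the class and method names are mine, not blueMarine's):

```java
import java.util.UUID;

public class UuidKeyDemo
{
    // Keys are generated in the client, so concurrent worker threads
    // never contend on a database sequence or auto-increment column.
    public static String newPrimaryKey()
    {
        return UUID.randomUUID().toString();
    }

    public static void main (final String[] args)
    {
        System.out.println(newPrimaryKey());
        System.out.println(newPrimaryKey());
    }
}
```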



At this point the impact of logging should be negligible, so the
comparison makes more sense. (If you haven't read my previous post:
the test reads metadata from about 150 photos and inserts it into a
Derby database, with both single- and multi-threaded approaches; the
test is run with different software combinations but always on the
same hardware, my first-generation MacBook Pro, 2GHz, 2GB RAM. Figures
are seconds per photo: the lower, the better.)
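The multi-threaded setup can be sketched roughly as follows; this is an illustrative harness of my own devising, not the actual blueMarine test code (which does metadata parsing and Derby inserts in the per-photo task):

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class HarnessSketch
{
    // Runs one task per photo on a fixed-size pool and returns the
    // average seconds per photo, the figure reported in the table.
    public static double secondsPerPhoto (final List<String> photos,
                                          final int workers,
                                          final Runnable perPhotoTask)
        throws InterruptedException
    {
        final ExecutorService pool = Executors.newFixedThreadPool(workers);
        final long start = System.nanoTime();

        for (final String photo : photos)
          {
            pool.execute(perPhotoTask);
          }

        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);

        return (System.nanoTime() - start) / 1e9 / photos.size();
    }

    public static void main (final String[] args)
        throws InterruptedException
    {
        final List<String> photos = Arrays.asList("a.jpg", "b.jpg", "c.jpg");
        System.out.println(secondsPerPhoto(photos, 2, () -> {}));
    }
}
```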

Workers Linux Ubuntu 8.04               Windows XP                      Mac OS X 10.5
        Java 1.6.0_06   Java 1.5.0_15   Java 1.6.0_05   Java 1.5.0_15   SoyLatte 1.0.2  Java 1.5.0_13
-----------------------------------------------------------------------------------------------------
1       0.50/0.35       0.55/0.47       0.48/0.37       0.52/0.47       0.80/0.66       0.80/0.74
2       0.40/0.26       0.42/0.42       0.32/0.25       0.34/0.33       0.49/0.43       0.47/0.50
3       0.39/0.26       0.41/0.42       0.32/0.25       0.43/0.39       0.48/0.43       0.52/0.53
4       0.39/0.25       0.41/0.45       0.31/0.26       0.45/0.41       0.49/0.43       0.53/0.55

In this round of tests I introduced a double benchmark for every
platform, using both the "Client" JIT (java -client) and the "Server"
one (java -server); in each table cell the "Client" figure is on the
left and the "Server" figure on the right. Here are some points:

  1. I discovered a gap in my knowledge of a basic aspect of VM
    configuration. While I presumed that the "Client" JIT was always
    selected by default, I learned that this isn't necessarily true
    since Java 5. This document
    (http://java.sun.com/docs/hotspot/gc5.0/ergo5.html) clearly states
    that on machines classified as "server-class" (at least 2
    processors/cores and 2GB of RAM; my MacBook Pro falls within this
    class) the default JIT is the Server one.
  2. In spite of that, the Server JIT is selected by default only on
    Linux. Windows XP and Mac OS X still select the "Client" one. While
    we are somewhat used to Apple doing things differently, Windows is
    a bit of a surprise because it's another Sun-made VM.
  3. This explains a good deal of the differences seen in my previous
    round of benchmarks: Linux came out the winner because it was the
    only platform running with the Server JIT, which is clearly the
    better option here since the benchmark doesn't use any GUI, but
    performs repetitive tasks.
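You can check which JIT a given launcher picks by default by printing the java.vm.name system property, which on Sun HotSpot VMs of this vintage contains either "Client VM" or "Server VM" (other vendors may report different strings). A minimal check:

```java
public class VmInfo
{
    public static void main (final String[] args)
    {
        // e.g. "Java HotSpot(TM) Server VM" vs "... Client VM"
        System.out.println("java.vm.name: " + System.getProperty("java.vm.name"));
        System.out.println("os.name:      " + System.getProperty("os.name"));
        System.out.println("cores:        " + Runtime.getRuntime().availableProcessors());
        System.out.println("max heap MB:  " + Runtime.getRuntime().maxMemory() / (1024 * 1024));
    }
}
```

Running it once with java -client and once with java -server (and once with no flag) shows what the ergonomics actually selected on a given platform.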

So at least part of the surprise in the previous set of numbers has
gone. But several new questions arise:

  1. Comparing the Java 5 VMs homogeneously (Client JIT vs. Client JIT
    and Server JIT vs. Server JIT), Mac OS X still has the worst
    performance.
  2. Turning on the Server JIT on Mac OS X and the other operating
    systems does almost no good: small improvements with a single
    worker, and no improvements at all with multiple workers (indeed,
    in some cases there even seems to be a slight degradation in
    performance).
  3. With the verbose logging turned off, SoyLatte's Client JIT is no
    better than Apple's JVM. SoyLatte's Server JIT is marginally
    better, but still much slower than on Linux or Windows.

At first sight one could argue that the Server JIT brings no advantage
on any operating system, but SoyLatte's poor performance raises doubts
about Mac OS X itself. Is there something specific to Mac OS X that
makes things go worse (again, maybe file system performance)? Or,
given that SoyLatte is still young, have a lot of optimizations been
disabled?



The instructions to reproduce this are in my previous post
(http://weblogs.java.net/blog/fabriziogiudici/); this time you must
use -r 5368 when checking out the blueMarine code.



In any case, at least with Windows and Linux I'm getting close to my
performance goal, which is 0.20 seconds per photo. Indeed, considering
some known issues in my code, I should be able to do even better.

*** Edited to add:



To make things easier to read, I'm adding a table with normalized
values. The percentages indicate how much slower a test runs (e.g.
100% means that it runs at half speed). The best performers are marked
in green, the worst in red; a specific comparison among the Java 5 VMs
only is also given. In each column pair, the "Client" JIT figure is on
the left and the "Server" JIT figure on the right.
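To make the normalization explicit: the percentage is the slowdown relative to the best figure, (t / best - 1) x 100, so a test at 100% takes twice the time of the best performer. For instance:

```java
public class SlowdownDemo
{
    // Slowdown relative to the fastest time: 0% for the best performer,
    // 100% for a test running at half the best speed.
    public static double slowdownPercent (final double seconds, final double best)
    {
        return (seconds / best - 1.0) * 100.0;
    }

    public static void main (final String[] args)
    {
        // 0.50 s/photo against a best of 0.25 s/photo: twice as slow.
        System.out.println(slowdownPercent(0.50, 0.25)); // prints 100.0
    }
}
```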



[Normalized comparison table:
http://bluemarine.dev.java.net/nonav/Blog/20080427/Comparisons.jpg]




Comments

I didn't test on Linux, but if your setup was a virtualized Ubuntu inside Mac OS X, I wouldn't be surprised if the JVM were fooled by the virtualization and couldn't distinguish a multi-core chip from an SMP one... ???

On the ergonomics (server-class detection): this link provides updated information for Sun JDK 6, but the basic rule remains the same (2 CPUs and 2GB of RAM). Notice though that it's "two CPUs", and I wouldn't expect multi-core CPUs to be counted just like SMP systems. That would suck, because any recent consumer PC (even a laptop) is multi-core, and 2GB of RAM is also standard today (those using the latest OSes from Microsoft or Apple certainly need it...).

I just checked this on two machines, one desktop and one laptop, both with Core 2 Duo CPUs and 2GB of RAM. (Both have dedicated graphics cards, so there's no RAM missing to integrated graphics.) I tested JDK 1.5.0_15 and 1.6.0_10-beta-b21, and both select HotSpot Client by default. So, if your MacBook is just dual-core (not SMP) but the JVM is picking Server by default, perhaps Apple's MRJ is less smart in its ergonomics... and since the MRJ is provided by Apple and not Sun, you can't put too much trust in documentation from the Sun JDK. Ergonomics is one of those things that are completely implementation-specific, so if something works exactly the same on JVMs from other providers (even those who license Sun's sources and build on top of them, like IBM, BEA and Apple) you're just lucky. For the IBM JDK there is a Diagnostics Guide that details everything about JVM configuration, tuning and troubleshooting; I guess similar docs exist for the other providers.

Absolutely no virtualization: I never run performance tests under virtualization. I rebooted natively into each of the three operating systems to run the test.

This is weird :-) The point is that it's not Apple's JDK that selects -server; indeed, it doesn't. It's Sun's JDK on my Ubuntu platform that selects -server! Did you run your test on Windows or Linux?