The Source for Java Technology Collaboration



Fabrizio Giudici's Blog

April 2008 Archives


Apple's Java 6 on Mac OS X available

Posted by fabriziogiudici on April 29, 2008 at 11:27 PM | Permalink | Comments (12)

Now the scoop is not that we had to wait 1.5 years for it to become available, but that it only supports 64-bit Intel processors. No support for 32-bit, no support for PPC. Yeah, PPC is dead, but how many existing installations are there with PPC and 32-bit Intel? And how long will you have to wait before there's a decent percentage of 64-bit installations, so that you can make it a requirement for your app? Not to mention that if you bought a Mac more than 2 years ago, like me, you need to buy some new gear just to start developing with it. Of course this is just whining. As a lot of people will hurry to say today, life is good, Apple is nice, and there's absolutely no need to worry.

Benchmarks and (less) surprises (?)

Posted by fabriziogiudici on April 26, 2008 at 05:46 PM | Permalink | Comments (4)

As a follow-up to my previous post, I've cleaned up my benchmark code (the log files have been reduced in size from 40MB to 800kB, some bugs have been fixed and the UUID strategy has been chosen for generating primary keys in the database, thus reducing the chances of contention between multiple threads).
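
As a rough illustration of the UUID strategy, here is a minimal sketch; it is not taken from the blueMarine sources. The names `PhotoKeys` and `newPrimaryKey` are hypothetical, and only `java.util.UUID` is a real API. The point is that each worker can generate its key locally, without going through a shared counter or sequence.

```java
import java.util.UUID;

// Hypothetical helper, for illustration only.
class PhotoKeys {
    // Returns a random (version 4) UUID rendered as a 36-character string.
    static String newPrimaryKey() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        // Two calls give two different keys with no coordination between callers.
        System.out.println(newPrimaryKey());
        System.out.println(newPrimaryKey());
    }
}
```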

At this point the impact of logging should be negligible, so the comparison makes more sense (if you haven't read my previous post, the test reads metadata from 150+ photos and inserts them into a Derby database, with both single- and multi-threaded approaches; the test is run with different software combinations but always on the same hardware, my first-generation MacBook Pro, 2GHz, 2GB RAM; numbers are seconds per photo, the lower the better).
Workers Linux Ubuntu 8.04               Windows XP                      Mac OS X 10.5
        Java 1.6.0_06   Java 1.5.0_15   Java 1.6.0_05   Java 1.5.0_15   SoyLatte 1.0.2  Java 1.5.0_13
-----------------------------------------------------------------------------------------------------
1       0.50/0.35       0.55/0.47       0.48/0.37       0.52/0.47       0.80/0.66       0.80/0.74
2       0.40/0.26       0.42/0.42       0.32/0.25       0.34/0.33       0.49/0.43       0.47/0.50
3       0.39/0.26       0.41/0.42       0.32/0.25       0.43/0.39       0.48/0.43       0.52/0.53
4       0.39/0.25       0.41/0.45       0.31/0.26       0.45/0.41       0.49/0.43       0.53/0.55
In this round of tests I introduced double benchmarking for every platform, using both the "Client" JIT (java -client) and the "Server" one (java -server); each cell in the table above reads Client/Server. Here are some points:
  1. I discovered my ignorance of a basic aspect of VM configuration. While I presumed that the "Client" JIT was always selected by default, I learned that this hasn't necessarily been true since Java 5. This document clearly states that when running on machines defined as "server-class" (at least 2 processors/cores and 2GB of RAM, and my MacBook Pro falls within this class) the default JIT is the Server one.
  2. In spite of that, the Server JIT is selected by default only on Linux. Windows XP and Mac OS X still select the "Client" one. While we are somewhat used to Apple's policy of doing things differently, Windows is a bit of a surprise because it's another Sun-made VM.
  3. This explains a great deal of the differences seen in my previous round of benchmarks: Linux came out the winner since it was the only one running with the Server JIT, which is clearly the best option since the benchmark does not use any GUI, but performs repetitive tasks.
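
To check which VM a run is actually using, a small sketch like the following can help. It only reads standard system properties; the class name `VmCheck` and the helper `looksServerClass` are hypothetical, and the helper merely restates the "server-class" rule as described in the points above (the real defaults differ per platform, as noted).

```java
// Minimal sketch; VmCheck and looksServerClass are hypothetical names.
class VmCheck {
    // The rule as described above: at least 2 processors and 2GB of RAM.
    // This restates the post's description, not the VM's actual logic.
    static boolean looksServerClass(int processors, long physicalMemoryBytes) {
        long twoGiB = 2L * 1024 * 1024 * 1024;
        return processors >= 2 && physicalMemoryBytes >= twoGiB;
    }

    public static void main(String[] args) {
        // On HotSpot-based VMs the name usually mentions "Client" or "Server".
        System.out.println("VM name:    " + System.getProperty("java.vm.name"));
        System.out.println("VM version: " + System.getProperty("java.vm.version"));
        System.out.println("Processors: " + Runtime.getRuntime().availableProcessors());
    }
}
```

Running it once with `java -client VmCheck` and once with `java -server VmCheck` shows whether the flag is honored on a given platform.
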
So at least part of the surprise in the previous set of numbers has gone. But several new questions arise:
  1. Comparing Java 5 VMs homogeneously (Client JIT vs Client JIT and Server JIT vs Server JIT), Mac OS X still has the worst performance.
  2. Turning on the Server JIT on Mac OS X and other operating systems does almost no good: small improvements with a single processor, and no improvements at all with multiple processors (indeed, it also looks like there's a slight degradation of performance in some cases).
  3. With verbose logging turned off, the SoyLatte Client JIT is no better than Apple's JVM. The Server JIT for SoyLatte is marginally better, but still much slower than Linux or Windows.
At first sight, one could argue that the Server JIT does not bring advantages on any operating system, but the bad performance of SoyLatte raises doubts about Mac OS X. Is there anything specific in Mac OS X that makes things worse (again, maybe the file system performance)? Or, given that SoyLatte is still young, have a lot of optimizations been disabled?

The instructions to reproduce this are in my previous post; this time you must use -r 5368 when checking out the blueMarine code.

In any case, at least with Windows and Linux I'm reaching my performance goal, which is 0.20 seconds per photo. Indeed, considering some known issues in my code, I should be able to do better than that.
*** Edited to add:

To make things easier to read, I'm adding a table with normalized values. The percentages indicate how much slower a test runs (e.g. 100% means that it runs at half speed). Best performers are marked in green, worst performers in red; a specific comparison among Java 5 only VMs is also given. In each column pair, the "Client" JIT figure is on the left, the "Server" JIT figure on the right.
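
The exact normalization formula is not spelled out here, so the following is only a sketch of one formula consistent with "100% means half speed": the extra time relative to a baseline. Taking the fastest figure (0.25 seconds per photo) as the baseline is an assumption, and the names `Slowdown` and `percentSlower` are hypothetical.

```java
// Minimal sketch of a "percent slower" normalization; names are hypothetical.
class Slowdown {
    // Extra time relative to a baseline, as a percentage:
    // 0% = same speed as the baseline, 100% = twice the time (half speed).
    static double percentSlower(double secondsPerPhoto, double baselineSeconds) {
        return (secondsPerPhoto / baselineSeconds - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        double baseline = 0.25; // assumed baseline: the fastest figure measured
        double[] samples = {0.25, 0.50, 0.80};
        for (double s : samples) {
            System.out.printf("%.2f s/photo -> %.0f%% slower%n", s, percentSlower(s, baseline));
        }
    }
}
```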




Benchmarks and surprises

Posted by fabriziogiudici on April 14, 2008 at 02:45 PM | Permalink | Comments (6)

I'm posting some results from my latest benchmarks on the Metadata facility of blueMarine - take them with a pinch of salt, since I've checked them but not double-checked; sorry, but I have little time these days. They surprised me a little.

In short, the test takes about 160 photos and imports all the metadata into a Derby database (details for reproducing it below). In total, 10000+ records are imported. The tests run with a classic Master / Worker pool with 1, 2, 3 and 4 workers on a MacBook Pro 2GHz, 2GB RAM in all the combinations of (Java 5, Java 6) x (Mac OS X 10.5, Linux Ubuntu 8.04, Windows XP) - note that I've used SoyLatte for Java 6 on Mac OS X, since I don't have a 64-bit computer and can't run the Java 6 previews from Apple (thanks, Apple). The benchmark figure is the number of seconds needed to import a photo, so the lower the number, the better.

Workers Linux                       Windows XP                  Mac OS X 10.5
        Java 6        Java 5        Java 6        Java 5        Java 6 (*)    Java 5
------------------------------------------------------------------------------------------
1       0.48          0.73          0.60          0.70          0.83          1.00
2       0.37          0.58          0.39          0.46          0.56          0.63
3       0.38 - 1.13   0.58          0.39          0.52          0.60          0.68 - 1.04
4       0.36 - 0.76   0.60          0.48 - 0.78   0.56          0.61          1.06
(*) SoyLatte 1.0.2
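
For readers who want to picture the measurement, here is a minimal sketch of a worker pool that reports seconds per photo. It is not the blueMarine test code: `WorkerPoolBenchmark`, `secondsPerPhoto` and the sleeping placeholder workload are all hypothetical, standing in for "read the metadata and insert it into Derby".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Minimal sketch; all names are hypothetical.
class WorkerPoolBenchmark {
    // Runs one task per photo on a fixed pool of workers and returns the
    // average wall-clock seconds per photo.
    static double secondsPerPhoto(int workers, int photoCount, Runnable importOnePhoto)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (int i = 0; i < photoCount; i++) {
                tasks.add(() -> { importOnePhoto.run(); return null; });
            }
            long start = System.nanoTime();
            // invokeAll blocks until every task has completed.
            for (Future<Void> f : pool.invokeAll(tasks)) {
                f.get(); // rethrows any failure raised inside a worker
            }
            long elapsed = System.nanoTime() - start;
            return elapsed / 1e9 / photoCount;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder workload: sleeps instead of touching files or a database.
        Runnable fakeImport = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        for (int workers = 1; workers <= 4; workers++) {
            System.out.printf("%d workers: %.4f s/photo%n",
                    workers, secondsPerPhoto(workers, 40, fakeImport));
        }
    }
}
```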

Before commenting, keep in mind the following points:
  1. The tests still run with logs at maximum power (each run produces 40MB of files), so the computation is more disk-intensive than it should be.
  2. There are some optimizations I still have to do; for instance, photos are read multiple times.
  3. Tests with multiple workers have a bug, so about 1% of the data doesn't get imported (there must be a failed transaction somewhere and it doesn't get properly logged), but this low number of errors can't change the numbers dramatically.
So I expect to have even lower numbers after some fixes, but the comparison is fair across all the environments. Here are my points:
  1. As expected, there's no advantage in having more workers than available processors (2). In this case, tests have a huge variance, since sometimes they trigger a lot of contention in the database (with a lock timing out after one second and causing the transaction to be re-run).
  2. Java 6 is really faster than Java 5, and for free.
  3. Mac OS X is clearly the loser here, even in comparison with Windows XP, especially with a single processor. Even Java 6 is not exceptional, but I presume the problem is related to the low performance of the file system, which I measured some time ago.
Now I've got a simple question. Unfortunately I don't have a quad-core, so I can't test: is Derby expected to scale well with four workers? Or should I limit it to 2? When contention on the database doesn't occur, it seems that even with only 2 processors a larger number of workers doesn't introduce significant degradation, which is promising; on the other hand, when contention occurs things get much worse. And the more workers, the more chances of contention.
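
The "lock times out and the transaction is re-run" behavior mentioned in point 1 can be sketched as a small retry loop. This is not the blueMarine code: `TransactionRetry`, `SqlWork` and `runWithRetry` are hypothetical names, and treating every SQLState in class "40" (the standard "transaction rollback" class) as retryable is an assumption of this sketch. The SQLState values in the demo are purely illustrative.

```java
import java.sql.SQLException;

// Minimal sketch; all names are hypothetical.
class TransactionRetry {
    // One unit of transactional work.
    interface SqlWork<T> {
        T run() throws SQLException;
    }

    // Re-runs the work while it fails with a "transaction rollback" SQLState
    // (class "40"); any other failure is rethrown immediately.
    static <T> T runWithRetry(SqlWork<T> work, int maxAttempts) throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.run();
            } catch (SQLException e) {
                String state = e.getSQLState();
                if (state == null || !state.startsWith("40")) {
                    throw e; // not a contention problem: give up immediately
                }
                last = e; // contention: try again
            }
        }
        throw last; // every attempt failed with contention
    }

    public static void main(String[] args) throws SQLException {
        int[] calls = {0};
        // Simulated work: fails twice with an illustrative class-40 state, then succeeds.
        String result = runWithRetry(() -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new SQLException("simulated lock timeout", "40XL1");
            }
            return "imported";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Each retry costs at least the lock timeout, which is one way to read the wide ranges in the rows with 3 and 4 workers.
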

To reproduce:
  1. Check out sources with svn co https://bluemarine.dev.java.net/svn/bluemarine/trunk/src/blueMarine-core -r 5237
  2. Check out test files with svn co https://imaging.dev.java.net/svn/imaging/trunk/www/TestSets -r 49
  3. Open the project blueMarine-core with NetBeans 6.0.1 or NetBeans 6.1
  4. Quit NetBeans just after opening the project
  5. Run from the command line ant generate-platform nbms
  6. Edit Metadata/MetadataOperations/nbproject/private/private.properties and add the line test-unit-sys-prop.testset.folder=path-to-TestSets
  7. From the command line, go to Metadata/MetadataOperations
  8. Run ant test |& grep seconds



Historic series of profiling data

Posted by fabriziogiudici on April 06, 2008 at 06:30 AM | Permalink | Comments (2)

After fighting with some showstoppers in the NetBeans Profiler (involving RCP projects) and finding a decent workaround, I've started tuning the Metadata facility for blueMarine. I've already done tuning in the past, of course, but I have always been frustrated by how easily I lost track of what I had done. Here's an example: a few days ago I ran the profiler and got some figures; in particular, two methods were the hot spots of the test and I realized that was not good. Working on those methods, I was able to push them down the list. To help with the job, I exported some dumps from the NetBeans Profiler, manually spotted the most important numbers and manually tracked them build after build.



Now the hot spots distribution is better, but there's still work to do and I won't be able to work on it for a few days. When I add some further optimizations, I will get a new list of hot spots that will look better. And so on until I feel satisfied with this iteration. Then I'll work for several weeks on new features and bug fixing, leaving tuning alone. I'll probably run another tuning session in a couple of months, etc. By that time, design changes may have at least partially invalidated the optimizations done in the current tuning session. Looking back at the code I wrote years ago (and some code I wrote for customers) I can see a lot of maybe-smart optimizations that could now be pretty useless, since the context where they were good has changed (in customers' code I often see this made even worse by premature optimization, which greatly increases the chances that the selected optimizations are inappropriate).

Of course, you can deal with it: just run another profiling session, get the new list of hot spots, etc. The point is that since I run profiling sessions only every once in a while, I'm likely to have forgotten a bit about the context. I can fix this by writing comments and entries in the issue tracker, but this is time-consuming. My point is that it would be nice to put some automation in the CI facility. For instance, I could run a test in profiling mode, dump the hot spots data and plot a graph showing the values (percentage and time ticks) for the five methods at the top of the list, plus a sixth record with the data for all the rest. This would save me a lot of time and - in the spirit of CI - it would be done incrementally for each build, allowing you to relate repository modifications to sudden changes in performance (e.g. you see that a method has suddenly moved up to the top of the hot spots at a certain build, and you can easily track down the commit and find out which changes are responsible for that).
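
The "top five plus a sixth record for the rest" reduction could look roughly like this. It is only a sketch under stated assumptions: the input is taken to be a map from method name to self time in milliseconds (reading a real profiler dump is not covered), and `HotSpotSummary`, `summarize` and the sample method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch; all names and the sample data are hypothetical.
class HotSpotSummary {
    // Keeps the `top` most expensive methods, in descending order, and folds
    // everything else into a single "(all others)" entry.
    static Map<String, Long> summarize(Map<String, Long> selfTimeMs, int top) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(selfTimeMs.entrySet());
        entries.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        Map<String, Long> result = new LinkedHashMap<>();
        long rest = 0;
        for (int i = 0; i < entries.size(); i++) {
            Map.Entry<String, Long> e = entries.get(i);
            if (i < top) {
                result.put(e.getKey(), e.getValue());
            } else {
                rest += e.getValue();
            }
        }
        result.put("(all others)", rest);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> sample = new HashMap<>();
        sample.put("readMetadata", 700L);
        sample.put("persist", 600L);
        sample.put("parseExif", 500L);
        sample.put("computeChecksum", 400L);
        sample.put("log", 300L);
        sample.put("openFile", 200L);
        sample.put("closeFile", 100L);
        // One line per record; a CI job could append these to a per-build data file.
        summarize(sample, 5).forEach((method, ms) -> System.out.println(method + "," + ms));
    }
}
```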

What do you think? I don't think there's a plugin for Hudson already available, but I could make one - I think there's already a plugin for plotting graphs if you produce a file with a given format, the only thing I need is a (FLOSS) library that allows me to read a profiler dump.



The troubles with dinosaur and closed technologies...

Posted by fabriziogiudici on April 03, 2008 at 06:14 AM | Permalink | Comments (2)

Adobe has just announced that the next Photoshop will not be able to support 64 bits on Mac OS X systems. The reason is that Apple announced last June that it won't support 64 bits for Carbon, so Adobe has to rewrite Photoshop on Cocoa, and this would affect about one million (!) LOCs.

While in the past Java has had big troubles supporting the desktop, and a few issues still remain nowadays, it is clear how big its advantage is in such scenarios: with Java you can get 64-bit support almost for free (supposing you have good 64-bit VM support, which is the case on Windows and Linux).

It's also appalling to me to learn that strategic software manufacturers such as Adobe (I don't think Apple would have been so successful in some market segments that were strategic in the past without the contributions of superb products such as the Adobe Suite) don't get early warnings from Apple about its closed technologies such as Carbon and Cocoa (Adobe's statement refers to WWDC news). This makes me understand even better how important it is to work with open technologies.



blueMarine goes semantic

Posted by fabriziogiudici on April 01, 2008 at 06:13 AM | Permalink | Comments (0)

Today Jazoon '08 has published the final program and I don't see my proposal there, so I've not been selected. Usual business in the conference world, sometimes you get accepted and sometimes not - it's only strange that I didn't receive any personal notification about that, it must have been lost somewhere.

In any case, the proposed paper was named "blueMarine goes semantic" and - you guessed it - it was about the integration of blueMarine with semantic technologies - note that I'm not strictly referring to the semantic web, but to semantic technologies in general. There is some preliminary code (not yet in the public source repository) and, given the Jazoon news, at this point I'm suspending the work until after JavaOne. So, this is just a teaser; expect me to blog about it after next May.

PS And no, this is not another April Fools' joke :-)



I must admit: Eclipse is the best tool in its class

Posted by fabriziogiudici on April 01, 2008 at 12:22 AM | Permalink | Comments (5)

Well, this is the sound opinion of a person who has been using it for years. There's no contest: Eclipse is just the best tool of its class. I've tried some alternatives, but nothing allowed me to operate as quickly and get such a clean result as Eclipse did. Just nothing. And if you're in pain, trying to deliver when operating in the field, maybe in a problematic environment, Eclipse gets you out of trouble in a few minutes.

Don't trust what competitors say; it's marketing hype. Eclipse is just the best tool of its class.

Long live Eclipse! :-)





