Benchmarks and surprises
Posted by fabriziogiudici on April 14, 2008 at 5:45 PM EDT
I'm posting some results from my latest benchmarks on the Metadata facility of blueMarine. Take them with half a pinch of salt: I've checked them but not double-checked, since I have little time these days. They gave me a few small surprises.
In short, the test takes about 160 photos and imports all their metadata into a Derby database (details for reproducing it are below). In total, 10000+ records are imported. The tests run with a classic Master / Worker pool with 1, 2, 3 and 4 workers on a MacBook Pro (2GHz, 2GB RAM) in all combinations of (Java 5, Java 6) x (Mac OS X 10.5, Linux Ubuntu 8.04, Windows XP). Note that I've used Soylatte for Java 6 on Mac OS X, since I don't have a 64-bit computer and can't run the Java 6 previews from Apple (thanks, Apple). The benchmark figure is the number of seconds needed to import a photo, so the lower the number, the better.
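A minimal sketch of how a run like this could be timed, assuming a fixed-size thread pool plays the role of the workers. The `ImportBenchmark` class and the `PhotoImporter` interface are hypothetical stand-ins, not blueMarine code, and the sketch targets a current JDK (lambdas) rather than the Java 5 / Java 6 runtimes measured here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical harness: times a batch of imports on a fixed pool of workers. */
class ImportBenchmark {

    /** Stand-in for the real metadata import, which is not shown in this post. */
    interface PhotoImporter {
        void importPhoto(String photoPath) throws Exception;
    }

    /** Runs every import on `workers` threads and returns wall-clock seconds per photo. */
    static double secondsPerPhoto(List<String> photos, int workers, PhotoImporter importer)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            long start = System.nanoTime();
            List<Future<?>> pending = new ArrayList<>();
            for (String photo : photos) {
                // The "master" hands one photo at a time to the pool.
                Callable<Void> task = () -> {
                    importer.importPhoto(photo);
                    return null;
                };
                pending.add(pool.submit(task));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion; rethrows a worker failure
            }
            long elapsedNanos = System.nanoTime() - start;
            return elapsedNanos / 1e9 / photos.size();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> photos = new ArrayList<>();
        for (int i = 0; i < 16; i++) {
            photos.add("photo-" + i + ".jpg"); // made-up names
        }
        for (int workers = 1; workers <= 4; workers++) {
            // Thread.sleep simulates the import; real work would hit disk and database.
            double s = secondsPerPhoto(photos, workers, path -> Thread.sleep(10));
            System.out.printf("%d worker(s): %.3f seconds per photo%n", workers, s);
        }
    }
}
```

Dividing total wall-clock time by the number of photos gives a figure comparable across worker counts, which is how the "seconds per photo" number is read in this sketch.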
| Workers | Linux, Java 6 | Linux, Java 5 | Windows XP, Java 6 | Windows XP, Java 5 | Mac OS X 10.5, Java 6 (*) | Mac OS X 10.5, Java 5 |
|---|---|---|---|---|---|---|
| 1 | 0.48 | 0.73 | 0.60 | 0.70 | 0.83 | 1.00 |
| 2 | 0.37 | 0.58 | 0.39 | 0.46 | 0.56 | 0.63 |
| 3 | 0.38 - 1.13 | 0.58 | 0.39 | 0.52 | 0.60 | 0.68 - 1.04 |
| 4 | 0.36 - 0.76 | 0.60 | 0.48 - 0.78 | 0.56 | 0.61 | 1.06 |
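To make the Java 5 versus Java 6 gap concrete, the ratio of the two timings can be computed per platform. The sketch below reads the first two data rows of the table as the 1-worker and 2-worker runs (an assumption based on the test description) and skips the rows that contain ranges. The `SpeedupFromTable` class is purely illustrative.

```java
/** Illustrative arithmetic on the table above: Java 5 time divided by Java 6 time. */
class SpeedupFromTable {

    /** A value above 1.0 means Java 6 needed less time per photo than Java 5. */
    static double speedup(double java5Seconds, double java6Seconds) {
        return java5Seconds / java6Seconds;
    }

    public static void main(String[] args) {
        String[] os = {"Linux", "Windows XP", "Mac OS X 10.5"};
        // Columns: first and second data row of the table (assumed 1 and 2 workers).
        double[][] java6 = {{0.48, 0.37}, {0.60, 0.39}, {0.83, 0.56}};
        double[][] java5 = {{0.73, 0.58}, {0.70, 0.46}, {1.00, 0.63}};
        for (int i = 0; i < os.length; i++) {
            for (int row = 0; row < 2; row++) {
                System.out.printf("%s, row %d: Java 5 / Java 6 = %.2f%n",
                        os[i], row + 1, speedup(java5[i][row], java6[i][row]));
            }
        }
    }
}
```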
Before commenting, keep in mind the following points:
- The tests still run with logs at maximum power (each run produces 40MB of files), so the computation is more disk intensive than it should be
- There are some optimizations I still have to make; for instance, photos are read multiple times
- Tests with multiple workers have a bug, so about 1% of the data doesn't get imported (there must be a failed transaction somewhere that doesn't get properly logged), but this low number of errors can't change the numbers dramatically.
- As expected, there's no advantage in having more workers than available processors (2). In this case, tests have a huge variance, since sometimes they trigger a lot of contention in the database (with a lock timing out after one second and causing the transaction to be re-run).
- Java 6 is really faster than Java 5 and for free.
- Mac OS X is clearly the loser here, even in comparison with Windows XP, especially with a single processor. Even Java 6 is not exceptional there, but I presume the problem is related to the low performance of the file system, which I already measured some time ago.
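The lock timeouts and re-run transactions mentioned in the list can be sketched as a small retry helper. This is a hedged illustration, not blueMarine's actual code: the `RetryOnLockTimeout` class and the `Transaction` interface are hypothetical, and the SQLState values are the ones Derby documents, as far as I recall, for a lock wait timeout (`40XL1`) and a deadlock (`40001`). Logging every failed attempt and rethrowing the last error keeps a failed transaction from disappearing silently.

```java
import java.sql.SQLException;

/** Hypothetical helper: re-runs a unit of work when the database reports lock contention. */
class RetryOnLockTimeout {

    /** One transactional unit of work; assumed to roll back by itself on failure. */
    interface Transaction<T> {
        T run() throws SQLException;
    }

    /** Assumed Derby SQLStates: 40XL1 (lock wait timeout) and 40001 (deadlock). */
    static boolean isLockContention(SQLException e) {
        String state = e.getSQLState();
        return "40XL1".equals(state) || "40001".equals(state);
    }

    /** Runs tx up to maxAttempts times; any other SQL error is rethrown immediately. */
    static <T> T runWithRetry(Transaction<T> tx, int maxAttempts) throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return tx.run();
            } catch (SQLException e) {
                if (!isLockContention(e)) {
                    throw e;
                }
                last = e;
                // Log every retry so lost work shows up in the logs.
                System.err.println("Attempt " + attempt + " hit lock contention: " + e.getMessage());
            }
        }
        throw last; // all attempts failed: surface the error instead of dropping the data
    }
}
```

A caller would wrap each import in `runWithRetry(() -> importOnePhoto(connection, photo), 3)`, where `importOnePhoto` is again a made-up name.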
To reproduce:
- Check out sources with svn co https://bluemarine.dev.java.net/svn/bluemarine/trunk/src/blueMarine-core -r 5237
- Check out test files with svn co https://imaging.dev.java.net/svn/imaging/trunk/www/TestSets -r 49
- Open the project blueMarine-core with NetBeans 6.0.1 or NetBeans 6.1
- Quit NetBeans just after opening the project
- Run from the command line ant generate-platform nbms
- Edit Metadata/MetadataOperations/nbproject/private/private.properties and add the line test-unit-sys-prop.testset.folder=path-to-TestSets
- From the command line, go to Metadata/MetadataOperations
- Run ant test |& grep seconds
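The `test-unit-sys-prop.` prefix in the steps above is, to my understanding of the NetBeans module build harness, how a value is handed to the unit tests as a Java system property (here `testset.folder`). A minimal sketch of how test code might pick it up follows. The `TestSetLocator` class is hypothetical and is not taken from the blueMarine sources.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

/** Hypothetical helper: resolves the test set folder passed in as a system property. */
class TestSetLocator {

    /** Reads the testset.folder system property and checks that it names a directory. */
    static Path testSetFolder() {
        String value = System.getProperty("testset.folder");
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException("System property testset.folder is not set");
        }
        Path folder = Paths.get(value);
        if (!Files.isDirectory(folder)) {
            throw new IllegalStateException("Not a directory: " + folder);
        }
        return folder;
    }

    /** Counts regular files under the folder, e.g. to check how many photos a run will see. */
    static long countFiles(Path folder) throws IOException {
        try (Stream<Path> paths = Files.walk(folder)) {
            return paths.filter(Files::isRegularFile).count();
        }
    }
}
```

Failing fast when the property is missing makes a misconfigured private.properties show up immediately rather than as an empty benchmark run.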
Comments
by fabriziogiudici - 2008-04-26 18:47
I've posted a follow-up here: http://weblogs.java.net/blog/fabriziogiudici/archive/2008/04/benchmarks_...
by fabriziogiudici - 2008-04-17 07:16
Hi Michael :-) Yes, I know, and that's why I'm quite optimistic about the final figure. But what I found amazing is that the logging code is of course the same, and yet the performance is so different across the three operating systems on the same hardware. In other words, I'm surprised by the huge difference in numbers rather than by the absolute numbers. In any case, over the weekend I'll re-run it with no logs.
by mbien - 2008-04-17 07:00
>The tests still run with logs at maximum power (each run produces 40MB of files), so the computation is more disk intensive than it should
40MB is a lot of text ;) Performance tests with logging enabled often give different results compared to execution under real conditions. Don't underestimate the impact of the synchronization costs (each log() or System.out.println() is synchronized). Printing to the console is also system dependent (and can be slow).
by predo - 2008-04-16 14:15
Both ext3 and hfs can do journaling, and both Linux and OS X seem to be configured by default to use it. http://en.wikipedia.org/wiki/Ext3 and http://en.wikipedia.org/wiki/HFS_Plus are some starting pointers. However, the level of journaling seems to vary, making the comparison difficult. As to the approach for tuning performance, I would consider both theoretical issues of design and experimentation. The information on the web is vague and at times misleading. But it should be possible to build some tuning skills in house... having enough time and configurations available :-)
by fabriziogiudici - 2008-04-15 13:21
Thanks for the feedback. Indeed, what is frustrating with this kind of problem is that, unless you work for a corporation with a large laboratory, this stuff is pretty difficult to track down. I mean, I'm the first to be aware that running a few tests and comparing a few numbers is not enough unless you understand where those numbers come from and, above all, you have a larger basis for experiments. Unfortunately, as a freelancer, I have only a handful of computers to try with.
Coming back to your question, I've just tried Hardy Heron with the default EXT3 installation. I don't have the configuration parameters at hand, since I have to reboot to access them; I'll try to post them later.
by predo - 2008-04-15 13:09
Hi Fabrizio,
My experience with Java 6 vs Java 5 regarding performance is the same. The performance improvement on Linux is definitely a winner: plain code runs faster and, if we take GUI elements into account (fonts, trees, tables and textareas), it's also smoother and more accurate.
Regarding the difference between OS's, I agree the file system requires some investigation. My experience with EXT3 on Linux has been terrible with DB applications. There's a high-priority writer process that just makes the system hang for seconds. Like Java's GC, but much worse :-)
"Downgrading" to EXT2 and letting the DB take care of journaling was the solution for me.
Having said this, you are in the opposite situation: Linux performs better than the other OS's. What file system do you use? Block-size? Anyway, interesting...
I can only reference an old benchmark of Linux vs BSD-derived OS's on basic memory, IPC and socket operations:
http://bulk.fefe.de/scalability/
Linux always had performance as a priority. BSD... it depends. OS X... I don't know.