Benchmarks and surprises
Posted by fabriziogiudici on April 14, 2008 at 02:45 PM | Comments (6)
I'm posting some results from my latest benchmarks on the Metadata
facility of blueMarine. Take them with a pinch of salt: I've checked
them but not double-checked, as I have little time these days. They
held a few surprises for me.
In short, the test takes about 160 photos and imports all their metadata
into a Derby database (details for reproducing it are below). In
total, 10000+ records are imported. The tests run with a
classic Master / Worker pool with 1, 2, 3 and 4 workers on a MacBook
Pro (2GHz, 2GB RAM) in all the combinations of (Java 5, Java 6) x (Mac OS
X 10.5, Linux Ubuntu 8.04, Windows XP). Note that I've used SoyLatte
for Java 6 on Mac OS X, since I don't have a 64 bit computer and can't
run the Java 6 previews from Apple (thanks, Apple). The benchmark figure
is the number of seconds needed to import a photo, so the lower the
number, the better.
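The Master / Worker measurement described above can be sketched roughly as follows. This is a minimal illustration, not blueMarine's actual test code: the class name `ImportBenchmark`, the method `secondsPerPhoto` and the `importTask` parameter are hypothetical stand-ins, and the real import work (reading a photo and writing its metadata to Derby) is left to the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical harness: names are illustrative, not taken from blueMarine.
final class ImportBenchmark {
    /**
     * Runs importTask once per photo on a fixed pool of workers and
     * returns the average wall-clock seconds per photo.
     */
    static double secondsPerPhoto(int workers, int photoCount, Runnable importTask) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            long start = System.nanoTime();
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < photoCount; i++) {
                // The master hands out one job per photo.
                pending.add(pool.submit(importTask));
            }
            for (Future<?> job : pending) {
                try {
                    job.get(); // wait for the worker; surfaces any failure
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e.getCause());
                }
            }
            return (System.nanoTime() - start) / 1e9 / photoCount;
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling `ImportBenchmark.secondsPerPhoto(2, 160, task)` would correspond to the two-worker run; the figure is total elapsed time divided by the number of photos, so it includes any time workers spend waiting on each other.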
Seconds per photo, by number of workers:

|         | Linux                | Windows XP           | Mac OS X 10.5            |
| Workers | Java 6      | Java 5 | Java 6      | Java 5 | Java 6 (*) | Java 5      |
| 1       | 0.48        | 0.73   | 0.60        | 0.70   | 0.83       | 1.00        |
| 2       | 0.37        | 0.58   | 0.39        | 0.46   | 0.56       | 0.63        |
| 3       | 0.38 - 1.13 | 0.58   | 0.39        | 0.52   | 0.60       | 0.68 - 1.04 |
| 4       | 0.36 - 0.76 | 0.60   | 0.48 - 0.78 | 0.56   | 0.61       | 1.06        |
(*) SoyLatte 1.0.2
Before commenting, keep in mind the following points:
- The tests still run with logging at the maximum level (each run
produces 40MB of files), so the computation is more disk intensive than
it should be.
- I still have some optimizations to make; for instance,
photos are read multiple times.
- Tests with multiple workers have a bug, so about 1% of the data
doesn't get imported (there must be a failed transaction somewhere that
doesn't get properly logged), but such a low number of errors can't
change the numbers dramatically.
So I expect even lower numbers after some fixes, but the
comparison is fair across all the environments. Here are my points:
- As expected, there's no advantage in having more workers
than available processors (2). With more workers, the tests show a huge
variance, since sometimes they trigger a lot of contention in the
database (with a lock timing out after one second and causing the
transaction to be re-run).
- Java 6 is really faster than Java 5, and for free.
- Mac OS X is clearly the loser here, even in comparison with
Windows XP, especially with a single processor. Even Java 6 is not
exceptional there, but I presume the problem is related to the low
performance of the file system, which I already
measured some time ago.
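The "lock times out, transaction gets re-run" behaviour mentioned in the first point could look something like the sketch below. It is an assumption about the general pattern, not blueMarine's actual code: `RetryingTransaction`, `SqlWork` and `maxAttempts` are hypothetical names. The two SQLState values are the ones Derby reports for a lock wait timeout (40XL1) and a deadlock (40001).

```java
import java.sql.SQLException;

// Hypothetical helper: re-runs a unit of work when Derby reports lock contention.
final class RetryingTransaction {
    /** A unit of database work that may fail with an SQLException. */
    interface SqlWork<T> {
        T execute() throws SQLException;
    }

    /** True for Derby's lock wait timeout (40XL1) and deadlock (40001) states. */
    static boolean isLockContention(SQLException e) {
        String state = e.getSQLState();
        return "40XL1".equals(state) || "40001".equals(state);
    }

    /**
     * Runs the work, retrying on lock contention up to maxAttempts times.
     * Any other SQLException, or contention on the last attempt, is rethrown.
     */
    static <T> T run(SqlWork<T> work, int maxAttempts) throws SQLException {
        for (int attempt = 1; ; attempt++) {
            try {
                return work.execute();
            } catch (SQLException e) {
                if (!isLockContention(e) || attempt >= maxAttempts) {
                    throw e;
                }
                // Contention: the caller's work is expected to have rolled
                // back; loop around and run it again.
            }
        }
    }
}
```

Every retry costs at least the lock wait time, which would explain why a single contended run can inflate the per-photo figure so much with 3 or 4 workers.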
Now I've got a simple question. Unfortunately I don't have a quad-core,
so I can't test: is Derby expected to scale well with four
workers? Or should I cap the pool at 2? When contention on the database
doesn't occur, it seems that even with only 2 processors a larger number
of workers doesn't introduce significant degradation, which is promising;
on the other hand, when contention occurs things get much worse. And
the more workers, the higher the chance of contention.
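One conservative option, pending numbers from a quad-core, would be to cap the pool at the number of processors the JVM reports. The sketch below only illustrates that idea; `WorkerPoolSizing` and `workerCount` are hypothetical names, and it says nothing about how Derby itself scales.

```java
// Hypothetical helper: never use more workers than available processors.
final class WorkerPoolSizing {
    /** Returns the requested worker count, clamped to [1, available processors]. */
    static int workerCount(int requested) {
        int processors = Runtime.getRuntime().availableProcessors();
        return Math.max(1, Math.min(requested, processors));
    }
}
```

On the dual-core MacBook Pro used here, `workerCount(4)` would give 2, which matches the point where the measurements stop improving.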
To reproduce:
- Check out the sources with
  svn co https://bluemarine.dev.java.net/svn/bluemarine/trunk/src/blueMarine-core -r 5237
- Check out the test files with
  svn co https://imaging.dev.java.net/svn/imaging/trunk/www/TestSets -r 49
- Open the project blueMarine-core with NetBeans 6.0.1 or NetBeans 6.1
- Quit NetBeans just after opening the project
- From the command line, run
  ant generate-platform nbms
- Edit Metadata/MetadataOperations/nbproject/private/private.properties
  and add the line
  test-unit-sys-prop.testset.folder=path-to-TestSets
- From the command line, go to Metadata/MetadataOperations
- Run
  ant test |& grep seconds
Comments
-
Hi Fabrizio,
My experience with Java 6 vs Java 5 regarding performance is the same.
The performance improvement on Linux is definitely a winner: plain code runs faster and if we take GUI elements into account (fonts, trees, tables and textareas) it's also smoother and more accurate.
Regarding the difference between OS's, I agree the file system requires some investigation. My experience with EXT3 on Linux has been terrible with DB applications. There's a high-priority writer process that just makes the system hang for seconds. Like Java's GC, but much worse :-)
"Downgrading" to EXT2 and letting the DB take care of journaling was the solution for me.
Having said this, you are in the opposite situation: Linux performs better than the other OS's. What file system do you use? Block-size? Anyway, interesting...
I can only reference an old benchmark of Linux vs BSD-derived OS's on basic memory, IPC and socket operations:
http://bulk.fefe.de/scalability/
Linux always had performance as a priority. BSD... it depends. OS X... I don't know.
Posted by: predo on April 15, 2008 at 12:09 PM
-
Thanks for the feedback. Indeed, what is frustrating with this kind of problem is that, unless you work for a corporation with a large laboratory, this is stuff that is pretty difficult to track down. I mean, I'm the first to be aware that running a few tests and comparing a few numbers is not enough unless you understand where those numbers come from and, above all, you have a larger basis for experiments. Unfortunately, as a freelancer, I have only a handful of computers to try with.
Coming back to your question, I've just tried Hardy Heron with the default EXT3 installation. I don't have the configuration parameters at hand, since I have to reboot to access them; I'll try to post them later.
Posted by: fabriziogiudici on April 15, 2008 at 12:21 PM
-
Both ext3 and HFS+ can do journaling, and both Linux and OS X seem to be configured by default to use it. http://en.wikipedia.org/wiki/Ext3 and http://en.wikipedia.org/wiki/HFS_Plus are some starting pointers.
However, the level of journaling seems to vary, making the comparison difficult. As to the approach for tuning performance, I would consider both theoretical issues of design and experimentation. The information on the web is vague and at times misleading. But it should be possible to build some tuning skills in house... having enough time and configurations available :-)
Posted by: predo on April 16, 2008 at 01:15 PM
-
>The tests still run with logs at maximum power (each run produces 40MB of files), so the computation is more disk intensive than it should
40MB is a lot of text ;)
Performance tests with logging enabled often give different results compared to execution under real conditions. Don't underestimate the impact of the synchronization costs (each log() or System.out.println() is synchronized). Printing to the console is also system dependent (and can be slow).
Posted by: mbien on April 17, 2008 at 06:00 AM
-
Hi Michael :-) Yes, I know, and that's why I'm quite optimistic about the final figure. But what I found amazing is that the logging code is of course the same, and yet the performance is so different across the three operating systems on the same hardware. In other words, I'm surprised by the huge difference between the numbers rather than by the absolute numbers. In any case, over the weekend I'll re-run it with no logs.
Posted by: fabriziogiudici on April 17, 2008 at 06:16 AM
-
I've posted a follow-up here: http://weblogs.java.net/blog/fabriziogiudici/archive/2008/04/benchmarks_and_1.html
Posted by: fabriziogiudici on April 26, 2008 at 05:47 PM