Next Generation Performance Benchmark

Posted by mkarg on January 3, 2010 at 8:44 AM PST

Last Saturday I ran a few experimental benchmarks on the typical new-generation technology stack (or part of it). Specifically, I ran an iAnywhere 10.0.1 database and Sun Application Server 9 (aka "GlassFish" aka "Java EE 5 SDK") in a VMware Server 1.0.3 virtual machine on my private laptop (AMD Turion 64 X2, 2 GB RAM). The benchmark used a small test application that I wrote in less than one hour (after learning about Java EE 5) using EJB 3.0 and web services. There is one POJO (an EJB 3 entity) that uses AUTOINC in the database for automatic primary key generation; on top of that, a session bean implements a small web service that allows my benchmark client to create and read rows using SOAP. The code is much easier to read and write, and no more deployment descriptors (DDs) are needed thanks to annotations (which makes programming approximately two to three times faster, and removes a lot of potential bugs to fix later). Beyond that, the new stack also seems to be much, much faster than our current technology stack. I was quite impressed when I noticed how fast even my laptop is compared to the current solution:

Creation of one row: 27ms (including SOAP call, transaction start / stop, INSERT, SELECT @@identity).

One could say that this is slow, since in a real-world scenario we would like to have about five times better performance. But first of all, this is a slow laptop running a 4 GB VM on a 2 GB machine; second, we really are doing SOAP; and third, nobody would do a single transaction and server round trip for a single row (typically we will write several rows in one TX). So I think this is an impressive demonstration that SOAP is not as slow as I always thought and that it actually is usable for real-world applications. In fact, the main reason for the performance penalty was not java.exe or dbsrv10.exe (i.e. the application server and the database server) but the disk: about 50% of the time the two CPU cores just waited for the hard disk, judging by the constantly lit HDD LED. I expect that a RAID system will serve at least 10 times faster, so I expect a server to report a value of 3 to 5 ms per row (which we should try out some day on real server hardware, just to get excited). Another ID generation strategy might also be faster: AUTOINC needs one SELECT per row to fetch the generated ID, whereas an ID table is accessed only once per block of e.g. 50 inserts (adjustable in GlassFish), so the round trip should be about 30 to 50% faster.
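To illustrate why block-wise ID allocation saves database round trips, here is a minimal plain-Java sketch. It is only a simulation under stated assumptions: the class `BlockIdAllocator` and its `idTable` counter are hypothetical stand-ins for the ID table, not the code GlassFish or its persistence provider actually uses, and the block size of 50 simply mirrors the figure mentioned above.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: hands out IDs from a reserved block instead of asking the database per row. */
class BlockIdAllocator {
    private final int blockSize;                           // IDs reserved per simulated round trip
    private final AtomicLong idTable = new AtomicLong(0);  // stands in for the ID table row in the database
    private long next;                                     // next ID to hand out
    private long limit;                                    // first ID beyond the current block
    private int roundTrips;                                // how often the simulated database was asked

    BlockIdAllocator(int blockSize) {
        if (blockSize < 1) {
            throw new IllegalArgumentException("blockSize must be >= 1");
        }
        this.blockSize = blockSize;
    }

    synchronized long nextId() {
        if (next == limit) {
            // Current block is used up (or none reserved yet): reserve the next block in one round trip.
            next = idTable.getAndAdd(blockSize) + 1;
            limit = next + blockSize;
            roundTrips++;
        }
        return next++;
    }

    synchronized int roundTrips() {
        return roundTrips;
    }

    public static void main(String[] args) {
        BlockIdAllocator perRow = new BlockIdAllocator(1);   // behaves like one ID fetch per insert
        BlockIdAllocator blocks = new BlockIdAllocator(50);  // one fetch per 50 inserts
        for (int i = 0; i < 1000; i++) {
            perRow.nextId();
            blocks.nextId();
        }
        System.out.println("round trips with block size 1:  " + perRow.roundTrips());
        System.out.println("round trips with block size 50: " + blocks.roundTrips());
    }
}
```

For 1000 inserts the sketch counts 1000 ID round trips with a block size of 1 and 20 with a block size of 50. How much of that saving shows up in the end-to-end time depends on what else each insert costs, which is what the 30 to 50% estimate above tries to capture.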

Reading of 1000 rows: between 300 ms and 900 ms (including SOAP call, transaction start / stop, SELECT *, transfer of ALL columns to the client).

This was the test that really impressed me. With EJB 2.1 on JOnAS 4.8.4, loading 1000 rows down to the client was much slower. Between 300 and 900 ms for 1000 rows means that the performance, even with a verbose, complex-to-parse XML-based protocol like SOAP, is good enough to load all the rows of a table in a single call / single transaction (for the average amount of data to be shown in a GUI).

We also need to keep in mind that a real server will (a) not run on a laptop, (b) use a 64-bit database engine, (c) use a real operating system instead of XP Home Edition, (d) have lots of RAM for caching, and (e) make use of a multi-disk RAID 5 or RAID 10 system that serves data many times faster than my noise- and battery-optimized 3 inch laptop drive.
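For readers who want to repeat this kind of end-to-end measurement, the following is a minimal sketch of a timing harness in plain Java. It is not the original benchmark client: `CallTimer`, `timeMillis` and the `fakeRead` supplier are hypothetical names, and `fakeRead` merely builds 1000 strings locally where the real client would perform the SOAP call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch of a wall-clock timing harness for one end-to-end call. */
class CallTimer {

    /** Runs the call the given number of times and returns the elapsed time of each run in milliseconds. */
    static <T> double[] timeMillis(Supplier<T> call, int runs) {
        double[] elapsed = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            call.get();
            elapsed[i] = (System.nanoTime() - start) / 1_000_000.0;
        }
        return elapsed;
    }

    /** Converts a total time for a batch of rows into a per-row figure. */
    static double perRowMillis(double totalMillis, int rows) {
        return totalMillis / rows;
    }

    public static void main(String[] args) {
        // Stand-in for the remote read; a real client would call the web service here.
        Supplier<List<String>> fakeRead = () -> {
            List<String> rows = new ArrayList<>();
            for (int i = 0; i < 1000; i++) {
                rows.add("row " + i);
            }
            return rows;
        };

        double min = Double.MAX_VALUE;
        double max = 0;
        for (double t : timeMillis(fakeRead, 5)) {
            min = Math.min(min, t);
            max = Math.max(max, t);
        }
        System.out.printf("fastest run: %.3f ms, slowest run: %.3f ms%n", min, max);

        // The figures from the text, expressed per row.
        System.out.println("300 ms / 1000 rows = " + perRowMillis(300, 1000) + " ms per row");
        System.out.println("900 ms / 1000 rows = " + perRowMillis(900, 1000) + " ms per row");
    }
}
```

Reporting the fastest and slowest run, as the text does with 300 to 900 ms, keeps the spread visible; expressed per row, those figures correspond to 0.3 to 0.9 ms.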

Comments

New JPA Performance Benchmark

A new comprehensive benchmark that compares the performance of different JPA implementations and databases has been published. See the JPA Performance Benchmark. It covers many JPA ORM providers (Hibernate, EclipseLink, OpenJPA and DataNucleus) and DBMSs (MySQL, PostgreSQL, Derby, HSQLDB, H2, SQLite, ObjectDB) that are available in Java.

JOnAS stats

Hi, the timing figures of 300 to 900 ms are repeated twice; what were the timings for JOnAS and for GlassFish? You need to watch what you do with the transaction attributes with EJB 2.x on JOnAS: in my experience, the default behaviour of JOnAS is to wrap each entity bean field access in its own transaction and a separate database call. This causes an order of magnitude more database accesses than are really required. Setting the transaction attribute to 'required' in the session bean, even if you're only reading, will actually go much faster in this instance. Cheers T.

TX actually was set to 'required' in the SLSB

Wrapping each invocation in a new TX is not just JOnAS' private behaviour; AFAIK it is what EJB 2.x mandates for every application (as it is simply the most intuitive way to map invocations of SLSBs to transactions). Nevertheless, the JOnAS timing was in fact measured with TX already set to 'required'. The new technology stack (EJB 3 on GlassFish) really is just much, much faster than the old stack (EJB 2.1 on JOnAS). Note that the intention of the timing test was to learn exactly the difference between these two stacks, so it says nothing about other combinations, like EJB 3 on JOnAS 5 or EJB 2 on GlassFish, both of which were out of scope of our test.

Unfortunately the blogging system doesn't clearly show that the article was written years ago (January 2010 is just the time when we migrated to the java.net blogging system and moved all existing articles). In the meantime we have measured the actual live values of several customers that moved to the new stack, and all of them confirmed that the new stack is several times faster than the old stack.

It would be interesting to see how EJB 3 + GFv2 compares to the latest JPA 2 + GFv3 stack. Unfortunately I have since lost the VM, so I would have to rebuild the benchmark, and I currently don't have the time for that.

some points check

Hi there, some points that can help you with the next round:

  • ID generation has no meaning for benchmarking JPA performance
  • Throughput in number of rows in a single call is not significant either; try concurrent accesses with multiple READ/WRITE operations, row updates and reads. Simulating concurrent users is more important than payload per access
  • You can configure the locking in the database per table. That gives you a difference when you just drop locking on tables without concurrent WRITE access
  • Use the length attribute on String columns and always try to reduce the number of wasted bytes per column.
  • Never use long as primary key
  • Prefer to use Collection instead of List or Set in relationships. Using Collection gives the JPA implementation a chance to choose the best concrete implementation of the collection. Otherwise the JPA implementation will be forced to convert from the default list format to the one you forced in the annotation
  • Use Java concurrency in some queries that need iteration (you can shard and do other tricks, splitting queries across more than one thread)
  • Use GlassFish, it seems faster than JOnAS
  • Configure the JVM for performance, allocating more memory when you launch the server
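The suggestion above to simulate concurrent users rather than measure a single large call could be sketched roughly as follows. This is a minimal plain-Java illustration under stated assumptions: `ConcurrentLoad` and `run` are hypothetical names, and the `operation` passed in is a local stand-in for whatever read or write call the real benchmark client would issue against the server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: several simulated users issue calls in parallel. */
class ConcurrentLoad {

    /** Runs one thread per simulated user and returns the total number of completed calls. */
    static int run(int users, int callsPerUser, Runnable operation) {
        ExecutorService pool = Executors.newFixedThreadPool(users);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int u = 0; u < users; u++) {
                tasks.add(() -> {
                    int done = 0;
                    for (int c = 0; c < callsPerUser; c++) {
                        operation.run();  // a real client would call the server here
                        done++;
                    }
                    return done;
                });
            }
            int total = 0;
            // invokeAll blocks until every simulated user has finished.
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("load run interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("simulated user failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        AtomicInteger fakeServer = new AtomicInteger();  // counts calls instead of doing real work
        long start = System.nanoTime();
        int total = run(8, 100, fakeServer::incrementAndGet);
        double seconds = (System.nanoTime() - start) / 1_000_000_000.0;
        System.out.printf("%d calls by 8 users in %.3f s%n", total, seconds);
    }
}
```

Dividing the number of completed calls by the elapsed time gives a throughput figure that can be compared across different user counts, which is closer to what this comment asks for than the time of one large call.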

The list should be much longer, but those are the obvious points I could recall right now.

 

Keep trying, it is always excellent to have someone hammering the frameworks. Eventually you will bring us some news or hints on how to use JPA better.

 

End User's View

The benchmark was done from the point of view of an end user: we wanted to see what happens in the end when doing the things that we do each day. While this might produce "non-scientific" results, it shows best what we have to expect in the real world later. For example, since we noticed in the real world that there are performance differences between SEQUENCE-based and IDENTITY-based ID generation, we thought it a good idea to benchmark that as well. We do not want to know the exact cost of one specific piece of code in a very isolated manner; we want to know the actual end-to-end performance of complete code taken out of our real application. Certainly this gives a rather diffuse result, but we noticed that it is comparable to the performance observed by customers.