


Scott Oaks's Blog

A Glassfish Tuning Primer

Posted by sdo on December 03, 2007 at 11:25 AM | Comments (1)

When I reported our recent excellent SPECjAppServer 2004 scores, one glassfish user responded:
I sure wish you guys were able to come up with a thorough write up
about the SPEC Benchmark architecture, and the techniques you guys
used to get the numbers you get and, more importantly, how those
techniques might apply to every day applications we run in the wild.
While we do have a full performance-tuning chapter in the glassfish/SJSAS docset, I can understand the appeal of a quick cheat-sheet for getting the most out of glassfish in production. Most of this information has appeared in various blogs, particularly by Jeanfrancois, who is so expertly focused on making sure that grizzly and our http path are as fast as possible. Still, I hope that gathering this quick list together will be a good single-source summary.

One thing to note about these guidelines: a lot of glassfish configurations (particularly when you start with a developer profile) are optimized for developers. In development, performance means something different: you'll trade off a few seconds here and there to make starting the appserver faster, or deploying something faster. In production, you'll make the opposite trade-offs. So if you wonder why some of the things in this list aren't necessarily the default setting, that's probably why.

Tune your JVM

The first step is to tune the JVM, which is of course different for every deployment. These are the options set via the jvm-option tag in your domain.xml (or the JVM options page in the admin console). As a general rule, I like to use the throughput collector with a large heap and a moderate-sized young generation: that makes young GCs quite fast. That will lead to a periodic full GC, but the impact of that on total throughput is usually quite minimal. If you absolutely cannot tolerate a pause of a few seconds, you can look at the concurrent collector, but be aware that this will impact your total throughput. So a good set of JVM arguments to start with is:
-server -Xmx3500m -Xms3500m -Xmn1500m -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:+AggressiveOpts
On a CMT machine like the SunFire T5220 server, you'll want to use large pages of 256m, and a heap that is a multiple of that:
-server -XX:LargePageSizeInBytes=256m -Xmx2560m -Xms2560m -Xmn1024m -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:ParallelGCThreads=16 -XX:+AggressiveOpts
More details of the impact of a CMT machine are available at Sun's Cool Threads website.
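
The "heap that is a multiple of the page size" rule can be checked with a little arithmetic. The following is only a sketch: the class name, the helper, and the 2700 MB target are made up for illustration, and rounding down (rather than up) is an assumption about what you would prefer on a memory-constrained box.

```java
public class LargePageHeap {
    static final long MB = 1024L * 1024L;

    /** Rounds heapBytes down to a whole number of pages (never fewer than one page). */
    static long roundDownToPages(long heapBytes, long pageBytes) {
        if (pageBytes <= 0) {
            throw new IllegalArgumentException("pageBytes must be positive");
        }
        long pages = Math.max(1, heapBytes / pageBytes);
        return pages * pageBytes;
    }

    public static void main(String[] args) {
        long page = 256 * MB;     // large page size from the CMT example above
        long wanted = 2700 * MB;  // hypothetical heap size you were aiming for
        long heapMb = roundDownToPages(wanted, page) / MB;
        // 2700 / 256 = 10 whole pages, so this prints the 2560m figure used above
        System.out.println("-XX:LargePageSizeInBytes=256m -Xmx" + heapMb + "m -Xms" + heapMb + "m");
    }
}
```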

Make sure to remove the -client option from your jvm options, to include the -Dcom.sun.enterprise.server.ss.ASQuickStartup=false flag, and -- if you are using CMP 2.1 entity beans -- to include -DAllowMediatedWriteInDefaultFetchGroup=true.
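
If you want to confirm that the running JVM actually picked these settings up, a small check along the following lines may help. It is a sketch that uses only standard JDK APIs; the class name is hypothetical, and to be meaningful the code would have to run inside the appserver JVM (from a servlet, say) rather than standalone. It inspects the java.vm.name property rather than looking for -client in the argument list, on the assumption that the launcher may consume that flag before the JVM sees it.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class JvmOptionCheck {
    /** True if the VM name (the java.vm.name property) identifies a server VM. */
    static boolean isServerVm(String vmName) {
        return vmName != null && vmName.contains("Server VM");
    }

    public static void main(String[] args) {
        String vmName = System.getProperty("java.vm.name");
        System.out.println("VM: " + vmName + (isServerVm(vmName) ? "" : "  <-- not a server VM"));

        // -D flags show up as system properties of the running JVM
        String quick = System.getProperty("com.sun.enterprise.server.ss.ASQuickStartup");
        System.out.println("ASQuickStartup = " + quick);

        // Raw arguments, heap limit, and the collectors actually in use
        System.out.println("JVM arguments: "
                + ManagementFactory.getRuntimeMXBean().getInputArguments());
        System.out.println("Max heap (MB): " + Runtime.getRuntime().maxMemory() / (1024 * 1024));
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("Collector: " + gc.getName());
        }
    }
}
```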

Tune the default-web.xml

Settings in the default-web.xml file are overridden by an application's web.xml, but I find it easier to set production-ready values in the default-web.xml file so that all applications will get them. In particular, under the JspServlet definition, add these two parameters:
<init-param>
  <param-name>development</param-name>
  <param-value>false</param-value>
</init-param>
<init-param>
  <param-name>genStrAsCharArray</param-name>
  <param-value>true</param-value>
</init-param>
That will mean you cannot change JSP pages on your production server without redeploying the application, but that's generally what you want anyway.

One note about this: this file is only consulted when an application is deployed. So make sure you change the file and then deploy your application, or you won't see any benefit from this change.

Tune the HTTP threads

As you know, there are two parameters here: the HTTP acceptor threads, and the request-processing threads. These values have unfortunately had different meanings in a few of our releases, and some confusion about them remains. The acceptor threads are used both to accept new connections to the server and to schedule existing connections when a new request comes over them. In general, you'll need 1 of these for every 1-4 cores on your machine; no more than that (unlike, say, SJSAS 8.1, where this had a completely different meaning). The request threads run HTTP requests. You want "just enough" of those: enough to keep the machine busy, but not so many that they compete for CPU resources -- if they compete for CPU resources, then your throughput will suffer greatly. Too many request-processing threads is a big performance problem I see on many machines.

How many is "just enough"? It depends, of course -- in a case where HTTP requests don't use any external resource and are hence CPU bound, you want only as many HTTP request-processing threads as you have CPUs on the machine. But if the HTTP request makes a database call (even indirectly, like by using a JPA entity), the request will block while waiting for the database, and you could profitably run another thread. So this takes some trial and error, but start with the same number of threads as you have CPUs and increase them until you no longer see an improvement in throughput.
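
The trial-and-error procedure just described can be written down as a simple loop. This is a sketch under stated assumptions, not a glassfish facility: the class, the method, and the 2% "worthwhile improvement" threshold are all hypothetical, and measureThroughput stands in for whatever load test you run after reconfiguring the server with a given thread count. It assumes Java 8 or later.

```java
import java.util.function.IntToDoubleFunction;

public class RequestThreadTuner {
    /**
     * Starts at startThreads and adds one thread at a time for as long as measured
     * throughput improves by more than minGain (a fraction, e.g. 0.02 for 2%).
     * Returns the last thread count that was still a worthwhile improvement.
     */
    static int findThreadCount(int startThreads, int maxThreads, double minGain,
                               IntToDoubleFunction measureThroughput) {
        int best = startThreads;
        double bestThroughput = measureThroughput.applyAsDouble(best);
        for (int n = startThreads + 1; n <= maxThreads; n++) {
            double throughput = measureThroughput.applyAsDouble(n);
            if (throughput <= bestThroughput * (1.0 + minGain)) {
                break; // no meaningful improvement any more: stop adding threads
            }
            best = n;
            bestThroughput = throughput;
        }
        return best;
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        // Made-up stand-in for a real load test: throughput flattens at twice the CPU count
        IntToDoubleFunction fakeLoadTest = n -> Math.min(n, 2 * cpus) * 100.0;
        System.out.println("Suggested request threads: "
                + findThreadCount(cpus, 8 * cpus, 0.02, fakeLoadTest));
    }
}
```

In practice each measurement is a full load-test run, so you would likely step by more than one thread at a time; the loop only captures the stopping rule.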

Tune your JDBC drivers

Speaking of databases, it's quite important in glassfish to use JDBC drivers that perform statement caching; this allows the appserver to reuse prepared statements and is a huge performance win. The JDBC drivers that come bundled with the Sun Java System Application Server provide such caching; Oracle's standard JDBC drivers do as well, as do recent drivers for Postgres and MySQL. Whichever driver you use, make sure to configure the properties to use statement caching when you set up the JDBC connection pool -- e.g., for Oracle's JDBC drivers, include the properties
ImplicitCachingEnabled=true
MaxStatements=200
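
To see why statement caching pays off, it helps to picture what a cache bounded by something like MaxStatements does. The following is an illustration of the idea only, not how Oracle's (or any other) driver implements it: the class is hypothetical, the "statement" is a generic value produced by a prepare function, and a real driver would also close statements as it evicts them. It assumes Java 8 or later.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class StatementCacheSketch<S> {
    private final Map<String, S> cache;
    private final Function<String, S> prepare;
    private int prepares = 0;

    StatementCacheSketch(final int maxStatements, Function<String, S> prepare) {
        this.prepare = prepare;
        // accessOrder=true keeps entries ordered from least to most recently used
        this.cache = new LinkedHashMap<String, S>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, S> eldest) {
                return size() > maxStatements; // evict the least recently used entry
            }
        };
    }

    /** Returns the cached statement for this SQL text, preparing it only on a miss. */
    S get(String sql) {
        S stmt = cache.get(sql);
        if (stmt == null) {
            stmt = prepare.apply(sql); // the expensive step that caching avoids
            prepares++;
            cache.put(sql, stmt);
        }
        return stmt;
    }

    int prepareCount() {
        return prepares;
    }

    public static void main(String[] args) {
        StatementCacheSketch<String> cache =
                new StatementCacheSketch<>(200, sql -> "prepared:" + sql);
        for (int i = 0; i < 1000; i++) {
            cache.get("SELECT * FROM orders WHERE id = ?");
        }
        System.out.println("1000 executions, prepares: " + cache.prepareCount());
    }
}
```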

Use the HTTP file cache

If you serve a lot of static content, make sure to enable the HTTP file cache.



Have I piqued your interest? As I mentioned, there are hundreds of pages of tuning guidelines in our docset. But here at least you have some important first steps.

Comments

  • I was the one that asked for that detail on the SPEC benchmark. This doc set is excellent. Obviously, I haven't read it in detail (it's large, as you said), but the presentation is nice in describing not just what the parameters are, but why you would want to change them (though not in mind-numbing detail).

    Thankfully, my apps simply aren't hammered enough to where I've had to make a sweep of tweaking bits to make them perform. They run "fast enough" out of the box.

    But the easier we make the application development process by adding abstractions (such as running our code within a container over and above the standard OS and Java runtime), the more these abstractions influence overall system behavior.

    With the complexity of the modern containers, and the depth of their interactions, when it comes to performance it's easy to view the container's behavior as "dark matter": some undetectable friction between your application and your users. The kind of thing where you say "my code is fast, my DB is fast, yet the app is slow -- what's left?".

    The benefit of documenting all of these knobs and levers that can be used to tune the server is that it makes developers aware of elements of the application server that they may otherwise know nothing about.

    After all, the container is there to hide the arcane dirty work of getting network-aware, multi-threaded applications interfacing with disparate components. But the reality of what the application server does is still there. The app servers embrace and shroud development and operational complexity, but that complexity doesn't "go away".

    Just because I can create a session bean with a simple annotation does not mean that session beans are simple.

    I brought up the topic of such a document with regard to the SPEC mark because you folks doing the SPEC work have direct brain-picking access to the core developers and designers. For example, I look at Jean Francois' various blog posts, and his solutions for arcane edge case issues and bugs tend to be "add -DhiddenParameter=Q and -DwhereDidThisComeFrom=Z to the start up parameters" type of solutions.

    When doing the SPEC marks, you would ideally encounter all of the "10 tents" percentile edge cases to eke out the final JOPS from the benchmark, and you'd have Jean Francois on the phone helping you out.

    But obviously, we don't quite have that access (note this isn't a negative criticism, it's just an observation -- Jean Francois, as well as everyone else on the GF team are really open and helpful).

    So it's nice that you've managed to perhaps squeeze Jean Francois and the others on to some paper and capture what can only otherwise perhaps be described as folklore on to a reasonably comprehensive single document set.

    Posted by: whartung on December 03, 2007 at 01:56 PM




