Fun With JStack
Avid readers of the glassfish aliases know that we are frequently asked questions about why their server isn't responding, or why it is slow, or how many requests are being worked on. And the first thing we always say is to look at the jstack output.
So you're running on a Sun 5120 with 128 hardware threads and so you have 368 request processing threads and the jstack output is 10K lines. Now what? What I do is use the program attached to this blog entry -- it will parse the jstack output and show how many threads are doing what. At it's simplest, it's something like this:
% jstack pid > jstack.out
% java ParseJStack jstack.out
Threads in start Running
8 threads in java.lang.Throwable.getStackTraceElement(Native
Total Running Threads: 8
Threads in state Blocked by Locks
41 threads running in com.sun.enterprise.loader.EJBClassLoader.getResourceAsStream(EJBClassLoader.java:801)
Total Blocked by Locks Threads: 41
Threads in state Waiting for notify
39 threads running in com.sun.enterprise.web.connector.grizzly.LinkedListPipeline.getTask(LinkedListPipeline.java:294)
18 threads running in System Thread
Total Waiting for notify Threads: 74
Threads in state Waiting for I/O read
14 threads running in com.acme.MyServlet.doGet(MyServlet.java:603)
Total Waiting for I/O read Threads: 14
The parser has aggregated all the threads and shown how they are in various states. 8 threads are current on the CPU (they happen to be doing a stack trace -- a quite expensive operation which is better to avoid). That's fine -- but we probably want it to be more than that.
41 threads are blocked by a lock. The summary method shown is the first non-JDK method in the stack trace; in this case it happens to be the glassfish EJBClassLoader.getResourceAsStream. Now we need to go actually look at the stack trace and search for that class/method and see what resource the thread is blocked on. In this example, all the threads were blocked waiting to read the same Zip file (really a Jar file), and the stack trace for those threads show that all the calls came from instantiating a new SAX Parser. If you didn't know, the SAX parser used by a particular application can be defined dynamically by listing the resource in the manifest file of the applications jar files, which means that the JDK must search the entire class path for those entries until if finds the one the application wants to use (or until it doesn't find anything and falls back to the system parser). But since reading the jar file requires a synchronization lock, all those threads trying to create a parser end up contending for the same lock, which is greatly hampering our application's throughput. It happens that you can set a system property to define the parser and hence avoid the dynamic lookup all the time (-Djavax.xml.parsers.SAXParserFactory=com.sun.org.apache.xerces.internal.jaxp.SaxParserFactoryImpl will always default to the JDK parser).
But the larger point here is that when you see lots of threads blocked on a resource, that's a problem that is throttling your throughput ,and hence whatever the resource is, you need to make changes to your configuration or application to avoid that contention.
What about the threads in notify? Those threads are waiting to be woken up. Usually they are threads in a pool waiting for notification that a task is ready (e.g. the getTask method is grizzly threads that are waiting for a request). System threads are doing things like RMI distributed GC, or JMX monitoring -- you'll see them in the jstack output as threads that have only JDK classes in their stack.
But another problem creeps up in the threads waiting for I/O read -- these are threads that are doing a blocking I/O call (usually socketRead0 -- not accept, which will show up in the jstack parsing as waiting for I/O accept). This is something else that is hampering your throughput -- the thread is waiting for a backend resource to answer its request. Most often this is a database -- again, you can look at the full stack in the output to see exactly what the call is. Maybe your app is making a the same call to a database; if that keeps showing up, its time to optimize the SQL or tune the database so that call happens faster.
There are some other states in the output I haven't show -- threads doing accept, or sleeping, or GC threads. But 95% of the time, if you aren't getting the throughput of out your server that you expect, its because the threads are blocked on I/O read or on a lock.
There are two other things you can do with this tool: first is that you can list multiple file names on the command line and the tool will aggregate them. So sometimes if you take a bunch of jstacks in a row, you can parse all of them and get a better idea where you are blocked. And second you can use the -name argument to limit the output only to certain threads. For example, in glassfish the grizzly http-request-processing threads are named httpSSLWorkerThread-
Finally, two caveats: the first is that jstack output is likely to change, and the parser will not be robust for all versions of the JDK (nor necessarily all JDK options; some GC options may dump GC thread output the tool is unprepared for). So it will likely need to be tweeked if you encounter a parsing error. Second is that a jstack is only a snapshot in time -- it is like a sampling profiler but with a very, very large interval. That means that sampling artifacts will affect your analysis, just like a profiler may point you to a parent or child class if the sampling rate is off a little bit. The fact that the JDK pauses threads only at certain locations to get their stack also affects that, particularly for running threads -- so a method that continually shows up in the running state doesn't mean that method is particularly expensive, just that it is a good point for the JDK to get the trace. [It may be expensive as well, though -- we know from other analysis that working with stack traces, including throwing exceptions, is very expensive in glassfish due to its very deep stack depth, so it's not surprising in my example above to see lots of threads running in the getStackTraceElement method). So this parsing is much more effective at finding things that are blocking your throughput than at finding runtime bottlenecks where you are using too much CPU.
Like all performance tools, jstack gives you one piece of information -- hopefully this will let you place that one piece of information into context with information from your other tools when it comes time to diagnose throughput bottlenecks.