Scott Oaks's Blog
March 2007 Archives

ab considered harmful

Posted by sdo on March 23, 2007 at 03:09 PM

To be fair, glassfish does have some out-of-the-box settings that make its benchmark test results less than ideal. Jeanfrancois has this excellent blog that describes the basic settings you need to change before even beginning to do serious performance analysis. I'm hopeful that we'll have better profiles by the time FCS rolls around so that a performance-based profile is easily available to end users. [There are some conflicts between optimal settings for developers and production, which is one cause of our problem here, not to mention some historical baggage we have for backward compatibility. But that's a topic for another day.]

But once you have a reasonably configured appserver, ab is still not the best tool to use to measure your performance. The biggest problem is that ab is a single-threaded process, and you're typically interested in measuring the performance of your multi-CPU machine running the multi-threaded appserver. You can (I hope) see the inherent problem: you have 1 CPU of client-side resources and, say, 4 CPUs of server-side resources. Which side will become the bottleneck first? The client side, which means all you've accomplished is measuring the performance of ab itself.

This all depends on what you're measuring, of course. Lately, using ab to measure the retrieval of a single static image seems to be all the rage, and this is the worst possible test. Let's say that it takes the appserver 50% longer to process the request for http://host/foo.gif than it takes for ab to send the request and parse the response to make sure it came back correctly (and drain the socket of all the data). Even that is unrealistic, but what it means is that you'll end up using only 1.5 of your appserver's CPUs by the time your client gets saturated. Nothing you do to the appserver will make this better; the bottleneck is ab.

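The arithmetic behind that 1.5-CPU figure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1 ms client-side cost per request is a made-up number (only the 50% ratio comes from the example above), and the function name is hypothetical.

```python
def server_cpus_at_client_saturation(client_cost_ms, server_cost_ms, client_cpus=1):
    """CPUs' worth of server work generated once the client is fully busy.

    client_cost_ms: CPU time the load generator spends per request
    server_cost_ms: CPU time the appserver spends per request
    client_cpus:    CPUs the load generator can actually use (1 for ab)
    """
    # A saturated client cannot issue more than this many requests per second.
    max_requests_per_sec = client_cpus * 1000.0 / client_cost_ms
    # Server CPU-seconds consumed per second of wall-clock time.
    return max_requests_per_sec * server_cost_ms / 1000.0


# Assumed numbers: the client needs 1 ms per request, the server 50% more.
busy = server_cpus_at_client_saturation(client_cost_ms=1.0, server_cost_ms=1.5)
print(busy)      # CPUs of server work when the single client CPU is maxed out
print(busy / 4)  # fraction of a 4-CPU appserver that the test can ever exercise
```

Under these assumptions the 4-CPU server never gets past 37.5% utilization, no matter how it is tuned.
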
So now you're thinking: what if I have multiple CPUs on my client and I use that -c option to ab, the option that's supposed to send "concurrent" requests? Won't that scale? Unfortunately not, because the "concurrent" requests are still processed sequentially by ab. ab has only a single thread available to it, so all it does is send multiple requests (one after the other), read any responses that have been sent back (still only one at a time), send any new requests, and so on. It is still limited to utilizing at most a single CPU.

And what of the timings you get out of this? The single ab thread sends a request at time 0. Then if it has other responses to process, it will do so. Say there are 10 more responses to process (which means draining the socket of data, and sending the next request on the socket), and then say ab takes 10 milliseconds for each one. Only then will it again look for a response to the original request. If the response to the original request is already waiting for ab, ab will report that it took 110 milliseconds for that request to be processed. But that's only because ab itself spent 100 milliseconds handling other details; it has erroneously charged all of the time it spent sequentially processing data to the pending response. Client-side overhead in any load-generating tool is a problem, but the sequential design of ab makes the problem much worse in ab than in other load generators.

Finally, what about those responses? If you run ab -c 100, there are 100 channels open to the server, and ab will report how much throughput comes through those 100 channels. But it won't tell you anything about fairness: 100 responses could come from one channel, or 1 response could come from each channel, and ab will give you the same answer. In fact, given its sequential design, an application server that responds unfairly to requests will show better response times in ab than an application server that responds to requests fairly.

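Both effects can be illustrated with a toy model in Python. This is a sketch, not a description of ab's internals: the 5 ms server latency is an assumed number, the function names are hypothetical, and the model assumes the final 10 ms of the 110 ms figure is the time the client spends reading the original response itself.

```python
def reported_latency_ms(server_latency_ms, other_responses, client_cost_ms):
    """Latency a single-threaded client would report for one request.

    The request is sent at time 0. Before the client looks at its response,
    it handles `other_responses` pending responses at `client_cost_ms` each,
    then spends `client_cost_ms` reading the response it is timing.
    """
    client_free_at = other_responses * client_cost_ms
    # The response cannot be read before the server has produced it.
    read_starts_at = max(server_latency_ms, client_free_at)
    return read_starts_at + client_cost_ms


def throughput(responses_per_channel, seconds):
    """Aggregate responses per second across all channels."""
    return sum(responses_per_channel) / seconds


def starved_channels(responses_per_channel):
    """Number of channels that received no response at all."""
    return sum(1 for n in responses_per_channel if n == 0)


# Timing: assume the server answered in 5 ms, but the client was busy.
print(reported_latency_ms(5.0, other_responses=10, client_cost_ms=10.0))
print(reported_latency_ms(5.0, other_responses=0, client_cost_ms=10.0))

# Fairness: 100 channels, 100 responses in one second, two very different servers.
unfair = [100] + [0] * 99   # one channel gets everything
fair = [1] * 100            # every channel gets one response
print(throughput(unfair, 1.0), throughput(fair, 1.0))
print(starved_channels(unfair), starved_channels(fair))
```

In this model the same 5 ms server response is reported as 110 ms when the client has ten other responses queued and as 15 ms when it has none. The two servers show identical throughput, even though the unfair one leaves 99 of its 100 channels with nothing.
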
But somehow, I don't think the actual users of the first application server will be all too happy (well, one of them will be quite happy indeed!).

Are there alternatives to ab? I'm quite happy with faban, an open-source benchmarking toolkit developed by some of my colleagues. It is multi-threaded, can access arbitrary URLs, and measures fairness among other things. It is trickier to set up than ab, though in a future blog I'll explore how it can be used as an ab alternative. Until then, if someone offers you ab, just say no.