Tricks and Tips with NIO part VI: Heap, View or Direct ByteBuffer, which one performs the best?
Posted by jfarcand on January 23, 2007 at 11:32 AM | Comments (10)
Choosing the right byte buffer is not simple. Correct me if I'm wrong, but there is little documentation about which type of byte buffer to choose when writing a scalable server. So here are some informal observations I've made while using the Grizzly WebServer.
There are three types of byte buffer:
- Direct Byte Buffer [ByteBuffer.allocateDirect()]: Given a direct byte buffer, the Java virtual machine will make a best effort to perform native I/O operations directly upon it.
- Heap Byte Buffer [ByteBuffer.allocate()]: A byte buffer backed by a byte array.
- View Byte Buffer [ByteBuffer.slice()]: A byte buffer whose content is a shared subsequence of a direct or heap byte buffer's content.
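To make the three types concrete, here is a minimal sketch using only the standard java.nio API. The class name BufferTypes and the helper viewOf are made up for illustration and are not part of Grizzly. It shows that a slice shares its content with the parent buffer, and that a view is direct only when its parent is direct.

```java
import java.nio.ByteBuffer;

final class BufferTypes {

    /** Returns a view over bytes [from, to) of the given buffer (hypothetical helper). */
    public static ByteBuffer viewOf(ByteBuffer parent, int from, int to) {
        // duplicate() shares the content but has its own position and limit,
        // so the parent's cursor is left untouched.
        ByteBuffer window = parent.duplicate();
        window.limit(to);
        window.position(from);
        return window.slice();
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(16);         // backed by a byte[]
        ByteBuffer direct = ByteBuffer.allocateDirect(16); // memory outside the Java heap

        ByteBuffer heapView = viewOf(heap, 4, 12);
        ByteBuffer directView = viewOf(direct, 4, 12);

        // Index 0 of the view is index 4 of the parent: the content is shared.
        heapView.put(0, (byte) 42);
        System.out.println("heap.get(4) = " + heap.get(4));
        System.out.println("heapView capacity = " + heapView.capacity());
        System.out.println("heapView direct? " + heapView.isDirect());
        System.out.println("directView direct? " + directView.isDirect());
    }
}
```

Because a view never copies data, creating one costs a small wrapper object rather than a new allocation of the underlying bytes.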
Since no data or benchmarks were available on the topic at the time we wrote Grizzly, the type is easily configurable via system properties. For the view byte buffer, Grizzly creates a very large one and slices it like this:
    // byteBuffer and capacity are static fields declared elsewhere in the class.
    public synchronized static ByteBuffer allocateView(int size, boolean direct){
        // Allocate a new large buffer on first use, or when the current one
        // has less than size bytes left.
        if (byteBuffer == null ||
                (byteBuffer.capacity() - byteBuffer.limit() < size)){
            if ( direct )
                byteBuffer = ByteBuffer.allocateDirect(capacity);
            else
                byteBuffer = ByteBuffer.allocate(capacity);
        }

        // Expose the next size bytes, slice them off, then move past them.
        byteBuffer.limit(byteBuffer.position() + size);
        ByteBuffer view = byteBuffer.slice();
        byteBuffer.position(byteBuffer.limit());

        return view;
    }
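For readers who want to try the slicing idea outside Grizzly, here is a self-contained sketch of the same technique. It is not Grizzly's code: the class name ViewAllocator, the 1 MiB CAPACITY and the extra isDirect() check are assumptions made for the example. It uses the backing buffer's position as the cursor and slices through a duplicate, so the backing buffer's limit never changes.

```java
import java.nio.ByteBuffer;

final class ViewAllocator {
    // Assumed size of the large backing buffer (1 MiB); pick your own value.
    private static final int CAPACITY = 1 << 20;
    private static ByteBuffer backing;

    /** Hands out a view of size bytes carved from a shared backing buffer. */
    public static synchronized ByteBuffer allocateView(int size, boolean direct) {
        if (size < 0 || size > CAPACITY) {
            throw new IllegalArgumentException("size out of range: " + size);
        }
        // Start a new backing buffer on first use, when the current one is
        // exhausted, or when the caller asks for the other kind of memory.
        if (backing == null || backing.remaining() < size || backing.isDirect() != direct) {
            backing = direct ? ByteBuffer.allocateDirect(CAPACITY)
                             : ByteBuffer.allocate(CAPACITY);
        }
        ByteBuffer window = backing.duplicate();   // same content, own position/limit
        window.limit(window.position() + size);
        ByteBuffer view = window.slice();          // covers exactly the next size bytes
        backing.position(backing.position() + size);
        return view;
    }

    public static void main(String[] args) {
        ByteBuffer a = allocateView(8192, false);
        ByteBuffer b = allocateView(8192, false);
        System.out.println("same backing array: " + (a.array() == b.array()));
        System.out.println("offsets: " + a.arrayOffset() + ", " + b.arrayOffset());
    }
}
```

Note that an exhausted backing buffer stays reachable as long as any of its views is alive, so one long-lived view pins the whole block in memory.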
I didn't run any micro benchmark, but here are some simple results using the Grizzly WebServer. I didn't set any special VM config, just the Grizzly out-of-the-box configuration. To stress the server, I used ab, which is not the best tool for benchmarking but is the easiest to use. The JDK version used is:
java version "1.6.0"
Java(TM) SE Runtime Environment (build 1.6.0-b105)
Java HotSpot(TM) Client VM (build 1.6.0-b105, mixed mode, sharing)
I ran the following command 100 times and calculated the mean:
% ab -q -k -n1000 -c600
and got the following results.
Direct ByteBuffer: 2241 tx/s with the following config:
% java -Dcom.sun.enterprise.web.connector.grizzly.useDirectByteBuffer=true
-jar grizzly-framework.jar 8080 /s1/domains/domain1/docroot/
Jan 23, 2007 12:10:58 PM
INFO:
Grizzly configuration for port 8080
maxThreads: 5
minThreads: 5
ByteBuffer size: 8192
useDirectByteBuffer: true
useByteBufferView: false
maxHttpHeaderSize: 8192
maxKeepAliveRequests: 256
keepAliveTimeoutInSeconds: 30
Heap ByteBuffer: 2269 tx/s with the following config:
% java -Dcom.sun.enterprise.web.connector.grizzly.useDirectByteBuffer=false
-jar grizzly-framework.jar 8080 /s1/domains/domain1/docroot/
Jan 23, 2007 12:19:34 PM
INFO:
Grizzly configuration for port 8080
maxThreads: 5
minThreads: 5
ByteBuffer size: 8192
useDirectByteBuffer: false
useByteBufferView: false
maxHttpHeaderSize: 8192
maxKeepAliveRequests: 256
keepAliveTimeoutInSeconds: 30
View Direct ByteBuffer: 2304 tx/s with the following config:
% java -Dcom.sun.enterprise.web.connector.grizzly.useByteBufferView=true
-Dcom.sun.enterprise.web.connector.grizzly.useDirectByteBuffer=true
-jar grizzly-framework.jar 8080 /s1/domains/domain1/docroot/
Jan 23, 2007 12:31:01 PM
INFO:
Grizzly configuration for port 8080
maxThreads: 5
minThreads: 5
ByteBuffer size: 8192
useDirectByteBuffer: true
useByteBufferView: true
maxHttpHeaderSize: 8192
maxKeepAliveRequests: 256
keepAliveTimeoutInSeconds: 30
View Heap ByteBuffer: 2484 tx/s with the following config:
% java -Dcom.sun.enterprise.web.connector.grizzly.useByteBufferView=true
-Dcom.sun.enterprise.web.connector.grizzly.useDirectByteBuffer=false
-jar grizzly-framework.jar 8080 /s1/domains/domain1/docroot/
Jan 23, 2007 12:38:43 PM
INFO:
Grizzly configuration for port 8080
maxThreads: 5
minThreads: 5
ByteBuffer size: 8192
useDirectByteBuffer: false
useByteBufferView: true
maxHttpHeaderSize: 8192
maxKeepAliveRequests: 256
keepAliveTimeoutInSeconds: 30
Surprisingly, the views on a heap byte buffer performed best in every run. Before drawing conclusions, I also ran a similar test using Apache JMeter and got the same kind of results. Hence I'm tempted to conclude that heap byte buffers will always perform better, and that views on a heap byte buffer are the type to use when possible. As usual, these numbers might be Grizzly specific, so I recommend you test the various types before making a choice... but I bet view byte buffers will always perform the best! Of course, if someone has time to write a micro benchmark, make sure you drop your results here.
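As a starting point for such a micro benchmark, here is a rough sketch. It is not the test behind the numbers above, and the class BufferMicroBench and all its methods are invented for the example. It only times in-memory put/get on the four buffer kinds with System.nanoTime() after a single warm-up pass, so it says nothing about socket reads and writes, and a proper harness would be needed before trusting any difference it shows.

```java
import java.nio.ByteBuffer;

final class BufferMicroBench {
    public static final int SIZE = 8192; // same buffer size as the Grizzly config above

    /** Fills the buffer and reads it back rounds times; the checksum keeps the loops alive. */
    public static long exercise(ByteBuffer buf, int rounds) {
        long sum = 0;
        for (int r = 0; r < rounds; r++) {
            buf.clear();
            while (buf.hasRemaining()) {
                buf.put((byte) r);
            }
            buf.flip();
            while (buf.hasRemaining()) {
                sum += buf.get();
            }
        }
        return sum;
    }

    /** Slices SIZE bytes out of a larger slab, starting at offset SIZE. */
    public static ByteBuffer view(ByteBuffer slab) {
        ByteBuffer window = slab.duplicate();
        window.position(SIZE);
        window.limit(2 * SIZE);
        return window.slice();
    }

    static void time(String label, ByteBuffer buf, int rounds) {
        exercise(buf, rounds); // warm-up pass so the JIT has compiled the loops
        long start = System.nanoTime();
        long sum = exercise(buf, rounds);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(label + ": " + elapsedMs + " ms (checksum " + sum + ")");
    }

    public static void main(String[] args) {
        int rounds = 20_000; // arbitrary; raise it for steadier numbers
        time("heap", ByteBuffer.allocate(SIZE), rounds);
        time("direct", ByteBuffer.allocateDirect(SIZE), rounds);
        time("heap view", view(ByteBuffer.allocate(4 * SIZE)), rounds);
        time("direct view", view(ByteBuffer.allocateDirect(4 * SIZE)), rounds);
    }
}
```

Running it under both -client and -server and repeating each run several times gives a better picture than a single pass.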
Comments
Comments are listed in date ascending order (oldest first)
-
hi there,
Is "Grizzly Web Server" available for download, like, say, a tomcat download ?
thank you,
BR,
~A
Posted by: anjanb2 on January 23, 2007 at 02:10 PM
-
Hi, yes you can download it from the Maven repository
-- Jeanfrancois
Posted by: jfarcand on January 23, 2007 at 02:14 PM
-
hi Jean,
Thank you.
I got the web server working (serving static files).
Now, how do I use it as a Servlet/JSP app server (like Tomcat)?
Thank you,
BR,
~A
Posted by: anjanb2 on January 23, 2007 at 03:24 PM
-
hi Jean,
I noticed that after the foreground process was terminated, port 8080 was still open (tomcat complained when I launched it) and found that there were 2 headless javaw processes.
I called jstack on it and found that these 2 processes were running for a while. I'm wondering if this is by design ?
Thank you,
BR,
~A
Posted by: anjanb2 on January 23, 2007 at 04:08 PM
-
Hi,
For Servlet/JSP, you need to use GlassFish. Grizzly WebServer only supports static files. As for the process, it probably means it wasn't closed properly. -- Jeanfrancois
Posted by: jfarcand on January 23, 2007 at 06:15 PM
-
Hi Jean-Francois,
for a comparison - which numbers do you get when you simply do ByteBuffer.allocate(size) without the "synchronized" and without re-use? The pressure on the GC would be interesting.
Thanks, Matthias
Posted by: mernst on January 24, 2007 at 12:01 AM
-
Hi Matthias, let me try this configuration. I agree the GC will have fun.... Stay tuned. -- Jeanfrancois
Posted by: jfarcand on January 26, 2007 at 11:41 AM
-
I did a few tests with my web proxy (rabbit), running apachebench (-n100000 -c500 ...) on a cached resource and tested with both direct and heap buffers.
In rabbit I reuse the buffers whenever I can. This means that buffers live long and are almost never GC:ed.
The result I saw was that the speed was very similar, with direct buffers winning out by a few requests on the best run. A few meaning less than 50 out of about 2500. I did not run enough benchmarks to be certain of the outcome.
Rabbit usually uses 4 kB buffers and it does not slice them. The buffers may grow to 128 kB, but I do not think that happened in this test.
This is on a Linux AMD 64-bit system. I am not sure how much that means for direct buffers though.
Posted by: ernimril on January 26, 2007 at 01:47 PM
-
Thanks for the feedback. Are you using -server or -client?
Posted by: jfarcand on January 30, 2007 at 07:41 AM
-
Since I use an AMD 64-bit, dual-core system with 2 GB of RAM, I can only use the server version.
Posted by: ernimril on January 30, 2007 at 11:32 AM