Tricks and Tips with NIO part II: Why SelectionKey.attach() is evil
Posted by jfarcand on June 06, 2006 at 10:10 AM | Comments (18)
First, thanks for all the good feedback on part I. Please use the comment thread instead of sending me private email, so everybody can contribute to the answer (and I will not forget to respond :-) ). Now if I can have the same kind of feedback for the Grizzly code, I will be very happy (subliminal marketing here :-) ).
OK, this time I want to discuss java.nio.channels.SelectionKey.attach(). I recommend you read about SelectionKey and the way you handle them before reading this blog. As usual, my first try might contain unclear parts (I should really start blogging in French instead :-) ).
The Java documentation for SelectionKey.attach() states:
Attaches the given object to this key.
An attached object may later be retrieved via the attachment method. Only one
object may be attached at a time; invoking this method causes any previous
attachment to be discarded. The current attachment may be discarded by
attaching null.
Wow...the devil exists, and he lives inside the NIO API!
Why? Well, let's take a simple example. Usually, you handle SelectionKeys by doing:
selectorState = 0;
enableSelectionKeys();
try{
    selectorState = selector.select(selectorTimeout);
} catch (CancelledKeyException ex){
    // Ignore: a key was cancelled while selecting.
}
readyKeys = selector.selectedKeys();
iterator = readyKeys.iterator();
while (iterator.hasNext()) {
    key = iterator.next();
    // Remove the key from the selected set so it is not processed twice.
    iterator.remove();
    if (key.isValid()) {
        handleConnection(key);
    } else {
        cancelKey(key);
    }
}
Selector.select() returns the number of SelectionKeys whose ready-operation sets were updated, and selectedKeys() returns those keys. The handleConnection implementation will then most likely look like:
if ((key.readyOps() & SelectionKey.OP_ACCEPT) ==
        SelectionKey.OP_ACCEPT){
    handleAccept(key);
} else if ((key.readyOps() & SelectionKey.OP_READ) ==
        SelectionKey.OP_READ) {
    handleRead(key);
}
Next in handleRead(key), you will do:
socketChannel = (SocketChannel)key.channel();
while ( socketChannel.isOpen() &&
        (count = socketChannel.read(byteBuffer)) > -1){
    // do something
}
Well, the scary part is the // do something.
Gold Candidate for a Memory Leak (GCML)
At this stage, socketChannel is ready to read bytes. Hence you invoke socketChannel.read(byteBuffer), and you find that you haven't read all the bytes from the socket (or you are ready to handle the next request), so you decide to register the SelectionKey back to the Selector by doing:
selectionKey.interestOps(
selectionKey.interestOps() | SelectionKey.OP_READ);
and...and...and do something like:
selectionKey.attach(...)
Boum...the little ... is where the devil is hiding! What you are attaching to the SelectionKey is very dangerous, because there is some probability that your SelectionKey might never return to a ready-operation state, leaving the SelectionKey and its evil attachment forever inside the Selector keys set. Does it sound like a GC...ML (GCML)? But what's the point, nobody will ever do that, because we are all very talented engineers, and we always take care of cleaning our SelectionKey from the Selector keys set, right?
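To make the risk concrete, here is a minimal sketch of a key that never becomes ready. It assumes a Pipe as a stand-in for a client SocketChannel that stays silent, and an 8 KB ByteBuffer as a stand-in for a framework object; the class and method names are made up for the illustration. The Selector keeps the key and its attachment reachable until the key is cancelled and a later select flushes it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

class LingeringAttachment {

    /** Returns { keys held while idle, keys held after cancel + select }. */
    static int[] demo() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open(); // stand-in for a silent client connection
            try {
                pipe.source().configureBlocking(false);
                // The 8 KB buffer stands in for a framework object.
                SelectionKey key = pipe.source().register(
                        selector, SelectionKey.OP_READ, ByteBuffer.allocate(8192));

                // Nothing is ever written to the pipe, so the key never becomes
                // ready, yet the Selector keeps it (and its buffer) reachable.
                selector.selectNow();
                int whileIdle = selector.keys().size();

                // Explicit cleanup: drop the attachment, cancel the key, then
                // select once more so the Selector flushes its cancelled keys.
                key.attach(null);
                key.cancel();
                selector.selectNow();
                int afterCleanup = selector.keys().size();

                return new int[] { whileIdle, afterCleanup };
            } finally {
                pipe.source().close();
                pipe.sink().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("idle keys: " + r[0] + ", after cleanup: " + r[1]);
    }
}
```

Multiply the idle case by a few thousand keep-alive connections and the attachments add up quickly.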
The problem comes when your framework needs to handle thousands of connections, and you need to keep those connections alive for a very long time (from 60 seconds to 5 minutes). Most frameworks (and unfortunately a lot of tutorials and talks) will usually attach their framework object to the SelectionKey (ex: the Reactor pattern). Those framework objects will most probably include:
- A ByteBuffer
- Some keep-alive object (let's assume a Long)
- A SocketChannel
- A Framework Handler (like the Reactor pattern)
- etc.
So you can end up with thousands of objects taking vacations, enjoying idle time inside the Selector keys set. If you didn't implement any mechanism to periodically look inside the Selector keys set, then you will most probably end up with a memory leak (or your framework performance will be impacted). Worse, you might never notice the problem....
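One possible shape for such a periodic look is sketched below. It assumes the only attachment is a Long holding the last-activity time in milliseconds (the "keep-alive object" from the list above), and that the sweep runs on the Selector thread between two select() calls. The class name, method names, and timeout values are hypothetical, and this is not how Grizzly does it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

class IdleKeySweeper {

    /** Cancels keys idle for longer than maxIdleMillis; returns how many. */
    static int sweep(Selector selector, long nowMillis, long maxIdleMillis) {
        int expired = 0;
        for (SelectionKey key : selector.keys()) {
            Object stamp = key.attachment();
            if (key.isValid() && stamp instanceof Long) {
                long lastActivity = (Long) stamp;
                if (nowMillis - lastActivity > maxIdleMillis) {
                    key.attach(null); // release the attachment right away
                    key.cancel();     // actually removed on the next select
                    try {
                        key.channel().close();
                    } catch (IOException ignored) {
                        // The connection is being dropped anyway.
                    }
                    expired++;
                }
            }
        }
        return expired;
    }

    /** Returns { keys expired by the sweep, keys still registered }. */
    static int[] demo() {
        try (Selector selector = Selector.open()) {
            Pipe stale = Pipe.open(); // pipes stand in for client connections
            Pipe fresh = Pipe.open();
            try {
                stale.source().configureBlocking(false);
                fresh.source().configureBlocking(false);
                stale.source().register(selector, SelectionKey.OP_READ, 0L);
                fresh.source().register(selector, SelectionKey.OP_READ, 90_000L);

                // "Now" is 100 s and the limit is 60 s: only the key stamped 0 expires.
                int expired = sweep(selector, 100_000L, 60_000L);
                selector.selectNow(); // flush the cancelled key
                return new int[] { expired, selector.keys().size() };
            } finally {
                stale.source().close();
                stale.sink().close();
                fresh.source().close();
                fresh.sink().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sweep itself costs a pass over every registered key, which is part of why the performance remark above matters.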
Recommended solutions?
My first experiments with NIO used that kind of approach (like Mina, like EmberIO and like our ORB NIO implementation). Then Scott started working with me on Grizzly and pointed out the problems after a couple of benchmarks. Under HTTP stress, you might end up with 10 000 connections (so 10 000 active SelectionKeys). If they all have a ByteBuffer or a Handler as an attachment, then a lot of memory will be consumed, reducing your scalability and having fun eating all your memory.
Even if most Virtual Machines are very good these days, I consider this a very bad design anyway, unless you have a very good reason. But don't get me wrong, I'm not saying the frameworks listed above are bad, I'm just pointing out some problems.
The next couple of paragraphs describe some solutions.
How do I retrieve the SocketChannel if I don't attach it to my framework object?
Most existing frameworks include, inside their framework object, the SocketChannel associated with the SelectionKey. This is unnecessary, because the SocketChannel can always be retrieved using SelectionKey.channel().
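A quick sketch of that point, again using a Pipe as a stand-in for a SocketChannel (the class and method names are made up): the key hands back the very channel it was registered for, with nothing attached.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

class ChannelFromKey {

    /** True if the key returns the registered channel and carries no attachment. */
    static boolean keyReturnsItsChannel() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try {
                SelectableChannel registered = pipe.source();
                registered.configureBlocking(false);
                // No attachment is passed at registration time.
                SelectionKey key = registered.register(selector, SelectionKey.OP_READ);
                return key.channel() == registered && key.attachment() == null;
            } finally {
                pipe.source().close();
                pipe.sink().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```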
How do I deal with an incomplete socketChannel read?
When you do socketChannel.read(), you can never predict when all bytes are read from the socket buffer. Most of the time, you will have to register the SelectionKey back to the Selector, waiting for more bytes to be available. In that case, you will most probably attach the incomplete ByteBuffer to the SelectionKey, and continue adding bytes to it once the SelectionKey is ready. Instead, I would recommend you register the channel with a temporary Selector (I will blog about this trick in more detail):
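try{
    SocketChannel socketChannel = (SocketChannel)key.channel();
    int count = 1;     // primed so the first read loop runs
    int byteRead = 0;  // total bytes read so far
    while (count > 0){
        count = socketChannel.read(byteBuffer);
        if (count > 0) byteRead += count;
    }
    if ( byteRead == 0 ){
        // Nothing available yet: wait on a temporary Selector.
        readSelector = SelectorFactory.getSelector();
        tmpKey = socketChannel
            .register(readSelector,SelectionKey.OP_READ);
        tmpKey.interestOps(tmpKey.interestOps() | SelectionKey.OP_READ);
        int code = readSelector.select(readTimeout);
        tmpKey.interestOps(
            tmpKey.interestOps() & (~SelectionKey.OP_READ));
        if ( code == 0 ){
            return 0; // timed out, the client sent nothing
        }
        count = 1;
        while (count > 0){
            count = socketChannel.read(byteBuffer);
            if (count > 0) byteRead += count;
        }
    }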
} catch (Throwable t){
In this example, you try to read more bytes using a temporary Selector (on the same Thread, without having to return to your original Selector, which most of the time runs on another thread). With this trick, you don't need to attach anything to the SelectionKey.
But there is a drawback. If the temporary Selector.select() blocks (because the SelectionKey isn't ready, most probably because the client isn't sending all the bytes), you will block a processing Thread for up to "readTimeout", ending up in a similar, well-known situation called blocking sockets ;-) (one thread per connection). That wouldn't have been the case if you had registered the SelectionKey back to the original Selector with a ByteBuffer attached. So here you need to decide based on your use of NIO: do you want a dormant ByteBuffer attached to a SelectionKey, or a Thread blocking for readTimeout?
In Grizzly, both approaches can be configured, but by default the thread will block for 15 seconds and cancel the SelectionKey if the client isn't doing anything. You can configure Grizzly to attach the ByteBuffer if you really like to use memory :-) . We did try on slow networks, with broken clients, etc., and blocking a Thread scales better than having a dormant ByteBuffer.
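The temporary Selector has to come from somewhere, and opening a new one for every incomplete read would be expensive. Below is a minimal sketch of a pool of temporary Selectors. It is an assumption about what something like SelectorFactory might do, not Grizzly's actual implementation, and every name in it is hypothetical. The channel must be in non-blocking mode, and the timeout must be greater than zero because select(0) blocks indefinitely.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.concurrent.ConcurrentLinkedQueue;

class TemporarySelectorPool {
    private final ConcurrentLinkedQueue<Selector> idle = new ConcurrentLinkedQueue<>();

    Selector borrow() throws IOException {
        Selector selector = idle.poll();
        return selector != null ? selector : Selector.open();
    }

    void giveBack(Selector selector) throws IOException {
        // Flush cancelled keys so the same channel can be registered again later.
        selector.selectNow();
        idle.offer(selector);
    }

    int idleCount() {
        return idle.size();
    }

    void closeAll() throws IOException {
        for (Selector s; (s = idle.poll()) != null; ) {
            s.close();
        }
    }

    /** Blocks the calling thread until the channel is readable or the timeout expires. */
    boolean awaitReadable(SelectableChannel channel, long timeoutMillis) {
        try {
            Selector selector = borrow();
            SelectionKey tmpKey = null;
            try {
                tmpKey = channel.register(selector, SelectionKey.OP_READ);
                return selector.select(timeoutMillis) > 0;
            } finally {
                if (tmpKey != null) {
                    tmpKey.cancel();
                }
                giveBack(selector);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns { readable before any write, readable after a write, one Selector pooled }. */
    static boolean[] demo() {
        TemporarySelectorPool pool = new TemporarySelectorPool();
        try {
            Pipe pipe = Pipe.open(); // stand-in for a client connection
            try {
                pipe.source().configureBlocking(false);
                boolean beforeWrite = pool.awaitReadable(pipe.source(), 50);
                pipe.sink().write(ByteBuffer.wrap(new byte[] { 42 }));
                boolean afterWrite = pool.awaitReadable(pipe.source(), 1000);
                return new boolean[] { beforeWrite, afterWrite, pool.idleCount() == 1 };
            } finally {
                pipe.source().close();
                pipe.sink().close();
                pool.closeAll();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The first wait in demo() is the drawback in miniature: the calling thread does nothing for the whole timeout.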
Have one ByteBuffer per Thread, not per framework object.
The good news about not using SelectionKey.attach() is you only need to create a ByteBuffer per Thread, instead of a ByteBuffer per SelectionKey. So for 10 000 connections, instead of having 10 000 ByteBuffers, you will only have X ByteBuffers, where X = the number of active Threads. This significantly improves scalability by not overloading the VM with dormant ByteBuffers. As an example, in Grizzly, the keep-alive mechanism is implemented as follows:
- Thread-1: Selector.select()
- Thread-2: Do socketChannel.read() until the HTTP 1.1 request is fully read
- Thread-2: Process the request and send the response
- Thread-2: register the SelectionKey back to the Selector (without SelectionKey.attach())
- Thread-1: Selector.select(). If the SelectionKey is ready, then
- Thread-2: Do socketChannel.read() until the HTTP 1.1 request is fully read
- etc.
As you can see, Thread-2 is not blocked between keep-alive requests. Theoretically, you would probably be able to serve hundreds of requests with only two threads. Not to mention no dormant ByteBuffer, no pending framework objects, etc.
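One way to get a ByteBuffer per Thread is a ThreadLocal. The sketch below is an assumption about how this could be done rather than Grizzly's code, and the class name and the 8 KB size are made up. It only works if a worker finishes reading and processing a request before it moves on to another connection, as in the sequence above, because the buffer is cleared on every use.

```java
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class PerThreadBuffer {
    // Each worker thread lazily gets its own buffer, whatever the connection count.
    private static final ThreadLocal<ByteBuffer> BUFFER =
            ThreadLocal.withInitial(() -> ByteBuffer.allocate(8192));

    /** Returns the calling thread's buffer, cleared and ready for a new read. */
    static ByteBuffer get() {
        ByteBuffer buffer = BUFFER.get();
        buffer.clear();
        return buffer;
    }

    /** Starts threadCount workers and counts how many distinct buffers they used. */
    static int distinctBuffers(int threadCount) {
        // Identity-based set: ByteBuffer.equals() compares content, not instances.
        Set<ByteBuffer> seen = Collections.synchronizedSet(
                Collections.newSetFromMap(new IdentityHashMap<ByteBuffer, Boolean>()));
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                seen.add(get()); // first "request"
                seen.add(get()); // second "request" reuses the same buffer
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }
}
```

With this shape the number of buffers follows the size of the thread pool, not the number of open connections.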
Wow that one was very long. Agree, disagree?.....
Next time I will discuss when it's appropriate to spawn a thread to handle OP_ACCEPT, OP_READ and OP_WRITE, and when it's not.
Comments
-
Hmmm, you bring up some good points, Jean-Francois. Looking into my own code I see that I'm susceptible to this problem. I'll consider your suggestions.
By the way, how tightly is Grizzly oriented towards the HTTP protocol? Could parts of Grizzly be used for another protocol? -Michel
Posted by: michelsantos on June 06, 2006 at 05:03 PM
-
Hi Michel,
Grizzly is not tied to the HTTP protocol. If you look at [1], you can easily write your own StreamAlgorithm implementation that doesn't implement any HTTP logic. Internally, some projects are using Grizzly for TCP-only operations.
Thanks!
-- Jeanfrancois
[1] http://weblogs.java.net/blog/jfarcand/archive/2006/01/introduction_to.html
Posted by: jfarcand on June 06, 2006 at 05:50 PM
-
That does sound interesting. I would like to download the code for the Web Tier to take a closer look. I see from your link above how to view and download individual files. However, I would like to download all of the Web Tier code together.
I tried the main page for the web tier.
Its Download page points to Glassfish Download page.
There, the source code bundles link points to the Glassfish Server Modules page.
The link for Web Tier points back to Step 1.
Should I be downloading the Nightly Builds for all of Glassfish instead?
Posted by: michelsantos on June 07, 2006 at 02:29 AM
-
For the binaries, yes the nightly build is fine. If you want to download the source, you only need to download modules:
appserv-webtier
appserv-http-engine
Just do cvs -d @cvc.dev.java.net:/cvs glassfish/>
Posted by: jfarcand on June 07, 2006 at 07:21 AM
-
Jean-Francois,
Do socketChannel.read() until the HTTP 1.1 request is fully read
I'm not disagreeing that this might perform well, but I'm surprised it works, I found this to be impossible. "Read until request is fully read" doesn't work. When 0 is returned it doesn't mean you've read the full request, it just means you've drained the channel...the client may still have more bytes to write.
So we've just read "HTT"...and the thread goes off to read from a different channel, how do you piece the multiplexed IO back together again? That's where attach() comes in...I cannot figure out how you've solved that problem.
Taylor
Posted by: tcowan on June 07, 2006 at 08:43 PM
-
Jean-Francois, Thanks for your reply. It worked.
Is this documented somewhere? -Michel
Posted by: michelsantos on June 08, 2006 at 08:18 AM
-
Taylor - The way Grizzly does it is by wrapping a ByteBuffer using a ByteBufferInputStream. When ByteBufferInputStream.read(..) is called, under the hood socketChannel.read(byteBuffer) is called. If it returns 0, instead of returning to the main Selector (and doing SelectionKey.attach(ByteBuffer)), the class uses a temporary pool of Selectors to register the channel on. Then the next socketChannel.read(byteBuffer) will be triggered by the temporary Selector, not the main Selector. Just take a look at the full implementation here. It works pretty well :-). -- Jeanfrancois
Posted by: jfarcand on June 08, 2006 at 01:44 PM
-
Michel - The instruction on how to build Grizzly/GlassFish are here. -- Jeanfrancois
Posted by: jfarcand on June 08, 2006 at 01:45 PM
-
Thank you, Jean-Francois.
Posted by: michelsantos on June 08, 2006 at 03:27 PM
-
Hi Jean-François
Your blog is really interesting. I am very impressed by what you do.
Paul
Posted by: paulgreen on June 12, 2006 at 07:06 PM
-
Hi Jean-François, I have seen in a few of your related articles the method call interestOps(0). What does this mean? that you are temporarily not interested in selecting on that key ?
Posted by: jacasey on June 28, 2006 at 08:18 PM
-
Hi jacasey,
Yes. That means until the processing is completed, I'm not interested in processing any new OP_x event. I'm doing this to avoid having several threads working on the same connection.
Posted by: jfarcand on July 06, 2006 at 09:26 AM
-
Sorry if I misunderstood something... But do you really think having a separate thread per long-running read is better than attaching the ByteBuffer holding the incomplete request data?
If we can read all the request data at once, we can pass it to the worker thread for handling and forget about that ByteBuffer (no need to attach it). Otherwise the attachment is cheaper than a new thread in terms of resource consumption.
Also keep in mind that we get a ByteBuffer from the cache, or allocate a new one, every time we need to read a new request/packet/etc., and use the selector thread only for collecting the raw data.
Also I can't understand why you are doing this:
while (count > 0){
count = socketChannel.read(byteBuffer);
}
I think this is a waste of time, because in most cases we should read everything we have at once. If the data is incomplete and something suddenly appears right after our first read(), we will read it after the next call to select(), which will be very soon in that case. And the other channels that are ready for reading will get a fairer chance to be read too.
Posted by: a_ilyin on September 01, 2006 at 06:02 AM
-
It always depends on what you are doing (Grizzly supports both approaches). For HTTP, the message header will most likely be read using a couple of socketChannel reads. The HTTP parser will try to find the end of the header bytes, and then execute the request. If the parser cannot determine the end of the header, rolling back the byteBuffer (and the state of the HTTP parser) and then attaching it to the SelectionKey will give bad performance, because you will then have to reparse the byteBuffer again. It will perform better if you use a temporary Selector to read the missing bytes using the same thread. Of course you don't want to block your thread waiting for bytes, but the cost of rolling back the transaction needs to be carefully calculated. Like I said, having 10 000 attached ByteBuffers is not good. As for the count > 0, you aren't guaranteed all bytes will be read even if the count > 0, so looping is always good. Thanks, Jeanfrancois
Posted by: jfarcand on September 01, 2006 at 09:50 AM
-
Jeanfrancois, thank you for the answer. I absolutely agree that it always depends on what you are doing :-) Really!
But what is the benefit of having 10000 waiting threads instead of 10000 attached ByteBuffers?
Replacing the attachment with a thread doesn't mean you get rid of the need to roll back the parser when the buffer is not yet full. Instead of attaching the ByteBuffer, you can attach some object (a framework object, as you called it before) which encapsulates the ByteBuffer and the current parser state. So there is no more need to roll back the parser!
Also, IMHO it is not a bad idea to do only the read/write in the selector thread and all the processing in the worker threads. In your case the selector thread should only read the bytes into the ByteBuffer, while the parsing and processing should be done in the worker thread. That gives you a chance to control your resources. For example, put the small (too fragmented) requests/packets/queries/etc. into a slow queue, and put large packets, which have a high probability of being parsed at once, into a fast queue, and so on.
As for the count > 0, you aren't guaranteed all bytes will be read even if the count > 0, so looping is always good
We don't need such a guarantee. If something remains, then we get OP_READ again and read it on the next cycle. No problem. Of course, if you spawn a separate thread for reading that channel after the first OP_READ, your practice is reasonable. But such a practice returns us to the old socket problem with OS thread limitations.
So if you read in one "selector thread", then you should do it as fast as possible to guarantee fair channel handling. In this case it is better to read the remaining bytes on the next "select" cycle.
Regards,
Alexander
Posted by: a_ilyin on September 03, 2006 at 03:40 AM
-
Alexandre, you don't necessarily have 10000 waiting threads. In Grizzly, you have one ByteBuffer per Thread, which means you are always low on ByteBuffers. The current parser state will hold some data, right? Like a byte[] or ByteBuffer....this is exactly the problem this blog talks about. You can always attach to the SelectionKey, but you have to make sure your objects are small if you care about scalability. Also, doing the read/write on the Selector thread never gave me (for HTTP) good results. I really think the read and the write need to execute on their own thread, allowing the main Selector thread to continue accepting requests without blocking for reads (even if they are small). Thanks for the feedback, as usual :-)
Posted by: jfarcand on September 14, 2006 at 11:01 AM
-
Determining whether an HTTP request is fully read is not only related to the NIO non-blocking API.
I am just wondering how the original blocking socket API deals with this problem.
socket.getInputStream().read(bytes)
will return when "end of file" bits are detected.
If there is an "end of file" marker showing up in a TCP request (I am not familiar with the details of the TCP protocol), why does NIO need a parser to decide the "end of request"?
Posted by: yuwang881 on November 05, 2006 at 01:14 AM
-
I think your points are primarily related to protocols such as HTTP that do not deal with persistent connections, or at least do so on a limited basis. No matter which way you cut the cookie, frameworks that deal exclusively with persistent connections will need to manage some sort of handler object per connection. Whether you attach that to selection keys or manage them separately does not significantly change memory usage, and solving the problem with additional threads only serves to add context switching load.
Posted by: remonvv on October 30, 2007 at 05:21 AM