Jean-Francois Arcand

Jean-Francois Arcand's Blog

Tricks and Tips with NIO part I: Why you must handle OP_WRITE

Posted by jfarcand on May 30, 2006 at 05:34 PM | Comments (18)

I'm getting a lot of questions about NIO in general, and since most of them apply beyond HTTP handling, I've decided to blog about my experience with NIO in Grizzly. What I've observed and measured might not apply to every NIO-based server implementation, but I suspect it covers the majority of them. There isn't much documentation about NIO anyway (apart from basic tutorials), so it can't hurt to blog about it, whether I'm right or wrong. But first, I recommend reading an NIO tutorial (if you don't know what NIO is) before reading this blog :-).

When building a scalable NIO server, you always have to handle three important NIO operation-set bits:
  • OP_ACCEPT: Operation-set bit for socket-accept operations
  • OP_READ: Operation-set bit for read operations
  • OP_WRITE: Operation-set bit for write operations
Handling OP_ACCEPT and OP_READ has been well documented in several NIO tutorials. Strangely, OP_WRITE is sometimes not described at all. Not handling OP_WRITE correctly can make your server perform pretty badly, and on win32 it can produce disastrous performance problems, like freezing the OS by eating all the CPU. How come? Well, let's start with an example. Usually, you will write to a SocketChannel by doing:


     while ( bb.hasRemaining() ) {
        int len = socketChannel.write(bb);
        if (len < 0){
           throw new EOFException();
        } 
     }

This code will work most of the time... until the Selector on which the SocketChannel has been registered is exhausted, i.e. the Selector isn't able to let the socketChannel flush the content of the ByteBuffer, which means:


     while ( bb.hasRemaining() ) {
        // *** socketChannel.write(bb) will always return 0
        int len = socketChannel.write(bb);       
        if (len < 0){
           throw new EOFException();
        } 
     }
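
If you want to see the zero-length writes for yourself, here is a minimal, self-contained sketch (it is not Grizzly code; the class and method names are made up for illustration). It assumes the usual cause of the problem: the peer is connected but never reads, so the socket's outgoing buffer fills up and a non-blocking write() starts returning 0.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class, not part of Grizzly.
class ZeroWriteDemo {

    /**
     * Writes to a non-blocking channel whose peer never reads, and returns
     * the number of bytes accepted before write() first returned 0.
     */
    static long bytesWrittenBeforeZero() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Bind to an ephemeral port on the loopback interface.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel peer = server.accept()) { // accepted, but never read from
                client.configureBlocking(false);
                ByteBuffer bb = ByteBuffer.allocate(64 * 1024);
                long total = 0;
                while (true) {
                    bb.clear(); // pretend the buffer is full of fresh data again
                    int len = client.write(bb);
                    if (len == 0) {
                        return total; // nothing was accepted: the buffers are full
                    }
                    total += len;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("write() returned 0 after "
                + bytesWrittenBeforeZero() + " bytes");
    }
}
```

The exact byte count depends on the operating system's socket buffer sizes. Once that point is reached, the loop above keeps calling write() and keeps getting 0 back.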

Hence the CPU will be consumed by looping over and over, producing disastrous performance problems (try it on win32 :-)). OK, but what can we do? There are several ways of handling this. In GlassFish, Grizzly uses a pool of temporary Selectors and registers the SocketChannel on one of them:


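        // State used by the write loop below.
        SelectionKey key = null;
        Selector writeSelector = null;
        int attempts = 0;
        try {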
            while ( bb.hasRemaining() ) {
                int len = socketChannel.write(bb);
                attempts++;
                if (len < 0){
                    throw new EOFException();
                }

                if (len == 0) {
                    if ( writeSelector == null ){
                        writeSelector = SelectorFactory.getSelector();
                        if ( writeSelector == null){
                            // Continue using the main one.
                            continue;
                        }
                    }

                    key = socketChannel.register(writeSelector, SelectionKey.OP_WRITE);

                    if (writeSelector.select(30 * 1000) == 0) {
                        if (attempts > 2)
                            throw new IOException("Client disconnected");
                    } else {
                        attempts--;
                    }
                } else {
                    attempts = 0;
                }
            }
        } finally {
            if (key != null) {
                key.cancel();
                key = null;
            }

            if ( writeSelector != null ) {
                // Flush the key.
                writeSelector.selectNow();
                SelectorFactory.returnSelector(writeSelector);
            }
        }

If the main Selector is exhausted, a temporary Selector will be used to handle the OP_WRITE. In Grizzly, since OP_READ also uses the pool of Selectors, the pool might return null. In that case, the main Selector is re-used instead, as the Mina framework does. The Jetty Web server seems to create a temporary Selector lazily (I don't know Jetty well enough to know the life cycle of SocketChannelOutputStream, but I suspect this object is recycled amongst requests so there is no unbounded Selector creation). EmberIO doesn't seem to handle OP_WRITE at all, which is surprising given the popularity of this framework.
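
To make the "pool might return null" behaviour concrete, here is a rough sketch of what such a pool could look like. It is only a stand-in for the SelectorFactory used in the listing above: the class name TemporarySelectorPool and its fixed-size design are my assumptions for illustration, and the real Grizzly class may well differ.

```java
import java.io.IOException;
import java.nio.channels.Selector;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-in for a SelectorFactory-style pool; not the Grizzly source.
class TemporarySelectorPool {
    private final ConcurrentLinkedQueue<Selector> idle = new ConcurrentLinkedQueue<>();

    /** Pre-opens up to maxSelectors Selectors; the pool never grows afterwards. */
    TemporarySelectorPool(int maxSelectors) {
        for (int i = 0; i < maxSelectors; i++) {
            try {
                idle.add(Selector.open());
            } catch (IOException e) {
                break; // keep whatever could be opened
            }
        }
    }

    /** Returns an idle Selector, or null when all of them are in use. */
    Selector getSelector() {
        return idle.poll();
    }

    /** Hands a Selector back once its keys have been cancelled and flushed. */
    void returnSelector(Selector selector) {
        idle.offer(selector);
    }
}
```

Because the pool is bounded, a caller has to be ready for null and fall back to something else, which is what the `continue` branch in the write loop does.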

Another alternative is to use a ThreadLocal to store a temporary Selector. Unfortunately, the benchmarks I ran showed that this approach is slower than using a pool of temporary Selectors.
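
For reference, the ThreadLocal variant can be sketched as follows. This is an illustration only, not the code I benchmarked: the ThreadLocalSelector class is a made-up name, and it uses ThreadLocal.withInitial, which needs Java 8 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

// Hypothetical helper illustrating the ThreadLocal variant; not Grizzly code.
class ThreadLocalSelector {
    private static final ThreadLocal<Selector> TEMP = ThreadLocal.withInitial(() -> {
        try {
            return Selector.open(); // opened lazily, once per thread
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    });

    /** Returns the calling thread's private Selector, opening it on first use. */
    static Selector get() {
        return TEMP.get();
    }
}
```

With this approach every worker thread keeps its own Selector open for as long as the thread lives, instead of borrowing one and handing it back.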

That's it. Next time I will discuss the evil method SelectionKey.attach(), a gold candidate for memory leaks.

Comments
Comments are listed in date ascending order (oldest first)

  • What do you mean by a Selector to be exhausted? Actually, I don't understand your code examples at all. Where does the selector come into play in your first while loop? Shouldn't you actually do a select on the writeSelector in the last example?

    Posted by: mernst on May 31, 2006 at 05:56 AM

  • I found this to be a nice overview (including OP_WRITE) for anyone who needs an introduction to NIO:
    Scalable IO in Java

    Posted by: esmith on May 31, 2006 at 06:03 AM

  • Thanks mernst! I did the blog late and my last cut and paste was missing the last couple of lines :-) I've updated with all the logic! Thanks!

    > What do you mean by a Selector to be exhausted?

    I mean the Selector is not able to let the socketChannel write its buffer. When this happens, socketChannel.write(bb) will return 0, meaning no bytes were written.

    Posted by: jfarcand on May 31, 2006 at 07:08 AM

  • That makes more sense :-) Still two questions:

    Listing 1: I don't see how "selector exhaustion" has something to do with socketChannel.write(buffer) returning 0. There is no selector involved in that loop, right?

    Listing 3: When you're using the temporary selector in the writer here you essentially dedicate the current thread to the single connection, just as if you were using the socket in blocking mode with timeout 30s. Shouldn't you rather register an interest in OP_WRITE with the main selector and return this thread to the pool? Or is this an optimization to avoid context switching because you expect the socket to be ready soon enough?

    Thanks Matthias

    Posted by: mernst on May 31, 2006 at 09:22 AM

  • Hi Matthias,
    > There is no selector involved in that loop, right?
    Right, the socketChannel is by default registered with the main Selector (see below).

    > Shouldn't you rather register an interest in OP_WRITE
    with the main selector and return this thread to the pool?

    When the socketChannel is unable to write, I mean the main Selector is unable to properly handle the OP_WRITE. Since the main Selector is most probably (at least it is in Grizzly) running on another thread, you don't want to switch back to that thread to finish flushing the response (and keep the incomplete ByteBuffer attached to the SelectionKey or stored somewhere, waiting for the OP_WRITE to happen). Hence you most probably try to write, and if that doesn't work, you register your socketChannel with another dedicated Selector. But you are right, you can always do (assuming you have a reference to the SelectionKey and the main Selector):

    while ( bb.hasRemaining() ) {
        int len = socketChannel.write(bb);
        if (len < 0){
            throw new EOFException();
        }

        if (len == 0) {
            selectionKey.interestOps( selectionKey.interestOps() |
                                      SelectionKey.OP_WRITE);
            mainSelector.wakeup();
            break;
        }
    }

    But this probably means you have to switch to another thread, persist the ByteBuffer state, etc., which I'm convinced will not perform well.
    > Or is this an optimization to avoid context switching
    because you expect the socket to be ready soon enough?
    Exactly.

    Posted by: jfarcand on May 31, 2006 at 09:58 AM

  • Isn't SocketChannel.register incredibly slow? At least in 1.4.0 days, I found SelectionKey.interestOps many times faster.

    Also, IIRC, SocketChannel.register will block if the selector in question is, say, in select. It was some time ago, but I think what I came up with was a guard lock (I'm not guaranteeing this code is either correct or performant):

    // In thread wanting to change selection key ops.

    synchronized (lock) {
        // Prevent immediate re-entry of select.
        dontSelect = true;
        try {
            if (inSelect) {
                // Make sure will awaken.
                selector.wakeup();
            }
            // Do the deed.
            key = channel.register(selector, SelectionKey.OP_WRITE);
        } finally {
            // Notify selector thread, it can select again.
            dontSelect = false;
            lock.notify();
        }
    }

    ...

    // In the one thread wanting to select until it has something less boring to do instead.

    // If we are busy, go fast.
    int num = selector.selectNow();
    if (num != 0) {
        return num;
    }
    for (;;) {
        synchronized (lock) {
            while (dontSelect) {
                lock.wait();
            }
            inSelect = true;
        }
        try {
            // Reuse num; declaring it again in this scope would not compile.
            num = selector.select();
            if (num != 0) {
                return num;
            }
        } finally {
            synchronized (lock) {
                inSelect = false;
                lock.notifyAll();
            }
        }
    }

    Posted by: tackline on May 31, 2006 at 01:32 PM

  • Oops, with a single lock, that notify should be a notifyAll.

    Posted by: tackline on May 31, 2006 at 01:33 PM

  • Maybe it was slow in 1.4, but now I think there is no performance problem :-) Also, SelectionKey.interestOps only changes the interest set on the Selector the key already belongs to, so it does not let you register on a temporary Selector. As for registration on the main Selector, most frameworks register the key just before doing a select(). As an example, Grizzly handles the OP_READ registration by doing:


    public void registerKey(SelectionKey key){
        if ( key == null ) return;
        // add SelectionKey & Op to list of Ops to enable
        keysToEnable.add(key);
        // tell the Selector Thread there's some ops to enable
        selector.wakeup();
        // wakeup() will force the SelectorThread to bail out
        // of select() to process your registered request
    }

    and in the main Selector loop:

    enableSelectionKeys();
    try{
        selectorState = selector.select(selectorTimeout);
        ....
    }
    [......]

    public void enableSelectionKeys(){
        SelectionKey selectionKey;
        int size = keysToEnable.size();
        for (int i=0; i < size; i++) {
            selectionKey = keysToEnable.poll();
            selectionKey.interestOps(
                selectionKey.interestOps() | SelectionKey.OP_READ);
        }
    }

    Without the need for a lock (but with a ConcurrentLinkedQueue in that case).


    Posted by: jfarcand on May 31, 2006 at 01:58 PM

  • I think maybe something is wrong with the logic in your blog. You write "the Selector isn't able to let the socketChannel flush the content of the ByteBuffer"

    But don't you actually mean "The socket's outgoing buffer is full, hence all future writes will return 0 until the remote client reads"?

    Posted by: cowwoc on June 01, 2006 at 05:55 AM

  • Agreed, I should have used French instead so at least I would have been clear :-) I would say the socket's outgoing buffer is full, hence all future writes will return 0. It's not only a client that isn't reading the bytes; a slow network can also produce such behaviour.

    Thanks!!

    Posted by: jfarcand on June 01, 2006 at 08:01 AM

  • Perhaps I don't fully understand the issues, but I had always assumed OP_WRITE wasn't dealt with because it's virtually useless.

    Even with a state-machine multiplexed reader, at some point the server has consumed and understood a request... at that point a thread is used to handle the request. The response can then be dealt with using normal blocking IO. How am I missing the point?

    Taylor

    Posted by: tcowan on June 01, 2006 at 11:15 AM

  • Taylor, if you have a slow client (mobile phone, dial-up), you might not be able to push out the response body fast enough. In such a case, you wouldn't want to waste a thread blocking in 'write()' but would instead dedicate it to something else until the socket is ready to write again. I have no real-world experience, but it seems that this might not be such a problem nowadays. I just remember that many years ago, a popular German website stated this as the surprising number one problem when they redesigned their site: clients being too slow to pick up the responses and handlers piling up. - Matthias

    Posted by: mernst on June 01, 2006 at 11:24 AM

  • Matthias, I agree with this last point from experience. I also endured some problems related to slow connections between a "server" and a "client". In fact, I would seriously consider implementing some mechanism such that a throttled connection would be a normal part of some unit-testing for NIO software.

    Does anyone else have some ideas on how to implement this through software? I was only able to test this using a hardware connection that was known to be slow and spurious. (I suppose that you could insert a proxy that would collect the data, break it up into random chunks, and place random "pauses" between sending the data chunks ...)

    For whatever it's worth, here are some bugs/issues that helped me with poor connections:

    int iSelectKeysAvailable = selector.select(2000);

    In one case, it was appropriate for my code to select() keys for a fixed amount of time. Check for 0 keys as the slow client may not have responded yet. Check for -1 in case the poor connection has completely died. Put a cap on the total time to wait so as to not waste resources on a poor connection.


    int iBytesRead = channel.read(byteBuffer);

    Scenario: A connection is newly established from a "client" eager to send data. The "server's" selection key is readable, but reading the channel returns -1. This indicates that the client connection has died or been terminated (this is true regardless of whether the connection is new). However, if reading the channel returns 0, then the incoming data has not yet completely passed over the poor connection. Wait an "appropriate" amount of time for dribs and drabs of the data to arrive. If the amount of time that it takes to receive the expected data takes "too long", then consider terminating the connection. (If more data than expected arrives [due to a bad client], also consider terminating the connection.)


    Jean-Francois, I look forward to reading Part 2. -Michel

    Posted by: michelsantos on June 04, 2006 at 08:01 AM

  • Hi Jean, thanks for your Grizzly, I'm learning NIO (network).
    Soooooooooooorry for my English, but I'm sure that my Chinese is perfect. :-)
    Is there a defect in your example, or in the javadoc? :)
    while ( bb.hasRemaining() ) {
        int len = socketChannel.write(bb);
        if (len < 0){
            /** By checking the source of SocketChannelImpl.java,
             * the following code is, in fact, not reachable.
             * The value of len ranges from 0 to LongMax; maybe this example is not correct,
             * or the javadoc should state the range of that value.
             */
            throw new EOFException();
        }
    }

    Posted by: qinxian on January 27, 2007 at 12:00 PM

  • oops, I mean in java5:)

    Posted by: qinxian on January 27, 2007 at 12:09 PM

  • Hi qinxian, Can you clarify your question? Thanks.

    Posted by: jfarcand on February 07, 2007 at 01:11 PM


  • while ( bb.hasRemaining() ) {
        int len = socketChannel.write(bb);
        if (len < 0){ // <--- my question is about this check
            throw new EOFException();
        }
        /**
         * By checking the source of SocketChannelImpl.java, I think
         * the throw above is, in fact, not reachable.
         * The value of len ranges from 0 to LongMax.
         * Maybe this example is not correct,
         * or the javadoc should state the range of that value.
         */

    }

    Posted by: qinxian on March 21, 2007 at 03:22 AM



