Making a JMX connection with a timeout

Posted by emcmanus on May 23, 2007 at 4:23 PM EDT

One question I encounter frequently about the JMX Remote API is how to reduce the time taken to notice that a remote machine is dead when making a connection to it. The default timeout is typically a couple of minutes! Here's one way to do it.

Probably the cleanest technique for connection timeouts in general is to set a connection timeout on the socket. The idea is that instead of using...

Socket s = new Socket(host, port);

...you use...

SocketAddress addr = new InetSocketAddress(host, port);
Socket s = new Socket();
s.connect(addr, timeoutInMilliSeconds);

The problem is that this is at a rather low level. If you're making connections with the JMX Remote API you usually don't see Socket objects at all. It's still possible to use this technique, but it requires a certain amount of fiddling, and the particular fiddling you need depends on which connector protocol you are using.
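For reference, here is the socket-level technique as a small self-contained program. It is only a sketch: the class and method names are mine, it assumes Java 7 or later (for try-with-resources and InetAddress.getLoopbackAddress), and it connects to a throwaway local listener rather than a real JMX port.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class SocketConnectTimeout {
    /** Opens a TCP connection, giving up after timeoutMillis. */
    static Socket connect(String host, int port, int timeoutMillis) throws IOException {
        Socket s = new Socket();
        try {
            // Throws SocketTimeoutException if the timeout expires first
            s.connect(new InetSocketAddress(host, port), timeoutMillis);
            return s;
        } catch (IOException e) {
            s.close();  // don't leave the unconnected socket lying around
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        // A local listener on an ephemeral port, so the example needs no remote machine
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket s = connect(loopback.getHostAddress(), server.getLocalPort(), 2000)) {
            System.out.println("connected: " + s.isConnected());
        }
    }
}
```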

A lot of the time, a much simpler and more general technique is applicable. You simply create the connection in another thread, and you wait for that thread to complete. If it doesn't complete before your timeout, you just abandon it. It might still take two minutes to notice that the remote machine is dead, but in the meantime you can continue doing other things.

If you're making a lot of connections to a lot of machines, you might want to think twice about abandoning threads, because you might end up with a lot of them. But in the more typical case where you're just making one connection, this technique may well be for you.

Assuming you're using at least Java SE 5, you'll certainly want to use java.util.concurrent to manage the thread creation and communication. There are a few ways of doing it, but the easiest is probably a single-thread executor.

The method below allows you to connect to a given JMXServiceURL with a timeout of five seconds like this:

JMXConnector jmxc = connectWithTimeout(jmxServiceURL, 5, TimeUnit.SECONDS);

My first cut at the problem

In my first version of this entry, I proposed a solution with the following outline.

JMXConnector connectWithTimeout(JMXServiceURL url, long timeout, TimeUnit unit) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    Future<JMXConnector> future = executor.submit(new Callable<JMXConnector>() {
        public JMXConnector call() {
            return JMXConnectorFactory.connect(url);
        }
    });
    return future.get(timeout, unit);
}

Half an hour after posting, I suddenly realised that this version is incorrect. It reminds me of the saying that for every complex problem there is a solution that is simple, obvious, and wrong.

This solution does the right thing when the connection succeeds within the time limit, and also in the case of the problem we are trying to solve, where it takes a very long time to fail. But if the connection succeeds after the time limit, the caller will already have returned, and we'll have made a connection that nobody knows about!
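The leak is easy to reproduce without any JMX at all. The sketch below has the same shape as the first outline, but with a made-up slowConnect method standing in for JMXConnectorFactory.connect and a counter standing in for open connectors; the delays are arbitrary and the names are hypothetical. With these delays the caller should time out first, and the demo should then report one connection left open that nobody holds a reference to.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

final class LateResultLeakDemo {
    // Hypothetical stand-in for open JMXConnectors: counts "connections" that were opened
    static final AtomicInteger openConnections = new AtomicInteger();

    static Object slowConnect() throws InterruptedException {
        Thread.sleep(500);                  // simulated slow connection attempt
        openConnections.incrementAndGet();  // the "connection" is now open
        return new Object();
    }

    /** Returns how many connections were left open after the caller timed out. */
    static int leakedAfterTimeout() throws Exception {
        openConnections.set(0);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Callable<Object> task = LateResultLeakDemo::slowConnect;
        Future<Object> future = executor.submit(task);
        boolean timedOut = false;
        try {
            future.get(50, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            timedOut = true;  // the caller gives up here, but the task carries on
        }
        executor.shutdown();
        // Waiting here is only so that the demo can observe what the task did
        executor.awaitTermination(5, TimeUnit.SECONDS);
        return timedOut ? openConnections.get() : 0;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("connections left open: " + leakedAfterTimeout());
    }
}
```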

The second attempt

This is the outline of my second attempt, which I believe is correct. There are several refinements we'll need to apply before having a solution that actually works.

// This is just an outline: the real code appears later
JMXConnector connectWithTimeout(JMXServiceURL url, long timeout, TimeUnit unit) {
    final BlockingQueue<Object> mailbox = new ArrayBlockingQueue<Object>(1);
    final ExecutorService executor = Executors.newSingleThreadExecutor();
    executor.submit(new Runnable() {
        public void run() {
            JMXConnector connector = JMXConnectorFactory.connect(url);
            if (!mailbox.offer(connector))
                connector.close();
        }
    });
    Object result = mailbox.poll(timeout, unit);
    if (result == null) {
        if (!mailbox.offer(""))
            result = mailbox.take();
    }
    return (JMXConnector) result;
}

To understand how and why this works, notice that exactly one object always gets posted to the mailbox. There are three cases:

  • If the connection attempt finishes before the timeout, then the connector object will be posted to the mailbox and returned to the caller.
  • If the timeout happens, then the main thread will try to stuff the mailbox with an arbitrary object (here the empty string, but any object would do), so the connection thread will realise it has connected too late and close the newly-made connection.
  • If the timeout happens at exactly the same time as the connection is made, then the main thread may find that the mailbox is already full, in which case it again picks up the connector object and returns it.
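The first two cases can be exercised with a simulated connection. This is a sketch under assumptions, not the JMX code: FakeConnection, TIMED_OUT and the delay parameters are invented for the illustration, and it uses a plain daemon thread and a Java 8 lambda to stay short. The third case depends on exact timing, so the demo does not try to provoke it, but the code path for it is the same as in the outline.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class MailboxHandoffDemo {
    /** Hypothetical stand-in for a connector; it only records whether it was closed. */
    static final class FakeConnection implements AutoCloseable {
        final AtomicBoolean closed = new AtomicBoolean();
        @Override public void close() { closed.set(true); }
    }

    // Marker posted by the waiting thread on timeout (the outline uses the empty string)
    static final Object TIMED_OUT = new Object();

    /** Returns conn if it "connects" within the timeout, otherwise null. */
    static FakeConnection connect(final FakeConnection conn, final long connectMillis,
                                  long timeoutMillis) throws InterruptedException {
        final BlockingQueue<Object> mailbox = new ArrayBlockingQueue<Object>(1);
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(connectMillis);  // simulated connection delay
            } catch (InterruptedException e) {
                return;
            }
            if (!mailbox.offer(conn))
                conn.close();                 // mailbox already full: we connected too late
        });
        worker.setDaemon(true);
        worker.start();

        Object result = mailbox.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (result == null && !mailbox.offer(TIMED_OUT))
            result = mailbox.take();          // the connection arrived just in time after all
        return (FakeConnection) result;
    }

    public static void main(String[] args) throws InterruptedException {
        FakeConnection fast = new FakeConnection();
        System.out.println("fast returned: " + (connect(fast, 0, 5000) == fast));
        FakeConnection slow = new FakeConnection();
        System.out.println("slow returned: " + (connect(slow, 500, 20) != null));
        Thread.sleep(1000);                   // give the late worker time to clean up
        System.out.println("slow closed by worker: " + slow.closed.get());
    }
}
```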

Making it work

The code above is just an outline, and leaves out some necessary details. We need to refine it in several ways to make it work.

The first refinement we'll need is exception handling. The result of the connection attempt could be an exception instead of a JMXConnector. This doesn't change the reasoning above, but it does complicate the code.

The main thread calls BlockingQueue.poll, which can throw InterruptedException, so we must handle that.

About half of the final version of connectWithTimeout involves footering about with exceptions. It's times like this that I'm inclined to join the checked-exception-haters.

The second refinement is to clean up the connect thread when we're finished with it. The outline code doesn't call shutdown() on the ExecutorService, so every time connectWithTimeout is called, a new single-thread executor is created, and therefore a new thread. If you're lucky, the garbage-collector will pick up your executors and their threads at some stage, but you don't want to depend on luck.

A more subtle point about threads is that the outline code will create non-daemon threads. Your application will not exit when the main thread exits if there are any non-daemon threads. So as written, if you have a thread stuck in a connection attempt and your application is otherwise finished, it will stay around until the connection attempt finally times out. That's pretty much exactly the sort of thing we're trying to avoid. So we'll need to arrange to create a daemon thread instead.

All right, so here's the real code.

    public static JMXConnector connectWithTimeout(
            final JMXServiceURL url, long timeout, TimeUnit unit)
            throws IOException {
        final BlockingQueue<Object> mailbox = new ArrayBlockingQueue<Object>(1);
        ExecutorService executor =
                Executors.newSingleThreadExecutor(daemonThreadFactory);
        executor.submit(new Runnable() {
            public void run() {
                try {
                    JMXConnector connector = JMXConnectorFactory.connect(url);
                    if (!mailbox.offer(connector))
                        connector.close();
                } catch (Throwable t) {
                    mailbox.offer(t);
                }
            }
        });
        Object result;
        try {
            result = mailbox.poll(timeout, unit);
            if (result == null) {
                if (!mailbox.offer(""))
                    result = mailbox.take();
            }
        } catch (InterruptedException e) {
            throw initCause(new InterruptedIOException(e.getMessage()), e);
        } finally {
            executor.shutdown();
        }
        if (result == null)
            throw new SocketTimeoutException("Connect timed out: " + url);
        if (result instanceof JMXConnector)
            return (JMXConnector) result;
        try {
            throw (Throwable) result;
        } catch (IOException e) {
            throw e;
        } catch (RuntimeException e) {
            throw e;
        } catch (Error e) {
            throw e;
        } catch (Throwable e) {
            // In principle this can't happen but we wrap it anyway
            throw new IOException(e.toString(), e);
        }
    }

    private static <T extends Throwable> T initCause(T wrapper, Throwable wrapped) {
        wrapper.initCause(wrapped);
        return wrapper;
    }

    private static class DaemonThreadFactory implements ThreadFactory {
        public Thread newThread(Runnable r) {
            Thread t = Executors.defaultThreadFactory().newThread(r);
            t.setDaemon(true);
            return t;
        }
    }
    private static final ThreadFactory daemonThreadFactory = new DaemonThreadFactory();

The initCause method is only used once but it's handy to have around for those troublesome exceptions that don't have a Throwable cause parameter.

I think it would be awfully nice if java.util.concurrent supplied DaemonThreadFactory rather than everyone having to invent it all the time.

Shouldn't this be simpler?

I admit I'm a bit uncomfortable with the code here. I'd be happier if I didn't need to reason about it in order to convince myself that it's correct. But I don't see any simpler way of using the java.util.concurrent API to achieve the same effect. Uses of cancel or interrupt tend to lead to race conditions, where the task can be cancelled after it has already delivered its result, and again we can get a JMXConnector leak; or we might close a JMXConnector that the main thread is about to return. I'd be interested in suggestions for simplification.
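One possible direction, offered as a sketch rather than a tested replacement: on Java 8 or later, which did not exist when this entry was written, CompletableFuture lets the waiting thread attach a cleanup action after it has given up. The action runs immediately if the result has already arrived and otherwise runs when it does arrive, so a late connection gets closed either way, and a connection is only ever returned on the path where no cleanup was attached. The class name, the method name and the generic AutoCloseable signature below are assumptions made for the example; for JMX the Callable would wrap JMXConnectorFactory.connect(url), and the Executor would need to create daemon threads for the reason given earlier.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimeoutWithCleanup {
    /**
     * Runs connectAction on the executor and returns its result if it is ready in time.
     * On timeout or interrupt, arranges for a late result to be closed and rethrows.
     */
    static <C extends AutoCloseable> C callWithTimeout(
            Callable<C> connectAction, Executor executor, long timeout, TimeUnit unit)
            throws Exception {
        final CompletableFuture<C> future = new CompletableFuture<>();
        executor.execute(() -> {
            try {
                future.complete(connectAction.call());
            } catch (Throwable t) {
                future.completeExceptionally(t);
            }
        });
        try {
            // A failed connection attempt surfaces here as an ExecutionException
            return future.get(timeout, unit);
        } catch (TimeoutException | InterruptedException e) {
            // Runs now if the result is already there, otherwise when it arrives
            future.whenComplete((conn, failure) -> {
                if (conn != null) {
                    try {
                        conn.close();
                    } catch (Exception ignored) {
                        // nobody is left to report this to
                    }
                }
            });
            throw e;
        }
    }
}
```

Whether this is really simpler is debatable, since the reasoning has moved into the guarantees of whenComplete rather than disappeared.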

Conclusion of the foregoing

This is a useful technique in many cases, subject to the caution above. It's not limited to the JMX Remote API, either; you might use it when accessing a remote web service or EJB or whatever, without having to figure out how to get hold of the underlying Socket so you can set its timeout.

My thanks to Sébastien Martin for the discussion that led to this entry.


Comments

I think there's still a possibility for a connection to leak.
If the main thread is interrupted while waiting, I think we then also need to signal the connecting thread to close the connection:

} catch (InterruptedException e) {
    mailbox.offer("");
    throw initCause(new InterruptedIOException(e.getMessage()), e);
} finally {

tcp: connection reset

I run a JMX service embedded in my application. A single client connects using the jmxmp protocol. The client has to manage a large number of JMX connections and has run into issues with the entire application hanging due to some resources being offline and the JMX connection hanging due to the inability to set a timeout.

The client has opted for your first, low-level solution, which creates a raw socket with a timeout to determine whether the server is awake before making the JMX connection.

The problem with this is that my service is seeing tcp connection reset exceptions when the client closes the socket. Also, it looks like the exception is thrown and caught within the JVM so there is no way for me to suppress the stack trace from showing up on our production log files.

Two questions:

#1 - Should I believe the client when they say they are closing the socket cleanly?
#2 - Is there any way to suppress the exception?

Sep 8, 2010 16:46:20.147 GenericConnectorServer ClientCreation.run: WARNING: Failed to open connection: java.net.SocketException: Connection reset
java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:168)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2266)
    at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2279)
    at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2750)
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:780)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:280)
    at com.sun.jmx.remote.socket.SocketConnection$ObjectInputStreamWithLoader.<init>(SocketConnection.java:354)
    at com.sun.jmx.remote.socket.SocketConnection.readMessage(SocketConnection.java:204)
    at com.sun.jmx.remote.opt.security.AdminServer.connectionOpen(AdminServer.java:76)
    at com.sun.jmx.remote.generic.ServerSynchroMessageConnectionImpl.connect(ServerSynchroMessageConnectionImpl.java:51)
    at javax.management.remote.generic.GenericConnectorServer$ClientCreation.run(GenericConnectorServer.java:383)
    at com.sun.jmx.remote.opt.util.ThreadService$ThreadServiceJob.run(ThreadService.java:208)
    at com.sun.jmx.remote.opt.util.JobExecutor.run(JobExecutor.java:59)

tcp: connection reset

If I understand what you are saying, the client first opens a socket connection with a timeout, and if that succeeds it immediately closes the connection and makes a JMXMP connection to the same port. The exception log you are seeing is because the server reacts to the first connection expecting it to be a genuine client, but then that connection is closed before the expected opening traffic arrives. If the second approach I described here is possible for you, then it will avoid that problem (except in the unusual case where the connection succeeds just after the client decides to give up and close it). That is, when the remote port is available, you just connect to it, and use the connection to do your work; and when it's not available, you abandon the thread that is trying to connect. But it does mean that when the connection times out, you get stuck with a thread and an open socket until the TCP layer decides to time out.

The approach of pinging with a dummy connection first is not completely safe, in that the remote machine could become unavailable after the ping succeeds and before you establish the real connection, so you end up with the full long timeout. If you want to go ahead with it, then it should be *possible* to shut the log messages up with logger configuration, for example on the lines suggested at blogs.sun.com/jmxetc/entry/tracing_jmx_what_s_going , but most likely only by shutting up all javax.management.remote messages. But I think you should probably just be using InetAddress.isReachable in this case.
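As a rough illustration of the logger configuration idea, the following sketch turns off a java.util.logging logger programmatically on the server side. The logger name "javax.management.remote" is an assumption based on the package mentioned above; I have not checked which logger names the JMXMP connector in jmxremote_optional.jar actually uses, so treat it as a starting point. The class name is made up.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class QuietJmxRemoteLogging {
    // Keep a strong reference: java.util.logging holds loggers weakly, so a level
    // set on a logger that nothing else references can be lost after garbage collection.
    private static final Logger JMX_REMOTE = Logger.getLogger("javax.management.remote");

    /** Silences this logger and any child loggers that inherit their level from it. */
    static void silence() {
        JMX_REMOTE.setLevel(Level.OFF);
    }
}
```

The same effect should be obtainable with a line such as javax.management.remote.level = OFF in the logging properties file, again assuming that logger name is the right one. Either way this is a blunt instrument, because it also hides messages you might want to see.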

Ideally you would want to intervene at the point where the JMXMP client creates a socket, so you can give a timeout for the connection right there. It's possible to do that by supplying an implementation of the MessageConnection interface with the "jmx.remote.message.connection" parameter in the environment map for JMXConnectorFactory.connect. The object you supply could be a subclass of com.sun.jmx.remote.socket.SocketConnection that calls that class's (Socket,ClassLoader) constructor with a Socket that has been made using the timeout. But that's some serious black magic.

tcp: connection reset

If I understand you correctly, the last paragraph in your reply would be implemented on the client side, correct? I like the solution and I think the client would be willing to implement it, but it appears that the SocketConnection class is implemented in jmxremote_optional.jar. I could not find any javadocs for this class, nor can I locate the source code, so I'm not sure how I or my client would begin to extend the class.

As an alternative, is there a byte sequence the client can send to fool the MBean server into thinking it is a genuine JMXMP connection before closing the raw socket?

InetAddress.isReachable

I still think that the best solution is just to use InetAddress.isReachable.

You could try to mimic the JMXMP handshake sequence (which is detailed in the JSR 160 spec), but if the server has configured security properly that will probably be a tough job.

InetAddress.isReachable

InetAddress.isReachable will tell you if the host is reachable but, unfortunately, won't be able to tell you that the JMXMP listener is up.