Making a JMX connection with a timeout

Posted by emcmanus on May 23, 2007 at 1:23 PM PDT

One question I encounter frequently about the JMX Remote API is
how to reduce the time taken to notice that a remote machine is
dead when making a connection to it. The default timeout is
typically a couple of minutes! Here's one way to do it.

Probably the cleanest technique for connection timeouts in
general is to set a connection timeout on the socket. The idea is
that instead of using...

Socket s = new Socket(host, port);

...you use...

SocketAddress addr = new InetSocketAddress(host, port);
Socket s = new Socket();
s.connect(addr, timeoutInMilliSeconds);

The problem is that this is at a rather low level. If you're
making connections with the JMX Remote API you usually don't see
Socket objects at all. It's still possible to use this
technique, but it requires a certain amount of fiddling, and the
particular fiddling you need depends on which connector protocol
you are using.
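To give an idea of the fiddling involved, here is a minimal sketch for the RMI connector only. It assumes the server did not export its objects with a custom client socket factory, in which case RMI creates client sockets through the JVM-wide RMISocketFactory. The class name TimeoutRmiSocketFactory is invented for this example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.rmi.server.RMISocketFactory;

// Hypothetical helper: an RMISocketFactory whose client sockets use a connect timeout.
public class TimeoutRmiSocketFactory extends RMISocketFactory {
    private final int connectTimeoutMillis;

    public TimeoutRmiSocketFactory(int connectTimeoutMillis) {
        this.connectTimeoutMillis = connectTimeoutMillis;
    }

    @Override
    public Socket createSocket(String host, int port) throws IOException {
        Socket s = new Socket();
        try {
            // Same technique as above: connect with an explicit timeout
            s.connect(new InetSocketAddress(host, port), connectTimeoutMillis);
        } catch (IOException e) {
            s.close();
            throw e;
        }
        return s;
    }

    @Override
    public ServerSocket createServerSocket(int port) throws IOException {
        // Server side is unchanged: a plain server socket
        return new ServerSocket(port);
    }
}
```

You would install it once, early in the client, with RMISocketFactory.setSocketFactory(new TimeoutRmiSocketFactory(5000)). The factory can only be set once per JVM and it affects every RMI connection the JVM makes, which is part of why this route is awkward.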

A lot of the time, a much simpler and more general technique is
applicable. You simply create the connection in another thread,
and you wait for that thread to complete. If it doesn't complete
before your timeout, you just abandon it. It might still take two
minutes to notice that the remote machine is dead, but in the
meantime you can continue doing other things.

If you're making a lot of connections to a lot of machines, you
might want to think twice about abandoning threads, because you
might end up with a lot of them. But in the more typical case
where you're just making one connection, this technique may well
be for you.

Assuming you're using at least Java SE 5, you'll certainly want
to use java.util.concurrent
to manage the thread creation and communication. There are a few
ways of doing it, but the easiest is probably a single-thread
executor.

The method below allows you to connect to a given JMXServiceURL
with a timeout of five seconds like this:

JMXConnector jmxc = connectWithTimeout(jmxServiceURL, 5, TimeUnit.SECONDS);

My first cut at the problem

In my first version of this entry, I proposed a solution with
the following outline.

JMXConnector connectWithTimeout(JMXServiceURL url, long timeout, TimeUnit unit) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    Future<JMXConnector> future = executor.submit(new Callable<JMXConnector>() {
        public JMXConnector call() {
            return JMXConnectorFactory.connect(url);
        }
    });
    return future.get(timeout, unit);
}

Half an hour after posting, I suddenly realised that this
version is incorrect. It reminds me of the saying that for every
complex problem there is a solution that is simple, obvious, and
wrong.

This solution does the right thing when the connection succeeds
within the time limit, and also in the case of the problem we
are trying to solve, where it takes a very long time to fail.
But if the connection succeeds after the time limit,
future.get will already have thrown a TimeoutException to the
caller, and we'll have made a connection that nobody knows about!
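The late-success case is easy to reproduce without any JMX at all. The following is only a sketch: the class name LateResultDemo and the 300 ms sleep standing in for a slow connect are invented for illustration, and it uses a counter in place of a real connection.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

public class LateResultDemo {
    /**
     * Runs the first-cut pattern against a simulated connect that takes 300 ms,
     * with a 50 ms limit. Returns true if the caller saw a timeout and the
     * "connection" was nevertheless opened afterwards.
     */
    public static boolean leaksLateConnection() throws Exception {
        final AtomicInteger opened = new AtomicInteger();
        final CountDownLatch taskDone = new CountDownLatch(1);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<String> future = executor.submit(new Callable<String>() {
            public String call() throws Exception {
                Thread.sleep(300);          // stand-in for a slow connect
                opened.incrementAndGet();   // the "connection" now exists
                taskDone.countDown();
                return "connection";
            }
        });
        executor.shutdown(); // accepts no new tasks; the submitted one keeps running

        boolean timedOut = false;
        try {
            future.get(50, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            timedOut = true;                // the caller has given up
        }
        taskDone.await();                   // wait until the task finishes anyway
        return timedOut && opened.get() == 1;
    }
}
```

The caller gets its timeout, the task carries on regardless, and the object it eventually produces is held only by a Future that nobody will ever look at.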

The second attempt

This is the outline of my second attempt, which I believe is
correct. There are several refinements we'll need to apply
before having a solution that actually works.

// This is just an outline: the real code appears later
JMXConnector connectWithTimeout(JMXServiceURL url, long timeout, TimeUnit unit) {
    final BlockingQueue<Object> mailbox = new ArrayBlockingQueue<Object>(1);
    final ExecutorService executor = Executors.newSingleThreadExecutor();
    executor.submit(new Runnable() {
        public void run() {
            JMXConnector connector = JMXConnectorFactory.connect(url);
            if (!mailbox.offer(connector))
                connector.close();
        }
    });
    Object result = mailbox.poll(timeout, unit);
    if (result == null) {
        if (!mailbox.offer(""))
            result = mailbox.take();
    }
    return (JMXConnector) result;
}

To understand how and why this works, notice that exactly one
object always gets posted to the mailbox. There
are three cases:

  • If the connection attempt finishes before the timeout, then
    the connector object will be posted to the
    mailbox and returned to the caller.
  • If the timeout happens, then the main thread will try to
    stuff the mailbox with an arbitrary object (here the empty
    string, but any object would do), so the connection thread
    will realise it has connected too late and close the
    newly-made connection.
  • If the timeout happens at exactly the same time as the
    connection is made, then the main thread may find that the
    mailbox is already full, in which case it again picks up the
    connector object and returns it.
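The first two cases can be exercised with a stand-in for the connector. This is a sketch under stated assumptions: MailboxDemo and FakeConnection are invented names, the connect is simulated with a sleep, and a plain daemon thread replaces the executor to keep the example short.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class MailboxDemo {
    /** Hypothetical stand-in for a JMXConnector; the latch opens when close() is called. */
    public static class FakeConnection {
        public final CountDownLatch closed = new CountDownLatch(1);
        public void close() { closed.countDown(); }
    }

    /**
     * "Connects" by sleeping for connectMillis, then hands conn over through the mailbox.
     * Returns conn if it arrived in time (or lost the race narrowly), or null on timeout.
     */
    public static FakeConnection connectWithTimeout(
            final FakeConnection conn, final long connectMillis, long timeoutMillis)
            throws InterruptedException {
        final BlockingQueue<Object> mailbox = new ArrayBlockingQueue<Object>(1);
        Thread worker = new Thread(new Runnable() {
            public void run() {
                try {
                    Thread.sleep(connectMillis);   // simulated slow connect
                } catch (InterruptedException e) {
                    return;
                }
                if (!mailbox.offer(conn))          // mailbox already stuffed: too late
                    conn.close();
            }
        });
        worker.setDaemon(true);
        worker.start();

        Object result = mailbox.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (result == null) {                      // timed out
            if (!mailbox.offer(""))                // try to stuff the mailbox
                result = mailbox.take();           // connection arrived just now: take it
        }
        return (FakeConnection) result;
    }
}
```

With a 10 ms connect and a 2 second limit the caller gets the connection back unclosed. With a 400 ms connect and a 20 ms limit the caller gets null, and the worker closes the connection when it finally shows up.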

Making it work

The code above is just an outline, and leaves out some
necessary details. We need to refine it in several ways to make
it work.

The first refinement we'll need is exception handling.
The result of the connection attempt could be an exception
instead of a JMXConnector. This doesn't change the reasoning
above, but it does complicate the code.

The main thread calls BlockingQueue.poll,
which can throw InterruptedException, so we must handle
that.

About half of the final version of connectWithTimeout involves
footering about with exceptions. It's times like this that I'm
inclined to join the checked-exception-haters.

The second refinement is to clean up the connect thread
when we're finished with it. The outline code doesn't call shutdown()
on the ExecutorService, so every time connectWithTimeout is
called, a new single-thread executor is created, and therefore a
new thread. If you're lucky, the garbage-collector will pick up
your executors and their threads at some stage, but you don't
want to depend on luck.

A more subtle point about threads is that the outline code will
create non-daemon threads. Your application will not exit when
the main thread exits if there are any non-daemon threads. So
as written, if you have a thread stuck in a connection attempt
and your application is otherwise finished, it will stay around
until the connection attempt finally times out. That's pretty
much exactly the sort of thing we're trying to avoid. So we'll
need to arrange to create a daemon thread instead.

All right, so here's the real code.

    public static JMXConnector connectWithTimeout(
            final JMXServiceURL url, long timeout, TimeUnit unit)
            throws IOException {
        final BlockingQueue<Object> mailbox = new ArrayBlockingQueue<Object>(1);
        ExecutorService executor =
                Executors.newSingleThreadExecutor(daemonThreadFactory);
        executor.submit(new Runnable() {
            public void run() {
                try {
                    JMXConnector connector = JMXConnectorFactory.connect(url);
                    if (!mailbox.offer(connector))
                        connector.close();
                } catch (Throwable t) {
                    mailbox.offer(t);
                }
            }
        });
        Object result;
        try {
            result = mailbox.poll(timeout, unit);
            if (result == null) {
                if (!mailbox.offer(""))
                    result = mailbox.take();
            }
        } catch (InterruptedException e) {
            throw initCause(new InterruptedIOException(e.getMessage()), e);
        } finally {
            executor.shutdown();
        }
        if (result == null)
            throw new SocketTimeoutException("Connect timed out: " + url);
        if (result instanceof JMXConnector)
            return (JMXConnector) result;
        try {
            throw (Throwable) result;
        } catch (IOException e) {
            throw e;
        } catch (RuntimeException e) {
            throw e;
        } catch (Error e) {
            throw e;
        } catch (Throwable e) {
            // In principle this can't happen but we wrap it anyway
            throw new IOException(e.toString(), e);
        }
    }

    private static <T extends Throwable> T initCause(T wrapper, Throwable wrapped) {
        wrapper.initCause(wrapped);
        return wrapper;
    }

    private static class DaemonThreadFactory implements ThreadFactory {
        public Thread newThread(Runnable r) {
            Thread t = Executors.defaultThreadFactory().newThread(r);
            t.setDaemon(true);
            return t;
        }
    }
    private static final ThreadFactory daemonThreadFactory = new DaemonThreadFactory();

The initCause method is only used once but it's handy to have
around for those troublesome exceptions that don't have a
Throwable cause parameter.

I think it would be awfully nice if java.util.concurrent
supplied DaemonThreadFactory rather than everyone
having to invent it all the time.
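For readers on Java 8 or later, the same factory fits in a lambda. This is a sketch rather than anything the platform supplies: the class name DaemonExecutors and the constant DAEMON_THREAD_FACTORY are invented here, and workerIsDaemon exists only to show that the executor's thread really is a daemon.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public class DaemonExecutors {
    // Wraps the default factory and marks each new thread as a daemon
    public static final ThreadFactory DAEMON_THREAD_FACTORY = r -> {
        Thread t = Executors.defaultThreadFactory().newThread(r);
        t.setDaemon(true);
        return t;
    };

    /** Runs a task on a single-thread executor built with the factory and reports its daemon status. */
    public static boolean workerIsDaemon() throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(DAEMON_THREAD_FACTORY);
        try {
            Callable<Boolean> task = () -> Thread.currentThread().isDaemon();
            return executor.submit(task).get();
        } finally {
            executor.shutdown();
        }
    }
}
```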

Shouldn't this be simpler?

I admit I'm a bit uncomfortable with the code here. I'd be
happier if I didn't need to reason about it in order to convince
myself that it's correct. But I don't see any simpler way of
using the java.util.concurrent API to achieve the same effect.
Uses of cancel or interrupt tend to lead to race conditions,
where the task can be cancelled after it has already delivered
its result, and again we can get a JMXConnector leak; or we
might close a JMXConnector that the main thread is about to
return. I'd be interested in suggestions for
simplification.

Conclusion of the foregoing

This is a useful technique in many cases, subject to the
caution above. It's not limited to the JMX
Remote API, either; you might use it when accessing a remote web
service or EJB or whatever, without having to figure out how to
get hold of the underlying Socket so you can set its timeout.

My thanks to Sébastien Martin
for the discussion that led to this entry.

[Tags: jmx timeout concurrent.]

Comments

I think there's still a possibility for a connection to leak.
If the main thread is interrupted while waiting, I think we then also need to signal the connecting thread to close the connection:

} catch (InterruptedException e) {
    mailbox.offer("");
    throw initCause(new InterruptedIOException(e.getMessage()), e);
} finally {

tcp: connection reset

I run a JMX service embedded in my application. A single client connects using the jmxmp protocol. The client has to manage a large number of JMX connections and has run into issues with the entire application hanging due to some resources being offline and the JMX connection hanging due to the inability to set a timeout.

The client has opted for your first low-level solution which creates a raw socket w/timeout to determine if the client is awake before invoking the JMX connection.

The problem with this is that my service is seeing tcp connection reset exceptions when the client closes the socket. Also, it looks like the exception is thrown and caught within the JVM so there is no way for me to suppress the stack trace from showing up on our production log files.

Two questions:

#1 - Should I believe the client when they say they are closing the socket cleanly?
#2 - Is there any way to suppress the exception?

Sep 8, 2010 16:46:20.147 GenericConnectorServer ClientCreation.run: WARNING: Failed to open connection: java.net.SocketException: Connection reset
java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:168)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2266)
    at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2279)
    at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2750)
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:780)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:280)
    at com.sun.jmx.remote.socket.SocketConnection$ObjectInputStreamWithLoader.<init>(SocketConnection.java:354)
    at com.sun.jmx.remote.socket.SocketConnection.readMessage(SocketConnection.java:204)
    at com.sun.jmx.remote.opt.security.AdminServer.connectionOpen(AdminServer.java:76)
    at com.sun.jmx.remote.generic.ServerSynchroMessageConnectionImpl.connect(ServerSynchroMessageConnectionImpl.java:51)
    at javax.management.remote.generic.GenericConnectorServer$ClientCreation.run(GenericConnectorServer.java:383)
    at com.sun.jmx.remote.opt.util.ThreadService$ThreadServiceJob.run(ThreadService.java:208)
    at com.sun.jmx.remote.opt.util.JobExecutor.run(JobExecutor.java:59)

tcp: connection reset

If I understand what you are saying, the client first opens a socket connection with a timeout, and if that succeeds it immediately closes the connection and makes a JMXMP connection to the same port. The exception log you are seeing is because the server reacts to the first connection expecting it to be a genuine client, but then that connection is closed before the expected opening traffic arrives. If the second approach I described here is possible for you, then it will avoid that problem (except in the unusual case where the connection succeeds just after the client decides to give up and close it). That is, when the remote port is available, you just connect to it, and use the connection to do your work; and when it's not available, you abandon the thread that is trying to connect. But it does mean that when the connection times out, you get stuck with a thread and an open socket until the TCP layer decides to time out.

The approach of pinging with a dummy connection first is not completely safe, in that the remote machine could become unavailable after the ping succeeds and before you establish the real connection, so you end up with the full long timeout. If you want to go ahead with it, then it should be *possible* to shut the log messages up with logger configuration, for example on the lines suggested at blogs.sun.com/jmxetc/entry/tracing_jmx_what_s_going , but most likely only by shutting up all javax.management.remote messages. But I think you should probably just be using InetAddress.isReachable in this case.
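As a rough sketch of the logger-configuration idea, the following turns off the javax.management.remote logger through java.util.logging on the server side. The class name QuietJmxRemoteLogging is invented, and whether the JMXMP implementation in jmxremote_optional.jar routes this particular warning through that logger is an assumption you would need to check.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class QuietJmxRemoteLogging {
    // Keep a strong reference: java.util.logging holds loggers weakly, so a
    // configured logger that nobody references can be garbage-collected.
    private static final Logger JMX_REMOTE = Logger.getLogger("javax.management.remote");

    /** Disables all messages from javax.management.remote and its child loggers that inherit the level. */
    public static Logger silence() {
        JMX_REMOTE.setLevel(Level.OFF);
        return JMX_REMOTE;
    }
}
```

This is a blunt instrument, as noted above: it hides every message from that logger, not only the connection-reset warning.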

Ideally you would want to intervene at the point where the JMXMP client creates a socket, so you can give a timeout for the connection right there. It's possible to do that by supplying an implementation of the MessageConnection interface with the "jmx.remote.message.connection" parameter in the environment map for JMXConnectorFactory.connect. The object you supply could be a subclass of com.sun.jmx.remote.socket.SocketConnection that calls that class's (Socket,ClassLoader) constructor with a Socket that has been made using the timeout. But that's some serious black magic.

tcp: connection reset

If I understand you correctly, the last paragraph in your reply would be implemented on the client side, correct? I like the solution and I think the client would be willing to implement it, but it appears that the SocketConnection class is implemented in jmxremote_optional.jar. I could not find any javadocs for this class, nor can I seem to locate the source code, so I'm not sure how I or my client would begin to extend the class.

As an alternative, is there a byte sequence the client can send to fool the MBean server into thinking it is a genuine JMXMP connection before closing the raw socket?

InetAddress.isReachable

 I still think that the best solution is just to use InetAddress.isReachable.

You could try to mimic the JMXMP handshake sequence (which is detailed in the JSR 160 spec), but if the server has configured security properly that will probably be a tough job.
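For what it's worth, the isReachable check is only a few lines. This is a sketch with an invented class and method name, and it treats any I/O problem, including a host name that doesn't resolve, as "not reachable".

```java
import java.io.IOException;
import java.net.InetAddress;

public class ReachabilityCheck {
    /** Returns true if the host answers within timeoutMillis, false on timeout or any I/O error. */
    public static boolean hostReachable(String host, int timeoutMillis) {
        try {
            return InetAddress.getByName(host).isReachable(timeoutMillis);
        } catch (IOException e) {   // includes UnknownHostException
            return false;
        }
    }
}
```

A call such as hostReachable(host, 5000) before JMXConnectorFactory.connect never touches the JMXMP port, so the server has nothing to log.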

InetAddress.isReachable

InetAddress.isReachable will tell you if the host is reachable but, unfortunately, won't be able to tell you that the JMXMP listener is up.