Eamonn McManus

Eamonn McManus's Blog

When can JMX notifications be lost?

Posted by emcmanus on August 23, 2007 at 12:32 AM

The JMX Best Practices guide says notifications can sometimes be lost. Why is that? When might it happen? Read on.

Here's the relevant text from the Best Practices guide:

It is important to be aware of the semantics of notification delivery when defining how notifications are used in a model. Remote clients cannot assume that they will receive all notifications for which they are listening. The JMX Remote API only guarantees a weaker condition:

A client either receives all notifications for which it is listening, or can discover that notifications may have been lost.

This text might seem somewhat alarming. First of all, notice that it only applies to remote clients. A local client (within the same Java VM) will reliably get all notifications it asks for.

Secondly, the text is describing something that will only happen in unusual circumstances. Notifications will only be lost when they arrive so fast that they cannot be delivered to the remote client quickly enough, or if there is a long network outage during which enough notifications arrive to overflow the notification buffer on the server. If you're sure that the rate of notifications is always low then you probably don't need to worry. Long network outages will probably trigger other problems in your client, so you'll need to deal with them more generally than just worrying about lost notifications.

Careful clients

But if you have many notifications, you probably want to follow the advice in the subsequent paragraphs of the Best Practices guide:

Notifications should never be used to deliver information that is not also available in another way. The typical client observes the initial state of the information model, then reacts to changes in the model signalled by notifications. If it sees that notifications may have been lost, it goes back and observes the state of the model again using the same logic as it used initially. The information model must be designed so that this is always possible. Losing a notification must not mean losing information irretrievably.

When a notification signals an event that might require intervention from the client, the client should be able to retrieve the information needed to react. This might be an attribute in an MBean that contains the same information as was included in the notification. If the information is just that a certain event occurred, it is often enough just to have a counter of how many times it occurred. Then a client can detect that the event occurred just by seeing that the counter has changed.
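The counter idea can be sketched as follows. This is only an illustration, not code from the Best Practices guide: the class names FailoverMonitor and CounterWatcher, the attribute FailoverCount and the notification type "example.failover" are all invented for the example. The point is that the notification carries nothing the client could not also read from the attribute.

```java
import java.util.concurrent.atomic.AtomicLong;
import javax.management.Notification;
import javax.management.NotificationBroadcasterSupport;

/** Hypothetical MBean-side class: every event bumps a counter that is also readable as an attribute. */
class FailoverMonitor extends NotificationBroadcasterSupport {
    private final AtomicLong failoverCount = new AtomicLong();
    private final AtomicLong sequence = new AtomicLong();

    /** The same information the notification carries, available at any time. */
    public long getFailoverCount() {
        return failoverCount.get();
    }

    void recordFailover() {
        long count = failoverCount.incrementAndGet();
        Notification n = new Notification(
                "example.failover", this, sequence.incrementAndGet(), "failover #" + count);
        n.setUserData(count);
        sendNotification(n); // may never reach a remote client; the counter still records it
    }
}

/** Hypothetical client-side helper: compares the counter with the last value it saw. */
class CounterWatcher {
    private long lastSeen;

    CounterWatcher(long initialCount) {
        this.lastSeen = initialCount;
    }

    /** Returns how many events happened since the previous call, whether or not they were notified. */
    long eventsSince(long currentCount) {
        long delta = currentCount - lastSeen;
        lastSeen = currentCount;
        return delta;
    }
}
```

A client that suspects it has missed something reads FailoverCount again and calls eventsSince; a non-zero result means the event occurred, even if no notification arrived.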

Stateless servers

The design of the existing standard connectors is such that notification loss can happen when there are many notifications coming from the MBeans in the MBean Server. This is true even for clients that are only listening for a small subset of those notifications. In the extreme case, a client that is listening for a very rare notification might not see it, because other MBeans are generating frequent notifications that nobody is listening to. Once again, the client can tell that this has happened: a listener registered via JMXConnector.addConnectionNotificationListener receives a JMXConnectionNotification of type NOTIFS_LOST.
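Here is a minimal sketch of a client-side listener that reacts to possible loss. JMXConnectionNotification and its NOTIFS_LOST type are part of the JMX Remote API; the class name LostNotificationDetector and the resync callback are assumptions made up for this example.

```java
import javax.management.Notification;
import javax.management.NotificationListener;
import javax.management.remote.JMXConnectionNotification;

/** Hypothetical helper: runs a resync action whenever notifications may have been lost. */
final class LostNotificationDetector implements NotificationListener {
    private final Runnable resync;

    LostNotificationDetector(Runnable resync) {
        this.resync = resync;
    }

    @Override
    public void handleNotification(Notification notification, Object handback) {
        if (notification instanceof JMXConnectionNotification
                && JMXConnectionNotification.NOTIFS_LOST.equals(notification.getType())) {
            // Observe the state of the model again, with the same logic as the initial read.
            resync.run();
        }
        // Other connection notification types (opened, closed, failed) are ignored here.
    }
}
```

Assuming connector is an open JMXConnector and reloadModel is your own method for re-reading the model, you would register it with connector.addConnectionNotificationListener(new LostNotificationDetector(this::reloadModel), null, null).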

The existing connectors behave like this because they have been designed to have no non-transient state on the server. A consequence is that the server has no non-transient record of which clients are interested in which notifications. Therefore it has to store all notifications in its buffer, in case some client it doesn't remember is interested in them.
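To see why a rare notification can be pushed out by ones nobody wants, here is a toy model of a shared, bounded buffer. It is deliberately simplified and is not the connectors' actual code: the class SharedNotificationBuffer and its methods are invented for the illustration. The server stores every notification under a sequence number, and each client asks for everything from the sequence number it last reached.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model: one bounded buffer shared by all clients, with no per-client state on the server. */
final class SharedNotificationBuffer {
    private static final class Entry {
        final long sequence;
        final String type;

        Entry(long sequence, String type) {
            this.sequence = sequence;
            this.type = type;
        }
    }

    private final int capacity;
    private final Deque<Entry> entries = new ArrayDeque<>();
    private long nextSequence = 0;

    SharedNotificationBuffer(int capacity) {
        this.capacity = capacity;
    }

    /** Every notification is stored, because the server does not remember who is interested. */
    void add(String type) {
        if (entries.size() == capacity) {
            entries.removeFirst(); // the oldest entry is discarded, wanted or not
        }
        entries.addLast(new Entry(nextSequence++, type));
    }

    /** The client supplies its own position and filter on each fetch. */
    List<Long> fetch(long fromSequence, String wantedType) {
        List<Long> matches = new ArrayList<>();
        for (Entry e : entries) {
            if (e.sequence >= fromSequence && e.type.equals(wantedType)) {
                matches.add(e.sequence);
            }
        }
        return matches;
    }

    /** Number of entries the client can no longer see; non-zero means possible loss. */
    long lostSince(long fromSequence) {
        long earliest = entries.isEmpty() ? nextSequence : entries.peekFirst().sequence;
        return Math.max(0, earliest - fromSequence);
    }
}
```

With a capacity of 3, one "rare" notification followed by three "noisy" ones leaves the client with an empty fetch for "rare" and a lostSince of 1. The client cannot know whether the entry that fell out was one it cared about, which is why it must assume it may have been.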

The servers were designed to have no non-transient state for better scalability. In retrospect, this was probably a design mistake. In many client/server systems, you have one server, or just a few servers, and a large number of clients. So limiting state in the server is an excellent idea, because it allows the server to handle many more clients. But in management systems, the situation is usually the opposite: you typically have one client (a management program such as JConsole) that may connect to and manage many servers. There are no common use cases where a server might have a large number of JMX clients.

In version 2.0 of the JMX API, being defined by JSR 255, we are adding an Event Service. Among other things, this will fix the problem where a client might lose notifications that it is interested in because there are many other notifications that it is not interested in.

Notification loss is inevitable

Even with the new Event Service, notification loss will still be possible, however. Can't we get rid of it?

To answer this question, consider what happens when notifications are produced faster than they can be handled. This might be because of network delays, or because the client needs to do some work for each notification. Suppose this situation persists. What should the system do?

There are basically three possibilities:

  1. Some notifications are eventually dropped. This is what the JMX Remote API does, and it is also what the new Event Service will do.
  2. Notification senders are slowed down. This is what usually happens in the local case. An MBean sends a notification to a local listener by invoking the listener's handleNotification method. Unless it has multiple threads, the MBean will wait for that method to complete before doing anything else, including sending any more notifications.
  3. Notifications accumulate in an unbounded buffer. This is actually the worst solution. In the real world there is no such thing as an unbounded buffer. And even if you save the notifications on a giant disk, which is effectively unbounded, you still haven't fixed the problem that the client is getting further and further behind the server. When the client finally gets a notification that was sent yesterday, is it still of any use?
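The three possibilities map quite directly onto the standard java.util.concurrent queues, so here is a small sketch using them. It is an analogy only, not how the JMX Remote API is implemented, and the class OverloadPolicies and its method names are invented. The blocking case uses a timed offer rather than put so that the example terminates.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class OverloadPolicies {
    /** Possibility 1: drop what does not fit. Returns how many notifications were dropped. */
    static int sendDropping(BlockingQueue<String> queue, int count) {
        int dropped = 0;
        for (int i = 0; i < count; i++) {
            if (!queue.offer("n" + i)) { // offer returns false immediately when the queue is full
                dropped++;
            }
        }
        return dropped;
    }

    /** Possibility 2: make the sender wait. Returns false if it was still blocked at the timeout. */
    static boolean sendBlocking(BlockingQueue<String> queue, String notification, long timeoutMillis) {
        try {
            return queue.offer(notification, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Possibility 3: never refuse anything. Returns the backlog, which only ever grows. */
    static int sendUnbounded(LinkedBlockingQueue<String> queue, int count) {
        for (int i = 0; i < count; i++) {
            queue.add("n" + i);
        }
        return queue.size();
    }

    public static void main(String[] args) {
        BlockingQueue<String> bounded = new ArrayBlockingQueue<>(2);
        System.out.println("dropped: " + sendDropping(bounded, 5));
        System.out.println("sender got through: " + sendBlocking(bounded, "late", 50));
        System.out.println("backlog: " + sendUnbounded(new LinkedBlockingQueue<String>(), 5));
    }
}
```

With nobody consuming from the queue, the first policy loses notifications, the second stalls the sender, and the third simply postpones the problem.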

When we were designing the JMX Remote API, we assumed that most MBeans that send notifications were not expecting sending to be slow. In the local case, sending is just invoking a method, and that method usually returns promptly. If we had wanted to apply solution 2, slowing down senders, that could have broken the assumptions of existing MBeans. Coding MBeans so that they can cope with a blocked send would also be considerably more difficult. So, even though this solution (flow control) is arguably better, we were reluctant to impose it.
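You can observe the local behaviour with NotificationBroadcasterSupport, which by default invokes each listener in the thread that calls sendNotification. The sketch below is only a demonstration: the class LocalDeliveryDemo, the notification type "example.tick" and the 20 ms delay are invented for it.

```java
import java.util.concurrent.atomic.AtomicReference;
import javax.management.Notification;
import javax.management.NotificationBroadcasterSupport;
import javax.management.NotificationListener;

final class LocalDeliveryDemo {
    /** Returns true if the listener ran, to completion, on the sender's own thread. */
    static boolean deliveredOnSenderThread() {
        NotificationBroadcasterSupport sender = new NotificationBroadcasterSupport();
        AtomicReference<Thread> listenerThread = new AtomicReference<>();

        NotificationListener slowListener = (notification, handback) -> {
            try {
                Thread.sleep(20); // stands in for real work done per notification
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            listenerThread.set(Thread.currentThread());
        };

        sender.addNotificationListener(slowListener, null, null);
        // With the default executor, this call does not return until the listener has finished.
        sender.sendNotification(new Notification("example.tick", sender, 1L));
        return listenerThread.get() == Thread.currentThread();
    }

    public static void main(String[] args) {
        System.out.println("delivered on sender thread: " + deliveredOnSenderThread());
    }
}
```

A slow local listener therefore slows the sender down directly, which is exactly the flow control that a remote connector cannot offer without blocking the MBean.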

The future: JMX Event Service

As I mentioned, in version 2.0 of the JMX API we are designing a new Event Service. This will be part of the JDK 7 platform. Though it will not eliminate notification loss, it will significantly reduce the likelihood of such loss. And it will also allow you to plug in your own transport for notifications. In particular you could plug in the Java Message Service to use an existing message bus.

Comments

  • Hi,
    I'm working on a JMX manager that must collect information about various MBean servers.
    I'm now investigating how to recover from a lost connection to an MBean server. Does JMX offer a mechanism to detect connection loss?

    thanks

    Posted by: lukebike on January 30, 2008 at 11:36 PM

  • Yes. See JMXConnector.addConnectionNotificationListener.

    Posted by: emcmanus on January 31, 2008 at 02:15 AM

  • Thanks, Eamonn.
    I'm working on an application built using Tiger.
    So your answer suggests that it's really important to start using Mustang.

    thanks again

    Posted by: lukebike on January 31, 2008 at 04:55 AM

  • Using JDK 6 is generally better than using JDK 5.0! :-) But the method in question also exists in 5.0.

    Posted by: emcmanus on January 31, 2008 at 04:58 AM

  • Oops... I'm sorry!
    Thanks again.

    Posted by: lukebike on January 31, 2008 at 06:08 AM

  • I've implemented my connection listener, but it seems that when the connection goes down I don't receive a notification; instead, an unhandled exception is raised.

    I don't know whether the problem is related to the server connector or to something else...

    Thanks, and have a nice weekend.

    Posted by: lukebike on February 01, 2008 at 01:44 AM
