Reimplementing the RMI protocol

Posted by emcmanus on January 11, 2007 at 3:10 AM PST

In my last entry
(http://weblogs.java.net/blog/emcmanus/archive/2006/12/securing_the_rm.html),
I mentioned that I had reimplemented the RMI
registry portably, before discovering that there was a much
simpler solution to the security problem I was addressing.
Here's the reimplementation for what it's worth. It allows you
to go further than the socket factory hack. And if you ever
need to understand gory details of the RMI protocol, this could
come in useful. Because it does much less than the full-blown
RMI implementation in the JDK, it's much easier to
understand.

Here are some of the things you can do with a reimplemented RMI
registry that you can't do with just socket factories:

  • Show different subsets of the registry contents to different
    clients, for example depending on where they are connecting from
    or whether they have authenticated with SSL.
  • Only allow a bind or unbind operation to succeed if the caller
    appends a hidden password to the name. So if you want to bind
    "foo", the string you give would actually be
    "foosecretpassword". Anything else will fail.
  • Trigger additional actions when an object is bound in the
    registry, for example persisting the object or replicating it in
    another registry.
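As a rough illustration of the hidden-password idea, here is a minimal sketch of a Registry wrapper that strips a secret suffix before delegating. The class name PasswordRegistrySketch, the stripSecret helper, and the decision to guard rebind as well are my assumptions for illustration; they are not taken from the source files.

```java
import java.rmi.AccessException;
import java.rmi.AlreadyBoundException;
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.Registry;

// Hypothetical wrapper: modifying operations must carry a secret suffix.
class PasswordRegistrySketch implements Registry {
    private final Registry delegate;
    private final String secret;

    PasswordRegistrySketch(Registry delegate, String secret) {
        this.delegate = delegate;
        this.secret = secret;
    }

    // Returns the real name, or refuses if the suffix is missing.
    static String stripSecret(String name, String secret) throws AccessException {
        if (name.length() <= secret.length() || !name.endsWith(secret)) {
            throw new AccessException("operation refused");
        }
        return name.substring(0, name.length() - secret.length());
    }

    @Override
    public Remote lookup(String name) throws RemoteException, NotBoundException {
        return delegate.lookup(name); // reads need no password
    }

    @Override
    public String[] list() throws RemoteException {
        return delegate.list();
    }

    @Override
    public void bind(String name, Remote obj)
            throws RemoteException, AlreadyBoundException {
        delegate.bind(stripSecret(name, secret), obj);
    }

    @Override
    public void rebind(String name, Remote obj) throws RemoteException {
        // Assumption: rebind is guarded the same way as bind.
        delegate.rebind(stripSecret(name, secret), obj);
    }

    @Override
    public void unbind(String name) throws RemoteException, NotBoundException {
        delegate.unbind(stripSecret(name, secret));
    }
}
```

A client that knows the secret would call bind("foosecretpassword", obj), and the object would then be visible to everyone under the plain name "foo".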

By the way, most of these advantages would evaporate if the
registry API were augmented to allow you to supply your own
Registry object to LocateRegistry.createRegistry
(http://java.sun.com/javase/6/docs/api/java/rmi/registry/LocateRegistry.html#createRegistry(int)).

My reimplementation separates out the registry functionality
and the RMI reimplementation. The registry functionality is in
the class RegistryImpl, which is almost completely
uninteresting, except possibly for the use of ConcurrentHashMap
(http://java.sun.com/javase/6/docs/api/java/util/concurrent/ConcurrentHashMap.html)
to avoid having to synchronize explicitly across
modifications.
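To show why ConcurrentHashMap removes the need for explicit synchronization, here is a minimal sketch of a map-backed Registry. It is a hypothetical stand-in called MapRegistrySketch, not the RegistryImpl from the source files, and it relies on putIfAbsent and remove being atomic.

```java
import java.rmi.AlreadyBoundException;
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.registry.Registry;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for a local registry; not the author's RegistryImpl.
class MapRegistrySketch implements Registry {
    private final ConcurrentMap<String, Remote> bindings =
            new ConcurrentHashMap<>();

    @Override
    public Remote lookup(String name) throws NotBoundException {
        Remote obj = bindings.get(name);
        if (obj == null) {
            throw new NotBoundException(name);
        }
        return obj;
    }

    @Override
    public void bind(String name, Remote obj) throws AlreadyBoundException {
        // putIfAbsent does the "check, then insert" step atomically.
        if (bindings.putIfAbsent(name, obj) != null) {
            throw new AlreadyBoundException(name);
        }
    }

    @Override
    public void rebind(String name, Remote obj) {
        bindings.put(name, obj);
    }

    @Override
    public void unbind(String name) throws NotBoundException {
        // remove reports whether the name was actually bound.
        if (bindings.remove(name) == null) {
            throw new NotBoundException(name);
        }
    }

    @Override
    public String[] list() {
        return bindings.keySet().toArray(new String[0]);
    }
}
```

With a plain HashMap, bind would need a synchronized block around the containsKey check and the put; here each operation is a single atomic map call.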

The class RegistryServer is where the real action happens. Its
public API consists of just a constructor that takes a Registry
parameter (typically an instance of RegistryImpl or a subclass)
and a port number. It could reasonably have a close method as
well, which is left as the proverbial exercise for the reader.

If you're interested in the details of RMI's wire protocol, you
could do worse than to study this class in addition to the official
specification
(http://java.sun.com/javase/6/docs/platform/rmi/spec/rmi-protocol.html),
especially since the latter has some known errors
(http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4674902).

At a high level, an RMI request looks like (object-id,
opnum, hash). The object-id is the ObjID
(http://java.sun.com/javase/6/docs/api/java/rmi/server/ObjID.html)
of the remote object being invoked. For the RMI registry, this
must be the distinguished value ObjID(REGISTRY_ID)
(http://java.sun.com/javase/6/docs/api/java/rmi/server/ObjID.html#REGISTRY_ID)
because that is the id that LocateRegistry.getRegistry
(http://java.sun.com/javase/6/docs/api/java/rmi/registry/LocateRegistry.html#getRegistry(java.lang.String,%20int))
will send. Since we are not exporting any other RMI objects on
our port, we don't actually need to check the id, but we do
anyway.
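The (object-id, opnum, hash) triple can be sketched as follows. This is a minimal illustration, assuming the triple is read from an ObjectInput in that order; the class name CallHeaderSketch and the choice of NoSuchObjectException for a wrong id are mine and may not match what RegistryServer does.

```java
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.rmi.NoSuchObjectException;
import java.rmi.server.ObjID;

// Hypothetical holder for the opnum and hash of one registry call.
final class CallHeaderSketch {
    static final ObjID REGISTRY = new ObjID(ObjID.REGISTRY_ID);

    final int opnum;
    final long hash;

    CallHeaderSketch(int opnum, long hash) {
        this.opnum = opnum;
        this.hash = hash;
    }

    // What a client-side stub would send for one call.
    static void write(ObjectOutput out, ObjID target, int opnum, long hash)
            throws IOException {
        target.write(out);
        out.writeInt(opnum);
        out.writeLong(hash);
    }

    // Server side: read the triple and insist that the target is the registry.
    static CallHeaderSketch read(ObjectInput in) throws IOException {
        ObjID target = ObjID.read(in);
        if (!REGISTRY.equals(target)) {
            throw new NoSuchObjectException("not the registry: " + target);
        }
        int opnum = in.readInt();
        long hash = in.readLong();
        return new CallHeaderSketch(opnum, hash);
    }
}
```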

The RMI protocol exists in two variants, the 1.1 variant that
was used in JDK 1.1, and the 1.2 variant that was added in JDK
1.2. In the 1.1 variant, the opnum is an index within the
methods of the given Remote interface, and the hash is a
checksum or digest (http://en.wikipedia.org/wiki/Cryptographic_hash_function)
to make sure that both ends have the same understanding
of what these methods are and what order they appear in. In the
1.2 variant, the opnum is -1 and the hash is a
digest of the method signature.
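For the 1.2 variant, here is a minimal sketch of how such a method hash can be computed. It rests on my reading of the RMI specification rather than on anything stated in this entry: the method name and JVM descriptor are written with writeUTF, digested with SHA-1, and the first eight digest bytes are assembled into a long, least significant byte first. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; the algorithm details are an assumption (see above).
final class MethodHashSketch {
    // nameAndDescriptor looks like "lookup(Ljava/lang/String;)Ljava/rmi/Remote;"
    static long methodHash(String nameAndDescriptor) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            // writeUTF prefixes the modified-UTF-8 bytes with a 2-byte length.
            new DataOutputStream(bytes).writeUTF(nameAndDescriptor);
            byte[] sha = MessageDigest.getInstance("SHA-1")
                    .digest(bytes.toByteArray());
            long hash = 0;
            for (int i = 0; i < 8; i++) {
                // Byte i of the digest becomes bits 8*i .. 8*i+7 of the hash.
                hash |= (sha[i] & 0xFFL) << (8 * i);
            }
            return hash;
        } catch (IOException | NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A server that accepts the 1.2 variant would compare the incoming hash against values computed this way for each Registry method and map a match back to the corresponding operation.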

For compatibility reasons, LocateRegistry.getRegistry only uses
the 1.1 variant, but I have coded the server to recognize both,
even though I can't think of any reason you would need the 1.2
variant in this case.

In addition to these variants, there are three communication
styles. The "single op protocol" is used by RMI-over-HTTP and sends a
single operation over a connection before closing it. The "stream
protocol" can send any number of operations over a connection, but
only one at a time. The "multiplex protocol" can send any number
of operations, and can send more than one operation in
parallel.
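A client announces which style it wants in the first few bytes of the connection. The sketch below parses that header. The byte values (magic "JRMI", version 2, and 0x4b, 0x4c, 0x4d for the three styles) come from my reading of the protocol specification, not from this entry, and the class name TransportHeaderSketch is hypothetical.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.ProtocolException;

// Hypothetical parser for the start of an RMI connection.
final class TransportHeaderSketch {
    static final int MAGIC = 0x4a524d49; // the ASCII bytes "JRMI"
    static final int VERSION = 2;

    enum Style {
        STREAM(0x4b), SINGLE_OP(0x4c), MULTIPLEX(0x4d);

        final int code;

        Style(int code) {
            this.code = code;
        }
    }

    static Style readHeader(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int magic = data.readInt();
        if (magic != MAGIC) {
            throw new ProtocolException(
                    "bad magic 0x" + Integer.toHexString(magic));
        }
        int version = data.readShort();
        if (version != VERSION) {
            throw new ProtocolException("unsupported version " + version);
        }
        int code = data.readUnsignedByte();
        for (Style style : Style.values()) {
            if (style.code == code) {
                return style;
            }
        }
        throw new ProtocolException("unknown protocol byte " + code);
    }
}
```

A registry-only server along the lines described here could accept Style.STREAM and simply close the connection for the other two.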

We're only interested in the stream protocol here because
that's what LocateRegistry.getRegistry uses. In fact, in the
JDK no RMI client will ever use the multiplex protocol, which is
a pity because it would allow a more efficient use of TCP
connections. The JDK can handle the multiplex protocol if it
receives a client request that uses it, but will never originate
such a request. That's my understanding from reading the code,
at least.

One thing that is not immediately clear from the protocol
specification is exactly when RMI switches over to using object
serialization. You can see this in the reimplementation by
looking at when we switch over from using DataInputStream and
DataOutputStream to using ObjectInputStream and
ObjectOutputStream.
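On the reply side, the switch can be sketched like this. The byte values (0x51 for a return message, 0x01 for a normal return) and the UID that precedes the value are my understanding of the JDK's behavior, not something this entry states. The class name ReplySketch is hypothetical, and the sketch ignores class annotations, so as written it only suits values such as String that carry no class descriptor.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.rmi.server.UID;

// Hypothetical writer for a successful reply to one call.
final class ReplySketch {
    static final int RETURN_DATA = 0x51;   // assumed message-type byte
    static final int NORMAL_RETURN = 0x01; // assumed "no exception" marker

    static void writeNormalReturn(OutputStream out, Object value)
            throws IOException {
        // The message-type byte is plain data, outside any object stream.
        DataOutputStream data = new DataOutputStream(out);
        data.writeByte(RETURN_DATA);
        data.flush();

        // From here on everything goes through object serialization.
        // The constructor emits the serialization stream header.
        ObjectOutputStream objects = new ObjectOutputStream(out);
        objects.writeByte(NORMAL_RETURN);
        new UID().write(objects); // identifier for this reply
        objects.writeObject(value);
        objects.flush();
    }
}
```

The request side mirrors this: one byte read with a DataInputStream says that a call follows, and only then is an ObjectInputStream created over the same underlying stream.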

The RMI protocol is not self-describing, in that you have to
know the signature of the method being invoked in order to read
its parameters and write its return value correctly. I think
this is a flaw, by the way.

Another thing that is not very obvious from reading the
specification is that you must include "class
annotations" in the serialized data you send back, even if they
are null. This is why we subclass ObjectOutputStream and
override annotateClass
(http://java.sun.com/javase/6/docs/api/java/io/ObjectOutputStream.html#annotateClass(java.lang.Class)).
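A minimal sketch of such a subclass follows, together with a matching reader that consumes the annotation. Both class names are hypothetical, and the reader is only there to show the round trip; it is my assumption about what the receiving end does, not code from RegistryServer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.OutputStream;

final class AnnotationSketch {
    // Writes a null annotation after every class descriptor.
    static final class NullAnnotatingOutputStream extends ObjectOutputStream {
        NullAnnotatingOutputStream(OutputStream out) throws IOException {
            super(out);
        }

        @Override
        protected void annotateClass(Class<?> cl) throws IOException {
            // "No codebase": the annotation slot is present but empty.
            writeObject(null);
        }
    }

    // Reads the annotation back before resolving the class locally.
    static final class AnnotationReadingInputStream extends ObjectInputStream {
        AnnotationReadingInputStream(InputStream in) throws IOException {
            super(in);
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            readObject(); // consume the annotation and ignore it
            return super.resolveClass(desc);
        }
    }
}
```

A value such as the String[] returned by list() carries one class descriptor for the array type, so annotateClass runs once for it and one null annotation is written.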

I've done some basic testing of this code, but it is obviously
not production-quality. Some performance tuning could be done
(though performance of the registry is not usually critical). I
am not sure that protocol errors will produce the same
exceptions as "real" RMI, though I am not sure I care
either.

The source files are RegistryImpl.java
(http://weblogs.java.net/blog/emcmanus/archive/RegistryImpl.java),
the boring implementation of the Registry interface, and RegistryServer.java
(http://weblogs.java.net/blog/emcmanus/archive/RegistryServer.java),
the exciting implementation of a subset of the RMI protocol.
