The Capability Pattern - Future-Proof Your APIs

Posted by timboudreau on August 11, 2008 at 1:00 AM PDT

Here is a simple pattern which you can use to make your
APIs extensible, even by third parties, without sacrificing
your ability to keep backward compatibility.

It is very common to create a library which has two
“sides”: an API side and an
SPI side. The API is what applications call
to use the library. The SPI (Service Provider Interface)
is how functionality (for example, access to different kinds
of resources) is provided.

One example of this is JavaMail: To read/write email messages,
you call JavaMail's API. Under the hood, when you ask for a
mail store for, say, an IMAP mail server, the JavaMail library
looks up all the providers registered (injected) on the
classpath, and tries to find one that supports that protocol.
The protocol handler is written to JavaMail's SPI.
If it finds one, then you can fetch messages from IMAP servers
using it. But your client code only ever calls the JavaMail
API - it doesn't need to know anything about the IMAP service
provider under the hood.
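
To make the split concrete, here is a minimal sketch of a library with separate API and SPI classes. Every name in it (MailProvider, MailClient, register) is invented for illustration, and the in-memory registry is only a stand-in for real classpath-based discovery; it is not how JavaMail itself is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// SPI side: what a protocol provider extends. Hypothetical, not JavaMail's SPI.
abstract class MailProvider {
    public abstract boolean supports(String protocol);
    public abstract List<String> fetchSubjects();
}

// API side: the only class client code calls.
final class MailClient {
    // Stand-in for classpath discovery of providers (an assumption of this sketch).
    private static final List<MailProvider> PROVIDERS = new ArrayList<>();

    private final MailProvider provider;

    private MailClient(MailProvider provider) {
        this.provider = provider;
    }

    static void register(MailProvider provider) {
        PROVIDERS.add(provider);
    }

    // Returns a client for the protocol, or null if no provider supports it.
    public static MailClient open(String protocol) {
        for (MailProvider p : PROVIDERS) {
            if (p.supports(protocol)) {
                return new MailClient(p);
            }
        }
        return null;
    }

    public List<String> subjects() {
        return provider.fetchSubjects();
    }
}
```

Client code only ever sees MailClient; the provider that answers for a given protocol can be swapped without the client noticing.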



apispi.png


There is one very big problem with the way this is usually
done: API classes
really ought to be final in almost all cases.
SPI classes ought to be abstract classes unless the problem
domain is extremely well-defined, in which case interfaces
make sense (you can use either, but in a not-well-defined problem
domain you may end up, over time, creating things with awful names
like LayoutManager2).

I won't go into great detail about why this is true here (my friend Jarda
does in his new book and
we discuss it somewhat in our book
Rich Client Programming).
In abbreviated form, the reasons are:

  1. You can provably backward compatibly add methods
    to a final class. And if the class is final, that fact has communication-value — it communicates to the user of that class that it's not something they might need to implement, where an interface would be more confusing.
  2. You can backward compatibly remove methods from an SPI
    interface or abstract class, if
    your library is the only thing that will ever call the
    SPI directly. Older implementations will
    still have the method; it just will never be called (in
    a modular environment such as the NetBeans module system,
    OSGi or presumably JSR-277, you would enforce this by
    putting the API and SPI in separate JAR files, so a
    client can't even see the SPI classes).
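
The first point can be sketched as follows. This is only an illustration under assumed names (Progress, ProgressImpl, and the setPercent method added in a hypothetical 1.1 release); the idea is that a new method on the final API class can be built on SPI methods that already exist, so older providers need no changes.

```java
// SPI as shipped in a hypothetical 1.0 release. Providers implement only this.
abstract class ProgressImpl {
    public abstract void setProgress(String msg, long progress, long min, long max);
}

// API class. Because it is final, no client subclass can clash with a new method.
final class Progress {
    private final ProgressImpl impl;

    Progress(ProgressImpl impl) {
        this.impl = impl;
    }

    public void setProgress(String msg, long progress, long min, long max) {
        // Argument checks live in the API, so providers never see bad values.
        if (min > max || progress < min || progress > max) {
            throw new IllegalArgumentException("progress out of range");
        }
        impl.setProgress(msg, progress, min, max);
    }

    // Added in a hypothetical 1.1 release, built only on the 1.0 SPI method,
    // so providers compiled against 1.0 keep working unchanged.
    public void setPercent(String msg, int percent) {
        setProgress(msg, percent, 0, 100);
    }
}
```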

A minor benefit of using abstract classes is that
you can semi-compatibly add non-abstract
methods to an abstract class later. But do remember that
you run the risk that someone will have a subclass with the
same method name and arguments and an incompatible
return-type (the JDK actually did this to us once in
NetBeans, by adding Throwable.getCause() in JDK 1.4).
So adding methods to a public, non-final class in an API is
a backward-incompatible change.

Given those constraints, what happens if you mix API and SPI
in the same class (which is what JavaMail and most Java standards
do)? Well, you can't add methods compatibly because that could
break subclasses. And you can't remove them compatibly, because
clients could be calling them. You're stuck. You can't compatibly add
or remove anything from the existing classes.

As I've written elsewhere, it is the height of insanity that an
application server vendor is supposed to implement interfaces
and classes that its clients directly call — for exactly
this reason. It would be much cleaner, and allow Java APIs to
evolve much faster, if API and SPI were completely separated.

But part of the appeal to vendors, for better or worse, to
implement these specifications, is that they can extend them
in custom ways that will tie developers who use those
extensions to their particular implementation. This behavior is not
entirely about being evil and locking people in. There is
a genuine case for innovation on top of a standard - that's
how standards evolve, and some people will need functionality
that the standard doesn't yet support.

Enter the capability pattern. The capability pattern is very, very
simple. It looks like this:

public <T> T getCapability (Class<T> type);
     

That's it! It's incredibly simple!
It has one caveat: Any call to getCapability() must
be followed by a null-check. But this is much cleaner
than either catching UnsupportedOperationExceptions,
or if (foo.isAbleToDoX()) foo.doX() or
if (foo instanceof DoerOfX) ((DoerOfX) foo).doX().
A null-check is nice and simple and clean by comparison. It's
letting the Java type system work for you instead of
getting into a wrestling match with it.
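
As a rough sketch of what sits behind that one method, here is a small object that hands out optional capabilities by type. The names Widget, Printer, offer and CapabilityDemo are made up for this example, and backing the lookup with a map is just one possible implementation, not the only one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical capability interface, invented for this sketch.
interface Printer {
    String print(String text);
}

// A final class that hands out optional capabilities by type.
final class Widget {
    private final Map<Class<?>, Object> capabilities = new HashMap<>();

    // Registers a capability; the generic signature keeps key and value in step.
    <T> Widget offer(Class<T> type, T instance) {
        capabilities.put(type, instance);
        return this;
    }

    // Returns the capability, or null if this object does not offer it.
    public <T> T getCapability(Class<T> type) {
        // Class.cast gives a checked conversion and passes null through.
        return type.cast(capabilities.get(type));
    }
}

class CapabilityDemo {
    static String describe(Widget widget) {
        Printer printer = widget.getCapability(Printer.class);
        // The one null-check replaces instanceof tests and exception handling.
        return printer == null ? "no printer" : printer.print("hello");
    }
}
```

The caller never casts and never catches anything; the compiler knows the result is a Printer or null.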

Now, what can you do with it? Here's an example.
In my previous
blog
I introduced an alternative design for how you could
do something like SwingWorker.
It contains a class called TaskStatus, which abstracts the
task status data from the task-performing object itself. It is a simple
interface with setters that allow a background thread to inform
another object (presumably a UI) about the progress of a task.

In light of what we just discussed, TaskStatus really
ought to be a final class. So let's rewrite it a little, to
look like this. We will use a mirror-class for the SPI.

public final class TaskStatus {
    private final StatusImpl impl;
    TaskStatus (StatusImpl impl) {
        this.impl = impl;
    }

    public void setTitle (String title) {
        impl.setTitle (title);
    }

    public void setProgress (String msg, long progress, long min, long max) {
        //We could do argument sanity checks here and make life
        //simpler for anyone implementing StatusImpl
        impl.setProgress (msg, progress, min, max);
    }

    public void setProgress (String msg) {
        //...you get the idea
        impl.setProgress (msg);
    }
    //...
}

public abstract class StatusImpl {
    public abstract void setTitle (String title);
    public abstract void setProgress (String msg, long progress, long min, long max);
    public abstract void setProgress (String msg); //indeterminate mode
    public abstract void done();
    public abstract void failed (Exception e);
}
     

So we have an API that handles basic status display. But people
are going to invent new aspects to status display. We can't
save the world and solve everybody's task-status problems before
they even think of them - and we shouldn't try.

We don't
want to set things up so that it's up to us to implement everything
the world will ever want. Luckily, it doesn't have to be that
way.

Since we've designed our API so that
it can be compatibly added to, we let the rest of the world come up
with things they need for displaying task status, and the ones
that a lot of people need can be added to our API in the future.
The capability pattern lets us do that. We add two methods to
our API and SPI classes:

public abstract class StatusImpl {
    //...
    public abstract <T> T getCapability (Class<T> type);
}
public final class TaskStatus {
    //...
    public <T> T getCapability (Class<T> type) {
        return impl.getCapability (type);
    }
}
   

Let's put that to practical use. Someone might
want to display how much time remains
before the task is done. Our API doesn't handle that.
Through the capability pattern, we can add that. We
(or anyone implementing StatusImpl) can
create the following interface:

public interface StatusTime {
    public void setTimeRemaining (long milliseconds);
}
     

A task that wants to provide this information to the UI, if the
UI supports it, simply does this:

public T runInBackground (TaskStatus status) {
   //May be null if the UI cannot show the time remaining
   StatusTime time = status.getCapability (StatusTime.class);
   for (...) {
      //do some slow work...
      if (time != null) {
          long remaining = ...; //estimate the time remaining
          time.setTimeRemaining (remaining);
      }
   }
   //...return the result
}
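
Pulling the pieces together, the following is a trimmed, runnable sketch of the whole arrangement. TaskStatus, StatusImpl and StatusTime follow the article; RecordingStatus and TimedTask are hypothetical names added for the example, and giving StatusImpl.getCapability() a default body that returns null is an assumption made here so that implementations without extra capabilities need not override it.

```java
import java.util.ArrayList;
import java.util.List;

interface StatusTime {
    void setTimeRemaining(long milliseconds);
}

// SPI, trimmed to one method plus the capability hook.
abstract class StatusImpl {
    public abstract void setTitle(String title);

    // Assumed default: no extra capabilities unless a subclass says otherwise.
    public <T> T getCapability(Class<T> type) {
        return null;
    }
}

// API: a final pass-through to the SPI.
final class TaskStatus {
    private final StatusImpl impl;

    TaskStatus(StatusImpl impl) {
        this.impl = impl;
    }

    public void setTitle(String title) {
        impl.setTitle(title);
    }

    public <T> T getCapability(Class<T> type) {
        return impl.getCapability(type);
    }
}

// Hypothetical implementation that records calls and supports StatusTime.
final class RecordingStatus extends StatusImpl implements StatusTime {
    final List<String> log = new ArrayList<>();

    @Override public void setTitle(String title) {
        log.add("title=" + title);
    }

    @Override public void setTimeRemaining(long milliseconds) {
        log.add("remaining=" + milliseconds);
    }

    @Override public <T> T getCapability(Class<T> type) {
        // Expose only StatusTime, not the implementation class itself.
        return type == StatusTime.class ? type.cast(this) : null;
    }
}

// Hypothetical task: reports time remaining only when the UI can show it.
class TimedTask {
    static void run(TaskStatus status, int steps) {
        status.setTitle("Copying");
        StatusTime time = status.getCapability(StatusTime.class);
        for (int i = 0; i < steps; i++) {
            if (time != null) {
                time.setTimeRemaining((steps - i - 1) * 100L);
            }
        }
    }
}
```

The same task runs unchanged against an implementation that returns null for StatusTime; it simply skips the time updates.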
     

Even better, our Task API is, right now, not tied
specifically to Swing or AWT - it could be used for anything
that needs to follow the pattern of computing something on a
background thread and then doing work on another one. Why not
keep it un-tied to UI toolkits? All we have to do is make the
code that actually handles the threading pluggable (I'll talk
about how you do this simply using the Java classpath for dependency
injection in my next blog). Then the result could be used with
SWT or Thinlet as well, or even in a server-side application.
Instead of a SwingWorker, we have an
AnythingWorker!

But we know we need a UI - and we know we are targeting Swing
right now. How can we really keep this code completely un-tied
from UI code and still have it be useful?

The capability pattern comes to our rescue again - very very
simply. An actual application using this UI simply
fetches the default factory for StatusImpls (you need such a thing
if you want to run multiple simultaneous background tasks and show status for
each — my next blog
will explain how this can be injected just by putting a JAR
on the classpath) and does something like:

Component statusUi = theFactory.getCapability (Component.class);
if (statusUi != null) {
    statusBar.add (statusUi);
}
     

(or if we want to allow only one background task at a time,
we can forget the factory and put the Component fetching
code directly in our implementation of StatusImpl).

If you are familiar with the NetBeans
Lookup API, the capability pattern is really a simplification
of that (minus collection-based results and listening for changes).

The point here is that the capability pattern lets you have an
API that is composed completely of nice, future-proofed,
evolvable, final classes, but the API is extensible
even though it is final. The result is that the API can evolve
faster, with fewer worries about breaking anybody's existing
code. Which reduces the cycle time to improve existing libraries,
and all our software evolves and improves faster, which is good
for everyone.

It also helps one to avoid trying to “save the world” —
by allowing for extensibility, it is possible to create an API that is
useful without needing to handle every possible thing anyone might
ever want to do in that problem domain. Trying to save the world is
what leads to scope-creep and never-finished projects.
In this tutorial I discuss the don't try to save the world
principle in a practical example.

Does the mirror-class design seem a bit masochistic? I think it
does point up a weakness in the scoping rules of the Java language.
It would definitely be nicer to be able to, on the method level, make
some methods visible to some kinds of clients, and other methods
visible to other kinds of clients. But regardless of this, it's even
more masochistic to end up “painted into a corner,”[1]
and unable to fix bugs or add features without potentially breaking
somebody's code. That's how
you end up with ten-year-old unfixed bugs.

[1] painted into a corner:
An English idiom meaning to leave yourself with no options, as if you were
painting the floor of a room in a pattern such
that you end up standing in an unpainted corner of the room, and
you can't leave the corner until the paint dries.

Comments

Argh, yes, it ate the tags. Let's see if it likes SGML entities (they've turned off HTML replies for java.net due to all the porn urls we were getting posted as comments on our blogs - hopefully its escaping isn't that smart...): public <T extends IService> T getCapability (Class<T> type)

@timboudreau: Sorry, didn't parse that...did the forum software eat your last post? Best, Laird

@liquid: I never cared for the definition of IAdaptable - it implies that I think I'm getting "this object as some other type" which just adds conceptual overhead - nothing is actually being adapted (and there are uses of the Adapter pattern which have nothing to do with the code pattern described above). But yes, code-wise it is the same pattern. @mrmorris: True, if you don't document what capability interfaces *might* be available, or can't generally predict it, then the pattern is non-intuitive and should not be used. I don't think it should be used for ad-hoc things, much less everything, because you end up in a wasteland where every class is a "magic bag of stuff." It needs to be used judiciously.

@mrmorris, @timboudreau: You can package up the additional capabilities into their own class or package, making a kind of new "facet" for a given object. I had fun writing about this several years ago (http://weblogs.java.net/blog/ljnelson/archive/2004/09/seventeenth_cen.html).

Actually, it might even be "Extension Object" (AKA "Extension Interface"), whose intent is "Anticipate that an object's interface needs to be extended in the future. Extension Object lets you add interfaces to a class and lets clients query whether an object has a particular extension." Regarding my Eclipse related comment: public interface IAdaptable { public Object getAdapter(Class adapter); } Quote from the book: "What is behind the IAdaptable name? [snip]. Over time the name got changed to IAdaptable _to emphasize the fact that the mechanism enables adapting an existing class to another interface_." (emphasis mine) You explain very well the benefits you gain from using the pattern, i'm just not sure we need to give it a new name that's all.

By great minds, i meant you and the eclipse guys, not me, of course :)

The getCapability method, to me, effectively adapts the current object to respond to the richer API. The old api is incompatible with the new one (here for backwards compatibility and future proofing reasons), the getCapability makes them compatible. Is my reasoning far fetched ? Great minds seem to think alike, in eclipse they have the exact same extensible concept (IAdaptable IIRC, I believe i read this in gamma's and beck book about eclipse, i'll try to find a more precise reference if anyone wants it)

@liquid: Call what Adapter? If you mean the "mirror-class" term, I can see why one would think of it as an adapter, but generally an adapter is something that resolves an "impedance mismatch" or makes two incompatible things compatible. I don't think a final API class that is simply a pass-through to an abstract SPI class is actually "adapting" anything. But perhaps you meant something else.

We tend to call this Adapter around here

Interesting article. I have followed the same kind of pattern to design the http://code.google.com/p/cubeon/ API/SPI to provide capabilities and features (docs are still pending). Feel free to review it and let me know.

What I love about this pattern is how well it composes over a multitude of scenarios, whether you need 1:M association, publish-subscribe support, bound types etc., without pulling in dependencies. What I do not like about it is how it becomes hard to know up-front, at design time, what kind of services are exposed. Which is why I would only use the pattern for relatively coarse-grained stuff.

Just trying to be clear here, is the Capability concept in addition to some core API that is also implemented, or in lieu of it?

I mean, this tastes like a mini-Lookup system that's API specific, or perhaps better put, API scoped.

But, at the high level, this seems little different (conceptually) than just simply tacking a Map on to any Class you want to provide for some potential of extra yet to be defined properties in the future.

The way it's described here, it seems more like being used for optional, yet undefined, features, rather than extensions to an underlying API. For example, if you required the Status capability, you're pretty much stuck throwing an exception when getCapability returns null.

I guess my major complaint is simply that you can end up with "This API support X, Y, and Z and whatever else we may think up later". Hardly a guarantee.

Eventually, going reductio ad absurdum, everything will just look like: public class Thing { public Map stuff; public Object getCapability(Class t); }

I can see value in it, to a point, but, it's an awfully big out I think.

Capabilities are not only useful in APIs (where it is a clever way of combining the Adapter and Factory patterns) but also in protocols and service interfaces of a (distributed) application. For example, if you look over at the Second Life Grid Blog, the nice folks at Linden Lab apply that to reengineering and de-coupling their game architecture (supporting for example multiple client versions or refactoring of central services). Gruss Bernd

Well, it enables one to do something useful, while recognizing that version 1.0 of any API is usually imperfect and new requirements will be found. If you want to keep strict backward compatibility, this gives you a way to do that and evolve the API later. For example, if the StatusTime interface proved to be something everyone needs, you'd just add a setter to TaskStatus for it in the next rev of the API. Too often object oriented design is taught as if modeling a business problem is the same as modeling a 2D point. With the 2D point, you know darned well you're never going to need to add a third coordinate. Most problems people end up modeling in code are nowhere near as cut-and-dried as that - they will evolve, new requirements will be discovered, and the API will need to evolve to reflect that. So this is one set of patterns helps to anticipate that. I don't think this is a pattern to use all over the place, otherwise you do end up in a wasteland where every object is just a thing with a magic bag of stuff hanging off it. Re throwing the exception, a caveat of using this pattern is that no code should ever depend on getCapability() returning non-null (unless the class they're calling documents that it really will never be null - in which case, that class probably ought to have the methods of the requested object itself, since clearly whatever is being asked for has graduated to first-class functionality).

> What stuff, exactly? What do I need to care about? My friend Jon points out that you can use a marker interface to narrow the scope of what can be returned, i.e. <? extends IFoo> getCapability() (actually we had a lengthy argument yesterday about this - in NetBeans we were once burned by Node.Cookie in this pattern (lots of people saying "Why do I have to depend on this huge Nodes JAR just because you're making me implement this marker interface?!"). But Jon does have a good point that, in smaller-scoped things, if you use a marker interface, an IDE will let you instantly see all the implementations of that interface, so it's easy to determine what you can and can't get.

@whartung (and, I suppose @timboudreau as well): If you continue to think of capabilities/facets/views/perspectives of an object as, well, capabilities/facets/views/perspectives *of that object*, thus implying that *that object* is somehow more the focal point than its capabilities are, then yes, this kind of thing can be an "out", as you put it. More briefly, at any point in time, an object with this capabilities escape clause can have 1..M capabilities, where M approaches infinity because I haven't thought of what they are yet. But if you look at it in *reverse*, and say that every capability has exactly one *identity*, then this same problem, expressed a little bit differently, doesn't suffer from the "out"/bag-of-stuff problem. If you treat the capabilities as the "master" objects, or the beefy objects that you're *really* interested in, and the thing-with-capabilities as the substrate they stick to, and if you disperse the capability lookup problem to each capability (after all, they may be located in different places!) then the same problem becomes a little more manageable (IMHO). What I mean is, instead of saying, for example (and God help me I hate this commenting system; I always confuse it with the, e.g., Swing forums, whose commenting system syntax is different, and I never remember WHICH IS WHICH): final VitalSigns vitalSigns = person.getCapability(VitalSigns.class); ...where you sort of hope you know or you read the documentation where it says that indeed a Person can provide a VitalSigns capability, you say: final VitalSigns vitalSigns = VitalSigns.findFor(person); ...which (to me) shows more clearly that you can scrounge up VitalSigns for a person without having to worry about whether the person even knows he is providing that capability. 
The first (the one that Tim is talking about) implies that an exceptionally well-constructed Person object might have all *sorts* of capabilities--maybe even hundreds of them (not likely, but that's the connotation of that API). That gives you (me) a queasy feeling, and is one of the reasons that I am always slightly wary of, if ultimately impressed by, the NetBeans APIs: I can monitor all of these Lookups for...for...for...well, lots of stuff. What stuff, exactly? What do I need to care about? The second implies (and can enforce, due to good old polymorphic method arguments) that a VitalSigns instance may be "found" (whatever that means for you) for a given person. Now, if determined, you can still get into a heap of trouble. You *could* write the VitalSigns findFor() method this way: public static VitalSigns findFor(final Object anythingAtAll) { /* dogs, cats, rocks, toothpaste, etc. */ } ...but given that you usually know the domain you're working with you'd probably write it this way: public static VitalSigns findFor(final Person person) { /* people and subclasses only */ } ...and there may even be a generics incantation involving lots of angle brackets and question marks that would allow you to restrict it to just Person instances. Back in 1997? or so? there was a company called objectspace that called this approach...Views? Facets? Tim, thanks again for these *really* interesting posts. Best, Laird

@swpalmer > Directly adding a new interface to an object may have the risk of colliding method > names with some derived class Hence the advice to make API classes final. It's been over a decade since I did Windows programming, but I don't remember COM being typesafe for this sort of thing (and wasn't it GUIDs or something?). @whartung > you still can't change the original API Agreed, you can't change things that would break compatibility. But you are also less likely to overdesign the original API to solve every corner-case in the universe, so hopefully you're less likely to *need* to break compatibility. If as I suggest, there is a migration path for capabilities (where it makes sense) to eventually migrate into the official API, it shouldn't result in an ever-growing set of capability interfaces.

It reminds me of COM's QueryInterface. I can't stand COM. :-) The pattern is useful, but it's also awkward as the number of "capabilities" increases. Directly adding a new interface to an object may have the risk of colliding method names with some derived class, but the code can be much cleaner so there is a trade-off.

My only concern, however, is that it doesn't really solve the problem (unless I'm missing something) of evolution. It lets you tack stuff on to an API, much like adding a Map to an object lets you add "properties". And, frankly, I'm all for that kind of "ya know, we need to do this one little thing for this one edge case used for the guys in Pacoima...". But once you have your API set, even with capabilities, you're still stuck. You can write your code against the API for v1.0; as it drifts in to v1.5, maybe a few capabilities or a few de rigueur properties start showing up. Now when you want to create the new version, you still can't change the original API. You're back to "LayoutManager2". Otherwise you're stuck with a 5 year old API and a bunch of codified new functionality cursed to live as capabilities or properties forever forward. So, while it does give you this "bag of stuff" that you can hang off the API, and certainly adds some flexibility, I think it kind of punts around the problem as a whole. (Mind, I'm not helpful, I don't really have a solution, just musing about what rubs me the wrong way with what we have.)