
The Observation API (hey, it's not the Observable pattern)

Posted by fabriziogiudici on April 29, 2009 at 7:25 PM EDT
As I said in February, I'm going to post on my blog a series about the use of Semantic Technologies and how they are being used in some of my projects.

Today let's start by giving a small domain model and then designing a related abstract API. In the next post I'll show you how to implement it on top of an RDF triple store.

Models

The API is designed to manage a set of "observations" (of whatever, by anybody, etc... let's keep it generic in order to respect the "AAA slogan": Anyone can say Anything about Any topic), so I'm calling it "the Observation API". 

Let's introduce the key abstractions:
  • An Observer represents an entity capable of making observations. It may be a physical person, a device, or anything else that makes sense.
  • An Observation is made at a certain time and Location, by one or more Observers and is composed of one or more ObservationItems.
  • An ObservationItem pairs an Observable with a Cardinality.
  • An Observable has no special properties and can be anything.
  • A Cardinality may be a single integer number, a closed or open range of integers (e.g. "between 10 and 20", or "more than 5"), or "undefined".
  • An ObservationSet is a set of Observations.
  • A Source is where a certain datum comes from (i.e. who or what provided it and inserted it into the database).
Domain Model
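To make the description of Cardinality a bit more concrete, here is a minimal sketch of how such a value type might look. It is not the actual class of the API: valueOf() and rangeOf() are the factory names used in the examples later in this post, while moreThan(), undefined(), isDefined() and contains() are names assumed here purely for illustration, as is the choice of inclusive bounds.

```java
import java.util.OptionalInt;

/** Hypothetical sketch of a Cardinality value type; not the real API class. */
final class Cardinality {
    private static final Cardinality UNDEFINED =
            new Cardinality(OptionalInt.empty(), OptionalInt.empty());

    private final OptionalInt min; // empty means "undefined"
    private final OptionalInt max; // empty means "no upper bound"

    private Cardinality(OptionalInt min, OptionalInt max) {
        this.min = min;
        this.max = max;
    }

    /** A single integer number, e.g. "23". */
    static Cardinality valueOf(int value) {
        return rangeOf(value, value);
    }

    /** A closed range, e.g. "between 10 and 20" (both bounds inclusive). */
    static Cardinality rangeOf(int min, int max) {
        if (min < 0 || max < min) {
            throw new IllegalArgumentException("Invalid range: " + min + ".." + max);
        }
        return new Cardinality(OptionalInt.of(min), OptionalInt.of(max));
    }

    /** An open range, e.g. "more than 5" (assumed name, strictly greater). */
    static Cardinality moreThan(int value) {
        return new Cardinality(OptionalInt.of(value + 1), OptionalInt.empty());
    }

    /** The "undefined" cardinality (assumed name). */
    static Cardinality undefined() {
        return UNDEFINED;
    }

    boolean isDefined() {
        return min.isPresent();
    }

    /** True if the given count is compatible with this cardinality. */
    boolean contains(int count) {
        return isDefined()
                && count >= min.getAsInt()
                && (!max.isPresent() || count <= max.getAsInt());
    }

    @Override
    public String toString() {
        if (!isDefined()) {
            return "undefined";
        }
        if (!max.isPresent()) {
            return "more than " + (min.getAsInt() - 1);
        }
        if (min.getAsInt() == max.getAsInt()) {
            return Integer.toString(min.getAsInt());
        }
        return min.getAsInt() + "-" + max.getAsInt();
    }
}
```

Keeping the three cases behind static factories means client code never has to know how a range is represented; whether the real class works this way is an assumption of the sketch.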

All of the above concepts are going to become classes, so I'm now referring to them with the code typographic convention. Observer, Observable, Location and Source have a single property exposed through getDisplayName(), which is the identifier used for rendering the object in a UI. Applying some common patterns, we add a few more classes for the solution model:
  • An ObservationManager provides ways to retrieve and create ObservationSets.
  • A Finder is used to perform various queries.
  • An Observation.Builder provides ways to create new Observations.
The only architectural dependency introduced in the API is related to the use of components à la NetBeans Platform, whose lookup logic is hidden behind an ObservationManager.Locator. As annotations have recently been introduced for dependency injection in the Platform, this class will be removed at some point in the future, making the API mostly technology-agnostic.

UML Diagram

Scenarios and examples

While I have said that this API is very generic, for the following scenarios I'm using the context of birding (or birdwatching), which BTW is the reason I designed it.

First of all, code dealing with the Observation API should create an ObservationSet:
ObservationManager observationManager = ObservationManager.Locator.findObservationManager();
ObservationSet temporarySet = observationManager.createObservationSet();
// or
ObservationSet persistentSet = observationManager.createObservationSet(backingStore); // backingStore is a generic Object, architecture dependent
ObservationManager provides a convenience method, findOrCreate(), for creating various entities of the API, even though they can also be instantiated in other ways. Let's remember that things such as Observer, Observable, Location and Source are pretty generic, and their implementations could come from other APIs. This is a very important point: in a real application I don't expect ObservationManager.findOrCreate() to be used on its own, except for simple, ancillary entities, because in the real world I'll mostly need concrete entities with their own properties and behaviour. But the plain use of this method is fine for tests (and examples).



Here is how I could create a new Observation of about 100-150 flamingoes and 23 spoonbills:
Date when = dateFormat.parse("29-04-2007");
Location castiglione = observationSet.findOrCreate(Location.class, "Castiglione della Pescaia", EmptyInitializer.instance());
Observer fabrizio = observationSet.findOrCreate(Observer.class, "Fabrizio Giudici", EmptyInitializer.instance());
Observable flamingo = observationSet.findOrCreate(Observable.class, "Flamingo", EmptyInitializer.instance());
Observable spoonbill = observationSet.findOrCreate(Observable.class, "Spoonbill", EmptyInitializer.instance());

Observation observation = observationSet.createObservation().
date(when).
location(castiglione).
item(spoonbill, Cardinality.valueOf(23)).
item(flamingo, Cardinality.rangeOf(100, 150)).
observer(fabrizio).
build();
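As a rough idea of how a builder supporting the chained calls above could be put together, here is a minimal, self-contained sketch. It is not the real Observation.Builder: ObservationSketch is a hypothetical stand-in, plain strings replace Location, Observer, Observable and Cardinality, and the checks in build() are only an assumption derived from the domain model (at least one Observer and at least one ObservationItem).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.List;

/** Hypothetical, simplified stand-in for Observation; strings replace the real entity types. */
final class ObservationSketch {
    final Date date;
    final String location;
    final List<String> observers;
    final List<String> items;

    private ObservationSketch(Builder builder) {
        this.date = new Date(builder.date.getTime()); // defensive copy: Date is mutable
        this.location = builder.location;
        this.observers = Collections.unmodifiableList(new ArrayList<>(builder.observers));
        this.items = Collections.unmodifiableList(new ArrayList<>(builder.items));
    }

    static Builder builder() {
        return new Builder();
    }

    /** Each setter returns the builder itself, which is what enables the chained style. */
    static final class Builder {
        private Date date;
        private String location;
        private final List<String> observers = new ArrayList<>();
        private final List<String> items = new ArrayList<>();

        Builder date(Date date) {
            this.date = date;
            return this;
        }

        Builder location(String location) {
            this.location = location;
            return this;
        }

        Builder observer(String observer) {
            observers.add(observer);
            return this;
        }

        Builder item(String observable, String cardinality) {
            items.add(observable + ": " + cardinality);
            return this;
        }

        /** Validates the collected data and creates the immutable result. */
        ObservationSketch build() {
            if (date == null || location == null) {
                throw new IllegalStateException("date and location are required");
            }
            if (observers.isEmpty() || items.isEmpty()) {
                throw new IllegalStateException("at least one observer and one item are required");
            }
            return new ObservationSketch(this);
        }
    }
}
```

In this sketch all the validation happens in build(), so a half-specified observation can never escape from the builder.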
Note how a fluent interface has been used to specify all the data items needed to build an Observation. If this operation is performed against a persistent ObservationSet, the inserted data are made persistent too (possibly only at the next transaction commit; details are opaque to the API). The only required parameter for findOrCreate(), besides the entity type, is a string representing the display name of the entity; if an entity of the given type and display name already exists, it is returned instead of being created. A third argument is an implementation of the interface:
public interface Initializer<T> 
  {
    public void initialize (T entity);
  }
that is responsible for initializing the entity's state if it is created from scratch. This will be discussed in a future post; in the meantime you just need to know that EmptyInitializer.instance() returns a "no op" initializer that does nothing.
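The find-or-create behaviour described above can be sketched with a small in-memory registry. This is only an illustration under stated assumptions, not the actual implementation: EntityRegistry, register() and SimpleObservable are hypothetical names, the Initializer interface is a package-private copy of the one shown above, and entities are keyed by type plus display name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Package-private copy of the Initializer interface shown in the post. */
interface Initializer<T> {
    void initialize(T entity);
}

/** Hypothetical in-memory registry illustrating find-or-create semantics. */
final class EntityRegistry {
    private final Map<Class<?>, Function<String, ?>> factories = new HashMap<>();
    private final Map<String, Object> entities = new HashMap<>();

    /** Tells the registry how to instantiate a type from its display name (assumed mechanism). */
    <T> void register(Class<T> type, Function<String, T> factory) {
        factories.put(type, factory);
    }

    /** Returns the existing entity with that type and display name, or creates and initializes one. */
    <T> T findOrCreate(Class<T> type, String displayName, Initializer<T> initializer) {
        String key = type.getName() + "/" + displayName;
        Object existing = entities.get(key);
        if (existing != null) {
            return type.cast(existing); // found: the initializer is NOT invoked
        }
        Function<String, ?> factory = factories.get(type);
        if (factory == null) {
            throw new IllegalArgumentException("No factory registered for " + type.getName());
        }
        T created = type.cast(factory.apply(displayName));
        initializer.initialize(created); // only runs for freshly created entities
        entities.put(key, created);
        return created;
    }
}

/** Hypothetical trivial entity with a display name and one extra, initializable property. */
final class SimpleObservable {
    private final String displayName;
    String notes = "";

    SimpleObservable(String displayName) {
        this.displayName = displayName;
    }

    String getDisplayName() {
        return displayName;
    }
}
```

With this sketch, after `registry.register(SimpleObservable.class, SimpleObservable::new)`, calling `findOrCreate(SimpleObservable.class, "Flamingo", ...)` twice returns the same instance, and the second initializer is never invoked.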

Now, how to perform queries? The starting point is the ObservationSet of course:

List<Observable> observables = observationSet.find(Observable.class).
                                              sort(SortCriterion.DISPLAY_NAME).
                                              from(100).
                                              max(20).
                                              results();


ObservationSet.find(Class<T>) returns a Finder<T>, which encapsulates the query logic. It also provides a fluent interface for specifying some generic (optional) parameters of the query; in the above example, the sort criterion and the "paging" of results (starting from the 100th, for a maximum of 20 items), which is easy to integrate into user interfaces that support paging. You can specify a sort criterion for a few generic attributes of an Observation (e.g. DISPLAY_NAME, DATE, LOCATION, OBSERVABLE). There is nothing more to specify here, as we are still dealing with generic entities.
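To show how the sort/from/max parameters could combine, here is a minimal in-memory sketch of a finder. It is not the real Finder<T>, which will sit on top of a triple store: InMemoryFinder is a hypothetical name, a plain Comparator stands in for SortCriterion, and from() is assumed to be a zero-based offset.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical in-memory finder illustrating sorting plus paging; not the real Finder<T>. */
final class InMemoryFinder<T> {
    private final List<T> source;
    private Comparator<? super T> order;          // null means "keep the source order"
    private int firstResult = 0;                  // zero-based offset (assumption)
    private int maxResults = Integer.MAX_VALUE;   // no limit unless max() is called

    InMemoryFinder(List<T> source) {
        this.source = new ArrayList<>(source);
    }

    InMemoryFinder<T> sort(Comparator<? super T> order) {
        this.order = order;
        return this;
    }

    InMemoryFinder<T> from(int firstResult) {
        if (firstResult < 0) {
            throw new IllegalArgumentException("firstResult must be >= 0");
        }
        this.firstResult = firstResult;
        return this;
    }

    InMemoryFinder<T> max(int maxResults) {
        if (maxResults < 0) {
            throw new IllegalArgumentException("maxResults must be >= 0");
        }
        this.maxResults = maxResults;
        return this;
    }

    /** Sorts first, then extracts the requested page; nothing is computed before this call. */
    List<T> results() {
        List<T> sorted = new ArrayList<>(source);
        if (order != null) {
            sorted.sort(order);
        }
        int start = Math.min(firstResult, sorted.size());
        int end = (int) Math.min((long) start + maxResults, sorted.size());
        return new ArrayList<>(sorted.subList(start, end));
    }
}
```

Since nothing happens until results() is called, a store-backed implementation of the same interface could translate the collected parameters into a single query instead of sorting in memory.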
Finders are used in other places of the API too: for instance, you can do:

Observable observable = ...;
Finder<ObservationItem> finder = observable.findObservationItems();

or
Location location = ...;
Finder<ObservationItem> finder = location.findObservationItems();
that is, you can query single entities for the related concepts.

Enough for today. Some resources:

Comments

It would make sense to have a range for the Date. Let's say I didn't choose it for the first iterations: the application using this code is manipulating a few thousand records and, so far, there has been no strict need for it. But of course, once I get JSR-310 in (see comments above), I could use the classes that represent an interval of time.

Hi Fabrizio, a natural (at least in my head) extension of the model could be using a range instead of a date; am I the first fool that thinks of it - maybe because I have completely misunderstood your intent - or have you explicitly decided to keep it this way? I'd like to hear the logic behind this choice. PS: great job... as always!

Of course. Thanks for the correction.

> but while I'm confident with JodaTime and soon with JSR 311

You meant JSR-310: Date and Time API

Hi Stephen. My idea is not to use Date any longer (it has disappeared from all my paid projects, of course), but while I'm confident with JodaTime and soon with JSR 311, at the moment I don't know how to persist a date type other than Date in the RDF store I've decided to use. I'm pretty sure it's possible, but I've left that for a future iteration. These APIs won't be frozen soon.

Since your API takes in a Date instance to define the date, your API may well yield different results depending on the time zone of the user running the code (perhaps persisted in one time zone and attempted to be read in another).