
Mad Metadata Plan

Posted by jive on July 15, 2005 at 4:21 PM PDT

Using the Extensible Interface pattern for an original take on the metadata problem plaguing the spatial world (see EOGEO for background).

Thanks to those at OSG'05 for the inspiration. Now if only someone will pay me to solve this problem :-)


A couple of people have asked about my mad metadata plan (tm). Since it is a Friday, and other mad plans are flowing around the email lists, I thought I should play....

Briefly:

  • metadata is by definition useless
  • metadata is by definition useless for what you are doing right now
  • metadata is *used* by other processes, downstream as it were
  • metadata is used to provide additional information to affect or control later processes
  • metadata is open ended
  • metadata is also standard
  • metadata is standardized for a few well-known processes (such as discovery)
  • metadata standards will be created in the future (for new uses)

Now in the XML world metadata is relatively easy:

  • use the document to hold the metadata
  • when working with a standard a schema is available to constrain your document
  • use XPath to extract out needed bits of information
  • where a standard exists it can be used to construct your XPath expression
  • where a couple standards exist use XSLT to transform between them
  • when new standards come along in the future they can be handled in a similar fashion, using similar tools
  • for spatial metadata we can use Filter to perform the usual sorts of queries, where "attributes" are extracted out via XPath
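
As a minimal sketch of the XML approach above, the following uses the standard javax.xml.xpath API to pull a couple of values out of a metadata document. The record layout (metadata, title, creator, bounds) is made up for illustration and is not taken from any standard schema.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class XPathMetadataDemo {
    // Hypothetical, simplified metadata record; not a real standard schema.
    static final String RECORD =
            "<metadata>"
          + "<title>Sample roads layer</title>"
          + "<creator>Example Author</creator>"
          + "<bounds minx=\"-139\" miny=\"48\" maxx=\"-114\" maxy=\"60\"/>"
          + "</metadata>";

    /** Evaluates an XPath expression against an XML string and returns its string value. */
    static String extract(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // An element value and an attribute value, each addressed by an XPath expression.
        System.out.println(extract(RECORD, "/metadata/title"));
        System.out.println(extract(RECORD, "/metadata/bounds/@minx"));
    }
}
```

Where a standard schema exists, the expressions would be written against that schema instead of this toy layout.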

In the object oriented world metadata has given us a little bit of grief:

  • we like to work with interfaces
  • and support multiple implementations
  • we have some nice ISO 19115 interfaces for our spatial information, based on a standardized schema
  • we have a history of thinking about Dublin Core
    (forms the basis of the GeoServer FeatureTypeInfo and uDig GeoResourceInfo interfaces)

  • services (ISO 19119) resist being turned into interfaces as no schema is readily available

As for specification we have a few:

  • OGC CS-W is generic
  • it just provides the work flow of request and response

  • An ISO profile exists based on ISO 19115 & ISO 19119 ideas, split into Data, Services and Associations between the two
    (These are being deployed in Europe based on Deegree code)

  • An ebRIM profile is around, but exact instructions on how to accomplish anything with it are still forthcoming
  • GeoNetwork (http://www.eogeo.org/Workshops/EOGEO2005/7-eogeo2005-ticheler-abstract) has an implementation we also want to play with (ISO 19115 and ISO 23950 (Z39.50))

So here is the start of the mad metadata plan:

  • we want to be object-oriented, but we have an open set of standards to conform to?
  • and we tend to capture these standards as interfaces?

The solution is to use the *Extensible Interface* pattern...


Q: What is the Extensible Interface Pattern?

AKA: Extensible Object, IAdaptable (Eclipse), IResolve (uDig)

Intent: "Anticipate that an object's interface needs to be extended in the future. Extension Object lets you add interface to a class and lets clients query whether an object has a particular extension."

Q: How is that done?

Object obj.getExtention( Class extension )

Where null is returned if the requested interface is not available

Q: Why is that cool?

Because of the part I did not tell you yet: the implementation of getExtention should be backed by a *Factory*, and not just any factory, but one that can be extended by a plug-in system.

In geotools an example is the use of FactoryFinder and DataStoreFactory.

This lets us teach an old object new tricks.
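
Here is a rough sketch of how that could hang together in Java. Every type in it (ExtensionFactory, Registry, Metadata, Titled) is hypothetical and invented for illustration; the method is spelled getExtension here, and the static Registry is only a stand-in for whatever plug-in lookup is really used (it is not the geotools FactoryFinder API).

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;

final class ExtensionDemo {
    /** Hypothetical plug-in contract: adapts a source object to a requested interface. */
    interface ExtensionFactory {
        boolean canAdapt(Object source, Class<?> target);
        Object adapt(Object source, Class<?> target);
    }

    /** Hypothetical registry; a real system would discover factories via its plug-in mechanism. */
    static final class Registry {
        private static final List<ExtensionFactory> FACTORIES = new CopyOnWriteArrayList<>();

        static void register(ExtensionFactory factory) {
            FACTORIES.add(factory);
        }

        /** Returns the first adapter offered by a registered factory, or null if none applies. */
        static <T> T find(Object source, Class<T> target) {
            for (ExtensionFactory factory : FACTORIES) {
                if (factory.canAdapt(source, target)) {
                    return target.cast(factory.adapt(source, target));
                }
            }
            return null;
        }
    }

    /** Metadata kept in the form it arrived in (a plain Map stands in for a DOM here). */
    static final class Metadata {
        final Map<String, String> raw;

        Metadata(Map<String, String> raw) {
            this.raw = raw;
        }

        /** The Extensible Interface hook: null means the interface is not available. */
        <T> T getExtension(Class<T> extension) {
            if (extension.isInstance(this)) {
                return extension.cast(this);
            }
            return Registry.find(this, extension);
        }
    }

    /** An interface added later, unknown to whoever wrote Metadata. */
    interface Titled {
        String getTitle();
    }

    /** What a plug-in would contribute: a factory that teaches Metadata the Titled interface. */
    static void installTitledFactory() {
        Registry.register(new ExtensionFactory() {
            @Override
            public boolean canAdapt(Object source, Class<?> target) {
                return target == Titled.class
                        && source instanceof Metadata
                        && ((Metadata) source).raw.containsKey("title");
            }

            @Override
            public Object adapt(Object source, Class<?> target) {
                Map<String, String> raw = ((Metadata) source).raw;
                return (Titled) () -> raw.get("title");
            }
        });
    }

    public static void main(String[] args) {
        installTitledFactory();
        Metadata metadata = new Metadata(Collections.singletonMap("title", "Sample roads layer"));
        Titled titled = metadata.getExtension(Titled.class);
        System.out.println(titled.getTitle());
        // No registered factory offers Runnable, so the lookup yields null.
        System.out.println(metadata.getExtension(Runnable.class));
    }
}
```

The Metadata class never mentions Titled; the knowledge lives entirely in the factory, which is the part a plug-in can supply after the fact.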


This lets us accomplish something very cool: we can keep our metadata Object *in the format it arrived in* (backed by a DOM, a JDOM, or a cluster of Objects), and we have a method to call to ask for that data through an interface known to us.

Better yet, the people capturing the information do not have to worry about ISO19115 or ISO19119 or ebRIM; such mappings can be handled by someone else.

What would these mappings look like? If it were a simple XML problem we would provide some of this with XPath expressions. If we use JXPath (the Apache project) we can use XPath for both DOM and clusters of Objects/Collections (aka POJO). If we play our cards right we can make the mappings completely orthogonal to the metadata storage facility.

Basically it should only matter that we know how to go from ISO19115 to DublinCore, not whether we are using the geoapi ISO 19115 interfaces or an XML document with the ISO19115 schema.

Very cool.
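
To make the idea of a mapping a bit more concrete, here is a small sketch that keeps the ISO-to-Dublin-Core mapping as a plain table of XPath expressions, separate from the document it is applied to. It uses only the standard javax.xml.xpath API, so it covers the DOM case only; applying the same expressions to POJOs is where something like JXPath would come in. The element paths are a simplified, namespace-free stand-in and are an assumption for illustration, not the real ISO 19115 schema.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class DublinCoreMapping {
    // Assumed mapping: Dublin Core element name -> XPath into a simplified record.
    static final Map<String, String> ISO_TO_DC = new LinkedHashMap<>();
    static {
        ISO_TO_DC.put("title", "/MD_Metadata/identificationInfo/citation/title");
        ISO_TO_DC.put("description", "/MD_Metadata/identificationInfo/abstract");
        ISO_TO_DC.put("language", "/MD_Metadata/language");
    }

    // Hypothetical sample record matching the assumed paths above.
    static final String SAMPLE =
            "<MD_Metadata><language>en</language><identificationInfo>"
          + "<citation><title>Sample roads layer</title></citation>"
          + "<abstract>Road centrelines</abstract>"
          + "</identificationInfo></MD_Metadata>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse metadata", e);
        }
    }

    /** Applies every mapping entry to the document and returns the Dublin Core view. */
    static Map<String, String> toDublinCore(Document source) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, String> dublinCore = new LinkedHashMap<>();
        try {
            for (Map.Entry<String, String> entry : ISO_TO_DC.entrySet()) {
                dublinCore.put(entry.getKey(), xpath.evaluate(entry.getValue(), source));
            }
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Bad mapping expression", e);
        }
        return dublinCore;
    }

    public static void main(String[] args) {
        System.out.println(toDublinCore(parse(SAMPLE)));
    }
}
```

The mapping table knows nothing about how the record is stored; swapping the evaluator is what would let the same table work against objects instead of a DOM.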

Okay, a few more bits of the puzzle:

  • the ebRIM camp seems to provide the concept of "slots" for information that is useful enough to bother using an OGC Filter against. These slots are often defined via XPath mappings, and every grouping of slots I have ever seen starts off with Dublin Core as the basis.

    Q: So what is an implementor to do?

    • XML types just hack with XPath; they can reuse their OGC Filter code without a lot of trouble
    • Deegree made their Metadata implement Feature (so they could use Filter on it)

    Sounds good to me: our Metadata object should *be* a Feature, and the FeatureType can capture the available *slots* as attributes. Usually these are a superset of those defined by DublinCore.

    Putting the bits together:

    • Extensible Interface pattern for our Metadata "Object"
    • Factory pattern used to supply additional Mappings to new Interfaces
    • An implementation of these Mappings should use XPath, and be applicable to XML and POJO
    • Object isA Feature, FeatureType usually derived from DublinCore

    We get a system that can pass extra data through, can be used with OGC Filter, allows the use of our existing Metadata interfaces for ISO 19115, and can be taught new tricks.
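
A toy sketch of the "Metadata isA Feature" part, to show why slots make filtering easy. The types here (MetadataFeature, the slot list, propertyIsEqualTo) are hypothetical stand-ins written for this example; they are not the geotools Feature, FeatureType, or Filter classes, only the same shape in miniature.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class SlotFeatureDemo {
    /** Hypothetical stand-in for a FeatureType: the names of the available slots. */
    static final List<String> DUBLIN_CORE_SLOTS =
            Arrays.asList("title", "creator", "subject", "description");

    /** Hypothetical stand-in for a Feature: one value per slot, null when unset. */
    static final class MetadataFeature {
        private final Map<String, Object> values = new HashMap<>();

        MetadataFeature set(String slot, Object value) {
            if (!DUBLIN_CORE_SLOTS.contains(slot)) {
                throw new IllegalArgumentException("Unknown slot: " + slot);
            }
            values.put(slot, value);
            return this;
        }

        Object getAttribute(String slot) {
            return values.get(slot);
        }
    }

    /** Stand-in for an OGC Filter PropertyIsEqualTo comparison on a single slot. */
    static Predicate<MetadataFeature> propertyIsEqualTo(String slot, Object literal) {
        return feature -> literal.equals(feature.getAttribute(slot));
    }

    /** Returns the records accepted by the filter, in their original order. */
    static List<MetadataFeature> query(List<MetadataFeature> records,
                                       Predicate<MetadataFeature> filter) {
        List<MetadataFeature> matches = new ArrayList<>();
        for (MetadataFeature record : records) {
            if (filter.test(record)) {
                matches.add(record);
            }
        }
        return matches;
    }

    /** Two made-up records for the demo. */
    static List<MetadataFeature> sampleRecords() {
        return Arrays.asList(
                new MetadataFeature().set("title", "Sample roads layer")
                        .set("subject", "transportation"),
                new MetadataFeature().set("title", "Sample rivers layer")
                        .set("subject", "hydrography"));
    }

    public static void main(String[] args) {
        List<MetadataFeature> hits =
                query(sampleRecords(), propertyIsEqualTo("subject", "transportation"));
        for (MetadataFeature hit : hits) {
            System.out.println(hit.getAttribute("title"));
        }
    }
}
```

Because the filter only ever asks for a slot by name, it does not care whether the value behind that slot came from a DOM, a POJO, or an XPath mapping.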

    Jody