GeoServer WFS-T
What makes GeoServer Special
Recently there has been a lot of press about MapServer, everything from
MapServer Junior to the formation of a Foundation. Welcome to the other
team, the Java team, and here is what GeoServer does so well - WFS-T.
For those new to the Open Geospatial Consortium standards scene, there are a couple of levels of compliance:
- WFS Compliant - Supports GetCapabilities, DescribeFeatureType, GetFeature
- WFS-T Compliant - Supports Transaction (aka allows you to modify information)
There are a couple of leftover things, like LockFeature and
GetFeatureWithLock - they don't get their own abbreviation. Rest assured,
GeoServer does those as well.
MapServer gets Ugly
One thing that both GeoServer and MapServer products share is a
fair bit of "ugly". MapServer starts with an advantage in this regard,
as it is hard for us to compete with pointers.
We have had occasion to look at the MapServer codebase (for the label positioning code, I think), and at the time we were
relieved. It was an intricate, ornate construction, where the
slightest tweak would have far-flung consequences. What
MapServer Classic (I mean Cheetah) has going for it is the open
source advantage: it has been deployed, and tweaked, so often that it
shines. Shines in this case means speed.
The only way to really clean it up would be to start again from the same
building blocks, something we did not think they would do unless a
major sponsor came around. Enter Autodesk ...
I suppose that means we will need to work for a living ...
GeoServer gets Ugly
Okay I will fess up, the following code is 98% my fault. At this point
I will pass the keyboard over to Richard Gould as he introduces the 800 line function of Doom.
Richard Gould writes:
In January of 2004 Jody and I were
working on the Validation Web Feature Service aspect of GeoServer. More
specifically Jody was working on that, and I was working on GeoServer's web
configuration interface. The day before the project was due, Jody
asks me for help. After two weeks of failed debugging, Jody decided we
should re-write the WFS transaction code. Let me first say that before
starting this work on GeoServer, I had never even heard of GIS before.
I didn't even really understand what a Feature was. So for the next 17.5
hours, reaching until 5:30 am, I ran around absolutely clueless of the
big (or even small) picture. Somehow, under the guidance of Jody, we
came up with the biggest method of Java I have ever seen, and it worked.
These 800 lines of code almost made Chris Holmes give up on open
source! I have not touched the code since, but I fear it is still
running about like some titanic bull.
I can't ask for a better intro than that ... thanks Richard, the truth is supposed to hurt, right?
Transaction - The Method Behind the Madness
Yes, there actually was a plan: it involved introducing test cases to
the codebase so we could tell when we were done. Up until this time GeoServer subsisted on a diet of CITE
tests which only Chris Holmes managed to reliably run. Fixes involved
poking the code, and then pinging an external test suite (called CITE)
and asking it to perform 400-odd tests. Wash, rinse, repeat.
Oh, and the method name is TransactionResponse. What does it do? Well, it kind of does everything: it is the only method in the WFS pantheon that changes anything.
But let's start with what a WFS Transaction is supposed to process ... here is an outline of a Transaction Request:
<?xml version="1.0"?>
|
From this we can see a number of interesting details:
- Insert
- Delete
- Update
- Locking/Unlocking
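As a rough illustration of those four ingredients, here is a minimal sketch that parses a made-up WFS 1.0.0 Transaction request and lists its top-level elements. The topp:road type, its namespace URI, the fids and the lock id are all invented for the example, and the class TransactionOutline is hypothetical (it is not GeoServer's parser, just plain JAXP).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class TransactionOutline {
    // Illustrative request only: the feature type, fids and lock id are made up.
    static final String REQUEST =
        "<?xml version=\"1.0\"?>"
        + "<wfs:Transaction service=\"WFS\" version=\"1.0.0\" releaseAction=\"SOME\""
        + " xmlns:wfs=\"http://www.opengis.net/wfs\""
        + " xmlns:ogc=\"http://www.opengis.net/ogc\""
        + " xmlns:topp=\"http://www.example.org/topp\">"
        + "<wfs:LockId>lock-1</wfs:LockId>"
        + "<wfs:Insert><topp:road><topp:name>Main St</topp:name></topp:road></wfs:Insert>"
        + "<wfs:Update typeName=\"topp:road\">"
        + "<wfs:Property><wfs:Name>name</wfs:Name><wfs:Value>High St</wfs:Value></wfs:Property>"
        + "<ogc:Filter><ogc:FeatureId fid=\"road.1\"/></ogc:Filter>"
        + "</wfs:Update>"
        + "<wfs:Delete typeName=\"topp:road\">"
        + "<ogc:Filter><ogc:FeatureId fid=\"road.2\"/></ogc:Filter>"
        + "</wfs:Delete>"
        + "</wfs:Transaction>";

    /** Returns the local names of the elements directly under the root, in document order. */
    static List<String> elementNames(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> names = new ArrayList<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child instanceof Element) {
                    names.add(child.getLocalName());
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse request", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementNames(REQUEST));
    }
}
```

The elements come back in document order, which matters because a transaction's elements are executed in the order given.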
In response to this we need to return a TransactionResponse
document that lists the "Feature ID" of any features added ... and how
successful we were. Yes, that sounds messed up: we could be SUCCESS,
FAILED, or PARTIAL at the end of the day. I am sure PARTIAL is just a
quick hack; I am not sure why the standards body thought this one up, and
we went to a lot of trouble not to support this feature.
As long as we are going to trouble, we had some more features to consider:
- Writability of DataSources
- Restrictions on each of the kinds of operations
- Validation
You may note that the last entry, Validation, was the purpose of the
Validating Web Feature Server Project we were working on - so we had a vested interest in getting this method to work.
Validation is grouped into two areas of applicability:
- Feature Validation - per feature tests, can be applied before an insert, or after an update
- Integrity Validation - tests that take in the big picture, checked before committing all the changes off to the data sources
That is it, what we need to do, now for how we did it.
TransactionResponse - GeoServer in a Nutshell
Welcome to TransactionResponse, the only class you need to know
to "understand" GeoServer. First of all, let's see what this class has
to say for itself...
From TransactionResponse.java:
/**
|
Thanks Chris, you're a pal. Let's try again. GeoServer *is* well
documented, just wander up the class hierarchy and you will find
information somewhere.
Javadocs from Response:
|
The Response interface serves as a common denominator for all service
The work flow for this kind of objects is divided in two parts: the first is
Note: abort() will be called as part of error handling giving your
This is specially usefull for streamed responses such as wfs GetFeature or
|
Thanks Gabriel, note Gabriel is from Spain and actually has interesting letters in his name - he is not trying to be leet.
So let's explain what is going on:
- The Transaction class receives an HTTPRequest; its super class WFSService checks that the service is enabled and handles errors, and the super class AbstractService is an HttpServlet and does magic.
- Magic sounds like a framework, the GeoServer framework; let's look at what AbstractService does for you:
- get a Request reader
- ask the RequestReader for the Request object
- initialize the resulting Request object with the ServletRequest
- get the appropriate ResponseHandler for the Request
- set the http response's content type
- write to the http response's output stream
- call Response cleanup
- if anything goes wrong produce a ServiceException and write it out instead
- The fun thing is GeoServer has a couple of "strategies" for doing
this (SPEED, FILE, BUFFER), each trading raw speed against how much of the response is buffered before it is sent.
So Gabriel's javadocs do make sense: we need to be able to execute the Transaction operation, indicate the contentType, and finally produce a TransactionDocument so we can writeTo the provided OutputStream.
Just in case you were wondering, a new TransactionResponse object is created for each request - we don't have to worry about concurrent access to this object.
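The execute / getContentType / writeTo / abort workflow described above can be sketched in a few lines. This is a minimal sketch under stated assumptions: SketchResponse, EchoResponse and service() are hypothetical names invented here, the request is reduced to a String, and the real GeoServer Response interface takes different parameters.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class ResponseWorkflowSketch {
    /** Hypothetical stand-in for the Response contract (not the GeoServer interface). */
    interface SketchResponse {
        void execute(String request) throws Exception;
        String getContentType();
        void writeTo(OutputStream out) throws IOException;
        void abort();
    }

    /** Toy response that echoes the request back as XML. */
    static class EchoResponse implements SketchResponse {
        private String result;
        boolean aborted;

        public void execute(String request) throws Exception {
            if (request.isEmpty()) {
                throw new IllegalArgumentException("empty request");
            }
            result = "<ok>" + request + "</ok>";
        }

        public String getContentType() {
            return "text/xml";
        }

        public void writeTo(OutputStream out) throws IOException {
            out.write(result.getBytes(StandardCharsets.UTF_8));
        }

        public void abort() {
            aborted = true;
            result = null;
        }
    }

    /** Drives one request: execute first, write second, abort if anything goes wrong. */
    static String service(SketchResponse response, String request) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            response.execute(request);                 // part one: do the work
            String type = response.getContentType();   // known only after execute
            response.writeTo(out);                     // part two: produce the document
            return type + ": " + new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (Exception e) {
            response.abort();                          // give the response a chance to clean up
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(service(new EchoResponse(), "hello"));
        System.out.println(service(new EchoResponse(), ""));
    }
}
```

Because this sketch writes into a byte array before anything is returned, it behaves like a buffering strategy: a failure never leaves a half-written document behind.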
TransactionResponse.execute
To start with, let's have a look at the method signature:
protected void execute(Request transactionRequest)
|
The TransactionRequest contains the information (the Transaction
Servlet produced this object from the HttpRequest for us). We will get
back to this object later.
Exceptions during Response Handling
Let's consider the exceptions:
- ServiceException - this is an Exception that knows how to write
itself out to the output stream as a proper WFSServiceException
document. That is about all I know, the javadocs still contain my
best guess of what the parameters mean.
- WfsException - this is an Exception that knows how to write itself out as a TransactionResponse with success of type FAILED
So let's consider when to use these two: ServiceException should be
used when it is GeoServer's fault that things are bad (perhaps the
database is down?), WfsException should be used when it is the user's
fault that things are bad (perhaps they are trying to modify a Feature
that is locked?).
The ability to throw exceptions, and get out of normal processing when things went bad, really helped make this code clear.
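To make the "whose fault is it" rule concrete, here is a minimal sketch with two hypothetical exception classes. SketchServiceException and SketchWfsException are stand-ins invented for this example; the real GeoServer classes have richer constructors and know how to write themselves out as documents.

```java
class ExceptionChoiceSketch {
    /** Hypothetical stand-in: something is wrong on the server side. */
    static class SketchServiceException extends Exception {
        SketchServiceException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in: the request itself cannot be honoured. */
    static class SketchWfsException extends Exception {
        SketchWfsException(String message) {
            super(message);
        }
    }

    /** Pretend delete that fails for one of two very different reasons. */
    static void delete(boolean databaseUp, boolean lockedByOther)
            throws SketchServiceException, SketchWfsException {
        if (!databaseUp) {
            throw new SketchServiceException("database unavailable");  // our fault
        }
        if (lockedByOther) {
            throw new SketchWfsException("feature is locked");          // the user's fault
        }
    }

    /** Maps each kind of failure onto the kind of document the client would receive. */
    static String handle(boolean databaseUp, boolean lockedByOther) {
        try {
            delete(databaseUp, lockedByOther);
            return "TransactionResponse SUCCESS";
        } catch (SketchWfsException e) {
            return "TransactionResponse FAILED: " + e.getMessage();
        } catch (SketchServiceException e) {
            return "ServiceException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(true, false));
        System.out.println(handle(true, true));
        System.out.println(handle(false, false));
    }
}
```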
Finding the GeoServer Module
And now for the implementation:
if (!(request instanceof TransactionRequest)) {
|
So we ensure that this really is a TransactionRequest; stranger things have happened, and good error messages do pay off.
The "service level" check warrants further study, and some
background information. GeoServer is arranged as a number of
independent "modules". These modules float around in the
ApplicationContainer knowing as little about one another as possible.
Struts is currently used to set up these modules and dump them in the
servlet container in an orderly fashion.
The modules are:
- Data - holds all the fun data stuff, in uDig this would be the "LocalCatalog"
- WFS - care and feeding of the Web Feature Server servlets and configuration
- WMS - same idea for the Web Map Server
The interesting thing is that each module can be run against a different set of data; indeed you could load up a couple of WFS modules
with different configuration, and data access, on a user by user basis.
Back to reality, the Request must be consulted to locate the modules because the request is aware of the ApplicationContainer.
The WFS.getServiceLevel() is a magic integer bitmask that indicates
what WFS operations are "turned on" - the bit WFSDTO.TRANSACTIONAL
indicates that GeoServer is configured as a WFS-T.
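A bitmask check of this kind might look like the following minimal sketch. The names TRANSACTIONAL and SERVICE_DELETE come from the article; the other constant names and every numeric value here are assumptions made for illustration and should not be read as the actual WFSDTO definitions.

```java
class ServiceLevelSketch {
    // Assumed bit values, one bit per operation (illustrative only).
    static final int SERVICE_INSERT = 1;
    static final int SERVICE_UPDATE = 2;
    static final int SERVICE_DELETE = 4;
    static final int SERVICE_LOCKING = 8;

    // Assumed composition: "transactional" means insert, update and delete are all on.
    static final int TRANSACTIONAL = SERVICE_INSERT | SERVICE_UPDATE | SERVICE_DELETE;

    /** True only when every bit in 'required' is switched on in 'serviceLevel'. */
    static boolean allows(int serviceLevel, int required) {
        return (serviceLevel & required) == required;
    }

    public static void main(String[] args) {
        int configured = TRANSACTIONAL;
        System.out.println("delete allowed:  " + allows(configured, SERVICE_DELETE));
        System.out.println("locking allowed: " + allows(configured, SERVICE_LOCKING));
    }
}
```

Comparing against `required` (rather than testing for a non-zero result) matters when the required mask has several bits set: a server that only allows inserts should not pass a check for TRANSACTIONAL.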
Reading between the lines you can recognize that configuration is
handled via Data Transfer Objects (DTO) that are written to and
from your GeoServer data directory as XML. We hope to be able to store
these objects with a configuration service such as JMX.
Now that the sanity checks are out of the way let's do some work:
//REVISIT: this should maybe integrate with the other exception
|
As you can see, we believe in commenting code out, with long
explanations, rather than trusting version control. As for
getting to work, I lied, we isolated work into the ...
800 line method of Doom
And here we are:
protected void execute(TransactionRequest transactionRequest) throws ServiceException, WfsException { ... }
Now we are sure that a) We have a TransactionRequest and that b) we passed configuration checks to get here.
Let's see what the javadocs have to say for themselves:
|
That is better than a poke in the eye with a sharp stick, it even has
documented some member variables that will be used to communicate with writeTo
when generating a TransactionResponse document. We will discuss the
member variables as they are encountered in the implementation.
Setting up
Well to start out with we need a couple of things:
request = transactionRequest; // preserved toWrite() handle access
|
We are saving the request for later. Each request has an optional
"handle" that may be specified by the user. This handle is supposed to
be used for error reporting. On the off chance we error out during
writeTo() we need to report this information.
GeoTools offers cross data store
transaction support. This was constructed explicitly for the GeoServer
application and this method. To make use of the facility we will need
to make ourselves a transaction; the default implementation will
work just fine.
It is a little known fact that the default constructor
for DefaultTransaction constructs a stack trace and uses it to
determine what method started the transaction. This is used during
error reporting to make your life easier. DefaultTransaction also
supports a constructor where a handle is provided, so we really should be using the handle provided by the user.
TODO: transaction = request.getHandle() == null ? new DefaultTransaction() : new DefaultTransaction( request.getHandle() );
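The stack trace trick is easy to reproduce in plain Java. The following is a minimal sketch of the idea only: SketchTransaction and open() are hypothetical names, and nothing here is the GeoTools DefaultTransaction implementation.

```java
class CallerHandleSketch {
    /** Hypothetical transaction that remembers who opened it (for error reporting). */
    static class SketchTransaction {
        final String handle;

        /** No handle supplied: derive one from the method that called "new". */
        SketchTransaction() {
            // Frame 0 is this constructor; frame 1 is the code that invoked it.
            StackTraceElement caller = new Throwable().getStackTrace()[1];
            this.handle = caller.getClassName() + "." + caller.getMethodName() + "()";
        }

        /** Handle supplied by the user: use it as-is. */
        SketchTransaction(String handle) {
            this.handle = handle;
        }
    }

    /** Mirrors the TODO above: prefer the user's handle when there is one. */
    static SketchTransaction open(String userHandle) {
        return userHandle == null
            ? new SketchTransaction()
            : new SketchTransaction(userHandle);
    }

    public static void main(String[] args) {
        System.out.println(open(null).handle);
        System.out.println(open("tx-42").handle);
    }
}
```

With no user handle, the derived handle names the open() method, since that is where the constructor was invoked.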
Finally GeoServer makes use of the java.util.logging facilities, we
used Log4j for the longest time and you may still find some references
in the codebase.
Data Access
You already know about the different modules, to make this literal the module we want is called Data:
Data catalog = transactionRequest.getWFS().getData();
|
The difference between literal programming and literate programming is
almost the point of object oriented programming. And my spelling was
never that good...
WfsTransResponse
You may remember that writeTo needs to send out a document later in the day. This object is going to "collect" all the information needed.
WfsTransResponse build = new WfsTransResponse(WfsTransResponse.SUCCESS,
|
This brings up another interesting configuration idea, GeoServer supports a verbose mode where it gets extra happy with reporting of details.
The happiness comes from including stack traces as part of a ServiceException document.
Minding the (Data) Stores
Okay this time we really will hunt down some data. GeoTools makes use of a high-level data access API called FeatureSource.
Rather than including both read and write methods in this API (and
having half of them throw exceptions when writing is not an option), we
have broken the idea up into two classes. The FeatureStore interface extends FeatureSource with methods required for data modification.
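The read/write split can be shown with two tiny interfaces. This is a minimal sketch, assuming features are reduced to Strings; SketchSource and SketchStore are hypothetical stand-ins for the FeatureSource / FeatureStore idea, not the GeoTools interfaces themselves.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ReadWriteSplitSketch {
    /** Hypothetical read-only view of some data. */
    interface SketchSource {
        List<String> features();
    }

    /** Hypothetical writable extension: only implemented when writing is possible. */
    interface SketchStore extends SketchSource {
        void add(String feature);
    }

    /** A source that can only be read, e.g. a file we have no permission to modify. */
    static class ReadOnlyFile implements SketchSource {
        public List<String> features() {
            return Collections.singletonList("road.1");
        }
    }

    /** A source that also accepts modifications. */
    static class Table implements SketchStore {
        private final List<String> rows = new ArrayList<>();

        public List<String> features() {
            return rows;
        }

        public void add(String feature) {
            rows.add(feature);
        }
    }

    /** Writes only when the source is a store; no "unsupported" exception is needed. */
    static boolean tryInsert(SketchSource source, String feature) {
        if (source instanceof SketchStore) {
            ((SketchStore) source).add(feature);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(tryInsert(new Table(), "road.2"));
        System.out.println(tryInsert(new ReadOnlyFile(), "road.2"));
    }
}
```

The instanceof test is the whole point of the design: whether a source is writable is visible in its type, so callers can decide up front instead of catching an exception halfway through.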
First some book keeping allowing us to validate, and clean up, on the off chance we accomplish something.
//
|
So we have two maps, and two ways of referring to a FeatureStore:
- typeName - this is the name of the FeatureType,
all Features produced by the FeatureSource will conform to a schema
with this typeName. The typeName is used (along with a namespace) when
writing the content out as XML.
- typeRef - typeRef is something I made up combining the two bits of information we need to locate a FeatureSource. Inside the Data module information is organized into DataStore objects, each of which is given a dataStoreId. You can use a DataStore to obtain a FeatureSource if you have a typeName.
Okay, you caught me out: the above also talks about the GeoTools DataStore class. DataStore is
the low-level data access API, providing access to an entire database
(or a file) at a time. It allows for low-level feature by feature
access; it also provides a list of typeNames for the content it knows about.
Rule of thumb:
- DataStore == a database, or a shapefile
- FeatureSource / FeatureStore == a table, or the contents of a shapefile
Note that because FeatureStore
is a high-level API it is much easier to use, and optimized for common
activities - often generating direct SQL statements rather than
dragging everything into Java for processing.
PreProcessing TransactionRequest
Now it is time to start picking apart our transaction request with the following goal in mind:
- Figure out what FeatureStores are going to be modified by this operation
- Set them up to work on our transaction
Tally-ho:
// Gather FeatureStores required by Transaction Elements
|
In order to proceed we are going to need two things: the typeName and the FeatureTypeInfo (the elementName is just used for error reporting).
So now it is time to explain about FeatureTypeInfo -
this class captures everything that GeoServer knows about the data.
Both the information on how to connect to the data, and the
configuration supplied by the user. This information goes beyond what
can be determined by inspecting the data source itself (for
example GeoServer allows you to configure a global bounding box
for the data).
This metadata information is really simple, based roughly on Dublin
Core. It is mostly enough information to generate the capabilities
document. We have ported this information to GeoTools and uDig. The
latest implementation of this idea is part of the GeoTools catalog
package. GeoServer is still using the original at this time.
Processing an InsertRequest:
if (element instanceof InsertRequest) {
|
Here we have a bit of a gap in GeoServer right now. Our ability to
parse GML during a TransactionInsert is based on a SAX parser and does
a pretty blind job of it (not taking the known FeatureType information
into account). The GeoTools FeatureType construct maintains a concept of "namespace", and the DataStores keep track of the FeatureTypes known to them. You can see us doing a lookup in the catalog to determine how we would
write out content of this type, and modifying the original element to agree with our assumptions.
The other two request types are easier, no GML needs be harmed:
else {
    // Option 2: lookup based on elementName (assume prefix:typeName)
    typeRef = null; // unknown at this time
    elementName = element.getTypeName();
    if( stores.containsKey( elementName )) {
        LOGGER.finer("FeatureSource '"+elementName+"' already loaded." );
        continue;
    }
    LOGGER.fine("Locating FeatureSource '"+elementName+"'...");
    meta = catalog.getFeatureTypeInfo(elementName);
    element.setTypeName( meta.getNameSpace().getPrefix()+":"+meta.getTypeName() );
}
Here we can just figure out what type is referenced. We do ignore
the prefix information - and at home we won't have a conflict. Once
again we mangle the element to agree with our concept of prefix.
And now the fun stuff finding the data:
typeRef = meta.getDataStoreInfo().getId()+":"+meta.getTypeName();
|
Now that we have located the correct FeatureTypeInfo (aka meta) we can
figure out a typeRef and elementName to use. After a quick check to see
if we have already located it - we can start the lookup process.
A helper method of FeatureTypeInfo actually cuts to the chase -
getFeatureSource() will create us a new FeatureSource all set up and
ready to go. We now have a couple of checks. If the FeatureSource is
not available (IOException) or not writable (not an instanceof FeatureStore)
we need to throw a WfsTransactionException to let the user know.
We can then arrange the feature source into our two book keeping maps for later use.
(Un)Locking
The WFS specification has an interesting locking system. It actually
represents a compromise between "strong transaction support" (that
lasts between sessions), and something simple enough to be implemented.
// provide authorization for transaction
|
Yes, the entire method goes modal from here on
out. If authorizationID != null we are dealing with locks. What do we
do with the authorizationID? We feed it to the transaction and stand
back and enjoy the show (the various DataStores will check for this
authorization ID as needed).
The above does contain a small mistake: these are long term transaction
locks that are not always maintained by GeoServer. If GeoServer is
restarted it will not have a memory of locks already in use by an
external database (indeed the lock may have been obtained with an
application other than GeoServer).
TODO: if (!catalog.lockExists(authorizationID)) { LOGGER.warning( "Not locked by this instance of GeoServer" ); }
Of course we will wait for a bug report to come in on this one.
Transaction Processing
Now that we have all the setup we could ever imagine, it is time to get down to processing the individual elements:
// execute elements in order,
|
Once again we have brought together all the information needed - this
time for an individual element. Of interest is obtaining the local
variable store via a lookup to the FeatureStores we already collected during preprocessing.
DeleteRequest Element
Let's commence with the ceremonial sanity checks:
if (element instanceof DeleteRequest) {
|
After checking the WFS configuration to ensure that this user is
authorized to perform SERVICE_DELETE we can set ourselves up
with a DeleteRequest. The WFS specification requires that a Filter be
provided; since we are not validating against the schema, we will need
to explicitly check for this ourselves - producing a ServiceException
when in error.
PreProcessing Validation Hints
Now that things are getting serious (with real data access) we are
going to break out a try/catch block as IOExceptions become a fact of
life:
try {
|
In this initial stretching exercise we are trying to figure out what
area will be damaged (i.e. modified) by the delete operation.
This information is gathered so we can limit the scope of any validation
checks to be performed after the fact. We need to do this check before
the change takes place (because afterwards the content will not be
there to check).
TODO: If no validation is needed we could skip this pass through the data!
Delete with ReleaseAction.SOME
Moving on - we get the first of our Lock checks:
if ((request.getLockId() != null)
|
Then we are due for some dream time:
// TODO: Revisit Lock/Delete interaction in gt2
|
This is the way the game is meant to be played: simple, direct, readable - but wrong.
And now for reality:
else {
|
Reality is a dark and gloomy place. To start with we backtrack up to the DataStore and get a low-level FeatureWriter. The FeatureWriter interface works like an Iterator that throws IOExceptions. Like a ListIterator it allows content to be remove()ed
- hence our interest. Since FeatureWriter throws IOExceptions we always
need to make use of a try/finally block or risk disaster.
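The shape of that loop is worth seeing once. Below is a minimal sketch using a hypothetical SketchWriter interface over a list of feature ids; it imitates the "Iterator that throws IOException" idea and is not the GeoTools FeatureWriter API.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class WriterLoopSketch {
    /** Hypothetical writer: iterator-like, but every call may throw IOException. */
    interface SketchWriter extends Closeable {
        boolean hasNext() throws IOException;
        String next() throws IOException;
        void remove() throws IOException;
    }

    /** In-memory implementation backed by a list of feature ids. */
    static class ListWriter implements SketchWriter {
        private final Iterator<String> rows;
        boolean closed;

        ListWriter(List<String> rows) {
            this.rows = rows.iterator();
        }

        public boolean hasNext() { return rows.hasNext(); }
        public String next() { return rows.next(); }
        public void remove() { rows.remove(); }
        public void close() { closed = true; }
    }

    /** Removes every id starting with the prefix and reports which ids were removed. */
    static List<String> removeMatching(SketchWriter writer, String prefix) throws IOException {
        List<String> removed = new ArrayList<>();
        try {
            while (writer.hasNext()) {
                String fid = writer.next();
                if (fid.startsWith(prefix)) {
                    writer.remove();
                    removed.add(fid);
                }
            }
        } finally {
            writer.close();  // always release the writer, even if an IOException escaped
        }
        return removed;
    }

    /** Small driver so the sketch can be run without handling checked exceptions. */
    static String demo() {
        List<String> rows = new ArrayList<>(Arrays.asList("road.1", "lake.1", "road.2"));
        ListWriter writer = new ListWriter(rows);
        try {
            List<String> removed = removeMatching(writer, "road.");
            return "removed=" + removed + " left=" + rows + " closed=" + writer.closed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In current Java a try-with-resources statement would do the same job as the explicit finally block; the explicit form is kept here because it matches the article's wording.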
TODO: Figure out why store.removeFeatures( filter ) is there, if FeatureWriter is doing its job this line would not do anything
Normal Delete (or Delete with ReleaseAction ALL )
This is much easier:
else {
|
Yes, working without locks is much easier; concurrency always has a price.
Delete Element Cleanup
A little bit of book keeping and we are done:
envelope.expandToInclude(damaged);
|
We expand the member field envelope to include the area damaged by this element. The envelope will be used to limit validation checking later.
So by the time we are done, the content has been removed (but the
transaction is not committed yet). We have recorded an envelope describing
where the change occurred. If we were doing ReleaseAction.SOME we have
carefully released locks on only those features actually deleted.
Insert Element Processing
Processing the insert element is technically the most risky proposition
- because it involves parsing GML content. While that does represent
plenty of interesting challenges - it is not the subject of this
article. For more information please look at how the
TransactionRequest object is constructed.
Ritual security check ensues:
if (element instanceof InsertRequest) {
|
No surprises there, with a little try/catch block we can move onto the real work:
try {
|
The GML content has already been parsed into a FeatureCollection for
us; DataUtilities can adapt this to a FeatureReader for later
use.
Now we can look up enough information to perform our first validation check:
// Need to use the namespace here for the lookup, due to our weird
|
The featureValidation( dataStoreId, schema, collection ) method will
figure out what validation tests can be run right away. The content is
checked before making it anywhere near the data source!
Now we can finally get down to inserting the features:
Set fids = store.addFeatures(reader);
|
The GeoTools addFeatures method
will return a Set of the FeatureIds of the newly created features. This
is not quite ideal - it would have been kind to return them in order with
a List. This information is important as we need it in order to
create our TransactionResponse document; the local variable build is gathering up this information for later. Finally we maintain that envelope for use in later checks.
TODO: Use a List of FeatureIds so response is returned in order of creation.
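What the TODO is asking for can be sketched as follows: if the store hands back ids as a List in insertion order, each id can be paired with the feature that produced it. SketchStore and its id scheme ("road." plus a counter) are hypothetical and only illustrate the ordering argument.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OrderedFidsSketch {
    /** Hypothetical store that generates one id per inserted feature. */
    static class SketchStore {
        private int next = 1;

        /** Returns the generated ids in the same order as the features passed in. */
        List<String> addFeatures(List<String> features) {
            List<String> fids = new ArrayList<>();
            for (int i = 0; i < features.size(); i++) {
                fids.add("road." + next);
                next++;
            }
            return fids;
        }
    }

    /** Pairs each inserted feature with its new id, relying on the shared ordering. */
    static List<String> insertResults(List<String> features) {
        List<String> fids = new SketchStore().addFeatures(features);
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < features.size(); i++) {
            lines.add(fids.get(i) + " <- " + features.get(i));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : insertResults(Arrays.asList("Main St", "High St", "Low Rd"))) {
            System.out.println(line);
        }
    }
}
```

With a plain Set there is no index to pair on, so a client inserting several features at once could not tell which new id belongs to which feature.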
All in all this is more straightforward than deleting.
Update Element Processing
So what do we do when we have the complexities of checking locks, along
with the joy of parsing? The answer is contained in the depths of
processing the update element.
Ritualistic security check (almost makes me wish for Aspects - hint hint):
if (element instanceof UpdateRequest) {
|
PreProcessing Validation Hints
Now we can start by gathering up the information needed to make a query:
try {
|
And yes, Query was meant in the literal DefaultQuery sort of way: we are only requesting the values that are going to get modified.
Why would we do this? Because we are going to remember the bounds for later, and also which exact features were harmed:
// Pass through data to collect fids and damaged region
|
This is a straightforward pass through the data.
TODO: If no validation is needed we could skip this.
Update Features
We can now proceed with the update; since the high-level FeatureSource
API was created with this method in mind the process is
straightforward:
|
We even snuck in an optimization when only a single property is
updated. And then we run into a surprise - throwing a
WfsException is supposed to be sufficient. There should be no need to
construct a special WfsTransResponse yourself.
TODO: throw new WfsException( e ) and flush out the bugs that must have prevented cite tests from passing
Unlocking The Modified Features
A bit more fun here: we need to unlock the modified features if the ReleaseAction is SOME:
if ((request.getLockId() != null)
|
We have what looks to be another obscure bug (anything w/ locking is
obscure). The problem: filter may not return the exact same
features as before the modification was made.
TODO: Construct a new FidFilter from fids (ie the list we made for validation checking)
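The bug is easier to see with a toy table. In this minimal sketch a "filter" is just a match on a status value and the update changes that very value, so re-running the filter afterwards finds nothing, while the remembered fids still identify the rows. All names here (FidFilterSketch, matching, the status values) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class FidFilterSketch {
    /** Stand-in for evaluating an attribute filter: returns the fids whose status matches. */
    static Set<String> matching(Map<String, String> table, String status) {
        Set<String> fids = new LinkedHashSet<>();
        for (Map.Entry<String, String> row : table.entrySet()) {
            if (row.getValue().equals(status)) {
                fids.add(row.getKey());
            }
        }
        return fids;
    }

    static String demo() {
        Map<String, String> table = new LinkedHashMap<>();
        table.put("road.1", "draft");
        table.put("road.2", "draft");
        table.put("road.3", "final");

        // Before the update: remember exactly which features the filter selected.
        Set<String> remembered = matching(table, "draft");

        // The update modifies the attribute the filter depends on.
        for (String fid : remembered) {
            table.put(fid, "final");
        }

        // After the update: the same filter no longer selects the modified features.
        Set<String> rerun = matching(table, "draft");
        return "remembered=" + remembered + " rerun=" + rerun;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Unlocking by the remembered ids releases road.1 and road.2; unlocking by re-running the filter would release nothing and leave both features locked.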
Validation on Updated Content
Now that we have modified some features we may as well check if they are any good:
// Post process - check features for changed boundary and
|
This time around we can see a FidFilter being created; we construct a
FeatureCollection in the usual manner and send it off to the
featureValidation method for review. It should be noted that the
featureValidation method will happily throw IOException if somebody is
not behaving well, causing the Transaction to be rolled back and the
data left in a consistent state.
Cleaning up after the Update
A little house keeping and we are done:
} catch (IOException ioException) {
|
// All opperations have worked thus far
|
After the validation check we set the field response to the build object we have been carefully constructing.
Writing out the TransactionDocument
Okay, you have survived the 800 line method of doom, so why am I still
talking? Because our result has not been a) committed or b) sent off to
the client. I hate cliff hanger endings and there is a lot of data
lurking on the edge at this point in the story.
If you were paying attention to the saga of AbstractService, we
have one more responsibility before we get around to writing out content:
public String getContentType(GeoServer gs) {
|
That is right, the content type is completely defined by the configuration.
writeTo
And now for the good stuff:
/**
|
So the writeTo method gets to commit the content, generate the Transaction Document, and release any locks.
May as well get started:
if ((transaction == null) || (response == null)) {
|
If we have not started yet (i.e. execute has not been called), or
if some developer is trying to support the PARTIAL response type, it is
time to die.
Writing is straightforward:
try {
|
Good thing that response knows how to write itself out.
Now we get to the heart of the application:
switch (response.status) {
|
I am glad to see so much work looking so simple at the end of things - great work everyone!
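Since the body of that switch did not survive on this page, here is a minimal sketch of the general idea rather than the original code: commit when the collected status is SUCCESS, roll back when it is FAILED, and refuse PARTIAL. The Status enum and SketchTransaction class are hypothetical stand-ins, and the exact handling of each case is an assumption.

```java
class CommitOnWriteSketch {
    /** Hypothetical status values mirroring the three outcomes named in the article. */
    enum Status { SUCCESS, FAILED, PARTIAL }

    /** Hypothetical transaction that just records what happened to it. */
    static class SketchTransaction {
        String outcome = "open";

        void commit() {
            outcome = "committed";
        }

        void rollback() {
            outcome = "rolled back";
        }
    }

    /** Resolves the transaction according to the status gathered during execute. */
    static String finish(Status status, SketchTransaction transaction) {
        switch (status) {
            case SUCCESS:
                transaction.commit();    // every element worked: keep the changes
                break;
            case FAILED:
                transaction.rollback();  // something went wrong: leave the data untouched
                break;
            case PARTIAL:
                throw new IllegalStateException("PARTIAL is not supported");
        }
        return transaction.outcome;
    }

    public static void main(String[] args) {
        System.out.println(finish(Status.SUCCESS, new SketchTransaction()));
        System.out.println(finish(Status.FAILED, new SketchTransaction()));
    }
}
```

Keeping commit and rollback in one place, at the very end, is what lets the 800 lines before it bail out with an exception at any point without leaving half a change behind.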
A little bit of error wrangling and we are done:
} catch (IOException ioException) {
|
Un Locking
Okay we are not quite done, here is a little bit of Lock cleanup.
//
|
We are forced to ask the Data module for enough information to clean up
after locks - a single Lock may be used on more than one DataStore. The
DataModule is the only class that "knows" about the DataStores.
This facility is available through the GeoTools





