Mapping Entities to REST - Learning from History
I'm working with a team that is working on providing tooling in NetBeans that lets you map database entities to REST resources. At first blush it looks like a very nice, simple mapping. REST is about resources. On each resource you can perform basic CRUD (Create, Read, Update, Delete) operations using the basic HTTP verbs POST, GET, PUT and DELETE.
So it seems pretty obvious: map each table in your schema to a REST resource. POST inserts a record into the table. GET with query parameters allows you to perform various types of SELECT operations. PUT does a SQL UPDATE, and DELETE does a SQL DELETE. Very nice, and now you can expose your data to web "mashup" applications.
There are some details on how you handle large data sets using pagination, how you identify individual records, how you handle conflicts, and so on. But this can be worked out, and the
Atom Publishing Protocol provides a great framework for a lot of this.
Well, not so fast...
I was working at Sybase in the late eighties. Sybase was taking off because it had introduced some revolutionary architectures and technologies that significantly improved performance and maintainability of database applications.
One major invention was procedural SQL and stored procedures. Stored procedures introduced the following key benefits:
- Your SQL was pre-compiled and stored in the server, so it did not need to be interpreted each time
- You could run complex procedural logic over your data in the same process as the database. This reduced the amount of data that was shipped to the client and the overall number of network round-trips you had to perform to accomplish a task.
- You could centralize business logic in the server. You didn't have to make sure each client enforced business rules consistently
- Stored procedures provide a layer between the database schema and the interface used by your applications. This allows you to modify the database schema (e.g. for performance optimizations) without breaking all your applications.
Not everyone uses stored procedures these days, but for a large reason that's because of the web architecture. This architecture gives you a natural place to have a "business logic" tier that provides many of the benefits of stored procedures, with the added benefit that you can write your logic in a language that was meant to be procedural, like Java or C++ or Ruby. Writing procedural logic in SQL (and I speak from experience) is a somewhat unnatural act, and what you end up with in general is not very easy to understand or maintain. These days, many businesses use stored procedures sparingly, and mostly to improve performance.
What does all this history have to do with mapping entities to REST? Well, if you're not careful, you completely short-circuit the business tier, and you can find yourself transported back to the early eighties:
- Web clients become responsible for pulling down the necessary data and performing procedural logic on them. That means more data going over the network and more network hops. I got an example of this from a colleague who is trying to work with the Flickr API:
Flickr has a frustrating API. Often I have to make 3 or 4 calls to flickr to get the data I want. This has led to several seconds worth of delay in the client, which makes for a frustrating user experience. We're doing all kinds of background loading and caching to try to work around the problem.
- If you want to perform an operation over two or more resources (tables) as a single unit of work, it's basically impossible. There are no transactional semantics for an HTTP client - each request needs to be its own unit of work. You could maybe devise a transactional API using cookies, but I wouldn't advise it.
- You have locked your database schema to your web interface when you do a direct mapping between entities and REST resources. This makes it very hard to migrate your database schema over time.
- Web clients become responsible for enforcing business rules, since you're going directly to your database schema. Given that you want to support third-party web clients from who knows where or whom, this is probably a bad idea. OK, a Very Bad Idea.
Basically, you have gone back to the pre-stored-procedure, pre-web days of database applications.
So, what do we do? Do we toss this idea of mapping entities to REST resources? Well, I'm not so sure. There is a lot of value in a tool that quickly exposes your database to the web via REST.
But this tool needs to let you, the user, inject your business logic in the web tier where it makes sense. You should be able to create a REST resource that rather than going directly to an entity invokes a method that runs procedural logic on top of the database. If you look at the new REST JSR, this is basically what this is about, and it looks very cool.
It could also be very useful to be able to map SQL views and stored procedures to REST resources. This is another way to accomplish the same thing: providing a layer of abstraction and control between your web clients and the database.
Your thoughts and feedback on this are much appreciated. A lot of people want to expose their data on their web, and many already are; it would be good to provide developers a set of best practices along with tools to make this possible and easy without leading to slow and brittle application architectures.