
Freebase: free-wheelin' semantic web

Posted by davidvc on March 9, 2007 at 6:49 PM PST

We all love metatagging sites like Flickr because they let us organize our data however we want.

But if I want to write a program that lets me query, filter, and sort that data in a controlled way, à la SQL, metatagging makes that pretty difficult. That's why the Semantic Web is so attractive.
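To make the contrast concrete, here is a minimal sketch. The photo names, tags, and the `photo` table are all made up for illustration and have nothing to do with how Flickr or any real site stores its data. With tags, about all you can ask is "does this item carry this label?"; with typed, named fields you can filter on a range and sort.

```python
import sqlite3

# Hypothetical tagged items: free-form labels with no field names or types.
tagged_photos = {
    "sunset.jpg": {"beach", "2006", "hawaii"},
    "office.jpg": {"work", "2007"},
    "surf.jpg": {"beach", "2007", "hawaii"},
}

def photos_with_tag(tag):
    """Tag lookup: set membership is about all the data model supports."""
    return sorted(name for name, tags in tagged_photos.items() if tag in tags)

def photos_by_place_since(place, year):
    """The same data with a schema: filter on typed fields and sort the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE photo (name TEXT, place TEXT, year INTEGER)")
    conn.executemany(
        "INSERT INTO photo VALUES (?, ?, ?)",
        [
            ("sunset.jpg", "hawaii", 2006),
            ("office.jpg", "san jose", 2007),
            ("surf.jpg", "hawaii", 2007),
        ],
    )
    rows = conn.execute(
        "SELECT name, year FROM photo "
        "WHERE place = ? AND year >= ? ORDER BY year DESC",
        (place, year),
    ).fetchall()
    conn.close()
    return rows
```

Nothing in the tag set says that "2006" is a year rather than, say, a street number, which is why the range filter and the sort only work in the second version.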

The problem with the Semantic Web is that you have to build an ontology, and most of us can't be bothered, to be honest. And how do you share these ontologies, and evolve them, in a way that is accessible for as many users as possible? We want the flexibility and usability of metatagging with the programmatic power of a structured schema.

Enter Freebase. Right now it's in alpha, and you have to apply to play around with it. Maybe they'll let me in. But for now you can read Tim O'Reilly's blog to get a mouth-watering taste of what they're doing.

As a database guy who understands the power of SQL but also values the free-form, evolving nature of metatagging and a read-write web, I think this looks like a very good fit, and I definitely want to see more. If you combine this with an easy query-building interface, kind of like Yahoo! Pipes*, then this gets serious fast.

Reality check: we have to see how all this is going to scale. It is going to have to follow the basic principles of REST, for starters. There is the additional scaling challenge of handling joins, grouping, and so on, which traditionally require all the data to be in a single place before you can do the operation. When you are operating across multiple data sources, perhaps hundreds of them, this can be a real issue.
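Here is a small sketch of why joins are awkward across sources. The two `fetch_*` functions stand in for calls to two separate web data sources; the names and records are invented for illustration. The point is that the join cannot start until both result sets have arrived in one place.

```python
def fetch_people():
    """Hypothetical source A: pretend this is a call to one remote service."""
    return [
        {"person_id": 1, "name": "Ada"},
        {"person_id": 2, "name": "Linus"},
    ]

def fetch_films():
    """Hypothetical source B: pretend this is a call to a different service."""
    return [
        {"title": "Film A", "director_id": 2},
        {"title": "Film B", "director_id": 1},
        {"title": "Film C", "director_id": 3},
    ]

def join_directors(people, films):
    """A simple hash join: index one side, then probe it with the other.

    Both inputs must already be local, which is the scaling problem:
    every row from every source is shipped before any matching happens.
    """
    by_id = {p["person_id"]: p for p in people}
    return [
        (f["title"], by_id[f["director_id"]]["name"])
        for f in films
        if f["director_id"] in by_id
    ]
```

With two tiny lists this is trivial. With hundreds of sources and millions of rows, the fetch step dominates everything else.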

The general rule for improving client/server query performance is to keep the code as close to the data as possible. I can see people quickly realizing they need to create some form of "stored procedure" that can execute filtering and other logic against the data at the source, before all the data is shipped over the web. Salesforce realized this and recently extended their web service query language to support procedural logic.
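A rough sketch of the difference, using a made-up `RemoteSource` class that just counts how many rows it hands back. This is not Freebase's or Salesforce's API. A real service would accept a query expression or a stored procedure name rather than a Python function, so treat `fetch_where` as a stand-in for "run the filter where the data lives."

```python
class RemoteSource:
    """Stand-in for a remote data service; counts rows sent over the 'wire'."""

    def __init__(self, rows):
        self._rows = rows
        self.rows_shipped = 0

    def fetch_all(self):
        # Ships every row; the client has to do the filtering itself.
        self.rows_shipped += len(self._rows)
        return list(self._rows)

    def fetch_where(self, predicate):
        # Filters at the source; only matching rows are shipped.
        matches = [r for r in self._rows if predicate(r)]
        self.rows_shipped += len(matches)
        return matches

def make_source():
    """1000 hypothetical rows, one in ten tagged with region 'west'."""
    return RemoteSource(
        [{"id": i, "region": "west" if i % 10 == 0 else "east"} for i in range(1000)]
    )

def client_side_filter(source):
    return [r for r in source.fetch_all() if r["region"] == "west"]

def source_side_filter(source):
    return source.fetch_where(lambda r: r["region"] == "west")
```

Both functions return the same rows, but the first moves all 1000 across the network and the second moves only the matches.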

But I am hopeful we can solve these scaling challenges -- there are a lot of smart people out there, some backed with lots of money, who want to see this happen.

So, stay tuned. It looks like database application development is about to meet the web, and the combination will enable a completely new way to build applications.

*Well, something like it: Yahoo! Pipes does this for RSS feeds, not structured content.