


David Van Couvering's Blog

February 2007 Archives


Multi-gig databases in Java DB? You Betcha

Posted by davidvc on February 26, 2007 at 03:46 PM | Permalink | Comments (0)

Working in NetBeans, I have noticed that Java DB is incredibly solid: it Just Works. It starts quickly, it runs well, it does everything you need it to do.

But in some ways I think this can actually mislead people into thinking "well, it's so small and easy to use, it must be just a developer database." Sometimes I wonder, maybe if Java DB required a big installation and lots of setup, people might tend to think it's a Real Database. You know, like all those other Enterprisey tools -- it must be enterprise ready, because it's so complicated and "rich" in functionality.

So I thought this email from Nurullah Akkaya on the derby-user list was quite telling. Nurullah switched from Oracle to Derby to handle a cluster of servers, each with its own instance of Derby running embedded and inserting a million rows a day. Each database instance is 20GB and growing. And he says that his system performance increased significantly after doing the switch. You can't do this with "just a developer database."

FON fun

Posted by davidvc on February 23, 2007 at 09:55 PM | Permalink | Comments (0)

Thanks to a tip from Simon, I was able to get in on the deal to get a free FON wireless router.

What is FON, you may ask? Well, it's kind of like a home exchange for network access points. If you use their router to set up your location as an access point for FON members, then you get to use any other FON member's access point absolutely free. If you're not a member, you can still use it for a nominal fee of $2 a day. You can also be a Linus (you give your bandwidth for free and you get free access to access points) or a Bill (you want to get a cut of the fees paid by an Alien, someone not in the FON network). I'm a Linus, of course :)

They have a great Google Map mashup that lets you search for access points. Just for kicks I looked in my neighborhood, and found at least a dozen within a mile of my house. I also checked out Prague, since I'll likely be going there soon, and again, at least a dozen in the downtown vicinity.

They are insidious and sneaky in a fun way, too. Check out this blog where they offer you a free FON router and a 50% cut of the fees if you live near a Starbucks.

Now, I suspect many of these access points are at people's homes. So what does that mean? Do I have to sit in my car or on the sidewalk sucking bits like a homeless net-vampire? Ah, well. At least, when I just have to send out that email or post that blog, I know I'll always be able to find a friendly signal in my neighborhood.

Now that's a PowerPoint Presentation

Posted by davidvc on February 22, 2007 at 12:57 PM | Permalink | Comments (1)

This was shown to all of us by Jim Bisso at our Visual Web engineering meeting today. An excellent parody of PowerPoint marketing, and great in its own right.



As If Keeping Up Wasn't Hard Enough: Enter Parallelism

Posted by davidvc on February 20, 2007 at 11:53 AM | Permalink | Comments (1)

I should have seen it coming. The company I work for is putting a lot of energy behind multi-core CPUs. Scaling on multi-core chips is becoming more and more important. But do I think about building a program so that it is highly parallel? No. I have enough to think about.

But there are signs that I need to start thinking about this. And if you aren't, you probably should.

Sigh. I'm just starting to get my hands around AJAX and REST and Ruby on Rails and meta-tagging and the Semantic Web, and, and... Now this. Can I go home now?

The Web as a Database?

Posted by davidvc on February 14, 2007 at 08:39 PM | Permalink | Comments (2)

Alex Iskold talks about how Yahoo! Pipes enables us to use the web as a database, showing the similarities between structured queries over the relational model and structured queries over RSS/Atom feeds.

The Semantic Web folks have a very similar vision, but the power of Yahoo! Pipes is that it takes advantage of an existing web standard (RSS/Atom) and does not try to impose yet another meta-model on top of what already exists. In this way it has the same advantage that REST does over WS-*. And of course you sacrifice exactness and completeness with this approach, but it's simple and it works.

I played with Yahoo! Pipes. I think it's a great vision and a great model. My own particular pipe, an aggregation of references to me in blogs and elsewhere, had no end of difficulties. I tried to filter out my own blogs, but that was almost all I saw. I tried to filter out other members of my family, no luck. I tried to order by date, and the stuff came in random order. Other people are much more successful, so I'm willing to chalk this up to my own "not getting it." But to me it's a sign that it doesn't have the same approachability as throwing together a web page in DreamWeaver.

But let's put that aside for now, and look at the vision. It's a very cool vision. Feeds in, feeds out, and then string them together, applying various operators and a little bit of looping and flow control, doing it all visually. If you've ever looked at a query tree, it has a very similar model. Tuples in, tuples out, and various relational operators being applied at each node, either joining or sorting or filtering.
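To make the query-tree analogy concrete, here's a minimal sketch in Java. It is not how Yahoo! Pipes is implemented; the `FeedEntry` record and the `filter`, `sortBy` and `union` helpers are names I made up for illustration. Each operator takes entries in and hands entries out, so the nodes can be strung together the same way relational operators are.

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;
import java.util.stream.Stream;

// Hypothetical feed entry; a real RSS/Atom entry carries more fields.
record FeedEntry(String title, String source) {}

final class QueryTreeSketch {
    // Filtering node: entries in, the matching entries out.
    static UnaryOperator<List<FeedEntry>> filter(Predicate<FeedEntry> keep) {
        return in -> in.stream().filter(keep).toList();
    }

    // Sorting node: entries in, the same entries out in a new order.
    static UnaryOperator<List<FeedEntry>> sortBy(Comparator<FeedEntry> order) {
        return in -> in.stream().sorted(order).toList();
    }

    // Union node: merges two inputs into one output.
    static List<FeedEntry> union(List<FeedEntry> left, List<FeedEntry> right) {
        return Stream.concat(left.stream(), right.stream()).toList();
    }

    public static void main(String[] args) {
        List<FeedEntry> a = List.of(new FeedEntry("Pipes", "feedA"),
                                    new FeedEntry("Derby", "feedA"));
        List<FeedEntry> b = List.of(new FeedEntry("REST", "feedB"));

        // String the nodes together: union, then filter, then sort.
        List<FeedEntry> out = filter((FeedEntry e) -> !e.title().equals("Derby"))
                .andThen(sortBy(Comparator.comparing(FeedEntry::title)))
                .apply(union(a, b));

        out.forEach(e -> System.out.println(e.title() + " from " + e.source()));
    }
}
```

Because every node has the same shape, adding another operator is just one more `andThen` in the chain.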

I can envision taking my internal relational data that I want to make available to the Web, and delivering it as an Atom feed, using the REST model where each domain entity (a database table or view) maps to a URI, a web resource. Then I could write some pipes that provide useful views into that data. Hm, that shouldn't be too hard to do - I may go try and pull that together...
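Here's a rough sketch of what that mapping could look like. Everything specific in it is an assumption for illustration: the `http://example.com/data` base URI, the `customers` table, and the `id` and `name` columns. The rows are passed in as plain maps rather than read from a database, and the feed leaves out elements (such as `author`) that a complete Atom feed would need.

```java
import java.util.List;
import java.util.Map;

final class AtomFeedSketch {
    // Minimal XML escaping for text content and attribute values.
    static String esc(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // The table is a resource at baseUri/table; each row is an entry
    // at baseUri/table/{id}. Row ids are assumed to be URI-safe.
    static String toAtom(String baseUri, String table,
                         List<Map<String, String>> rows, String updated) {
        String feedUri = baseUri + "/" + table;
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"utf-8\"?>\n")
           .append("<feed xmlns=\"http://www.w3.org/2005/Atom\">\n")
           .append("  <title>").append(esc(table)).append("</title>\n")
           .append("  <id>").append(esc(feedUri)).append("</id>\n")
           .append("  <updated>").append(esc(updated)).append("</updated>\n");
        for (Map<String, String> row : rows) {
            String rowUri = feedUri + "/" + row.get("id");
            xml.append("  <entry>\n")
               .append("    <title>").append(esc(row.get("name"))).append("</title>\n")
               .append("    <id>").append(esc(rowUri)).append("</id>\n")
               .append("    <link href=\"").append(esc(rowUri)).append("\"/>\n")
               .append("    <updated>").append(esc(updated)).append("</updated>\n")
               .append("  </entry>\n");
        }
        return xml.append("</feed>\n").toString();
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = List.of(
                Map.of("id", "1", "name", "Alice"),
                Map.of("id", "2", "name", "Smith & Sons"));
        System.out.print(toAtom("http://example.com/data", "customers",
                rows, "2007-02-14T20:39:00Z"));
    }
}
```

In a real service the rows would come from a JDBC query and each row would carry its own timestamp; a single `updated` value is used here only to keep the sketch short.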

Doing all this through a visual approach is going to get clumsy. At some point people will want to do JavaPipes and RubyPipes and AJAXPipes. Having a REST-based API to do this is one approach; another is to provide a library that does all the work on *your* server (rather than having it done on Yahoo's servers through their REST API).

This ties into my concerns about scale. If Yahoo! is hosting all these pipes, can they handle the demand? How do they handle millions of users hitting pipes that gather, sort, filter and apply foreach operations? I'm sure they have smart people looking at this; if they aren't already, they should be talking to the database folks, who know all about query optimization, query compilation, caching and so on. Luckily Pipes are read-only (they have to be, because they are produced through transformation of real data -- it's just like you can't update a computed column in a database). This means they don't have to worry about locking and contention.

The other issue about Yahoo! hosting the pipes is the issue with the freedom to leave, which I've written about before. I've written these cool pipes, and then I get mad at Yahoo! and want to leave, or they get new management who wants to charge me for my pipes, and I've built a whole service around them. What I really want to see is an open source implementation of the API that I can run on my machine or host somewhere else. Make this something that helps the Internet take off in an incredible way, rather than tying it down to Yahoo's servers and Yahoo's storage and Yahoo's UI (cool as it may be). Come on Yahoo, tear down the walls and set this bird free.

Data Mashups Made Easy: Yahoo! Pipes

Posted by davidvc on February 09, 2007 at 02:52 PM | Permalink | Comments (3)

I have been thinking for a while about how you do data mashups: the ability to query and combine data sources across the web into new data sources. I've looked at the Semantic Web, I've looked at Google Data and Amazon S3. To me what you want is a very, very simple way to query web services in a way that a program can understand.

Now there is a new kid on the block for data mashups, and a seriously good one too: Yahoo! Pipes. What's very cool about this is not only does it let you combine and sort and filter RSS feeds in various interesting ways, but it lets you do this in a very easy-to-use, graphical fashion. This really is data mashups for the masses. It's also something that Mr. O'Reilly says he's been waiting ten years for.

In the spirit of UNIX pipes, anyone can take a pipe I created and pipe it into theirs, or modify it to meet their needs. Over time I can imagine some very rich pipes built on pipes. I remember having fun with this in UNIX-land, and now here I am doing it for web feeds. Pretty cool.

Just for kicks to see how hard/easy it was, I created a pipe that combined searches for references to me from , Google Blog Search and Technorati (I'm nothing if not narcissistic :)).

I was able to sort by date in descending order, filter out duplicates, and filter out hits that were the blogs that I wrote (Google Blog Search finds my blogs as well as others').
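For comparison, here's a minimal sketch of the same three steps written by hand in Java. It is not what Pipes runs under the hood. The `Hit` record, its `author` field, and the idea that duplicates share the same link are all assumptions I'm making for illustration.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical search hit; real feed items carry more fields.
record Hit(String link, String author, Instant published) {}

final class MentionsPipeSketch {
    // Merges several result feeds, drops hits written by `me`,
    // removes duplicate links, and sorts newest first.
    static List<Hit> run(List<List<Hit>> feeds, String me) {
        Map<String, Hit> byLink = new LinkedHashMap<>();
        for (List<Hit> feed : feeds) {
            for (Hit hit : feed) {
                if (hit.author().equalsIgnoreCase(me)) {
                    continue;                         // filter out my own posts
                }
                byLink.putIfAbsent(hit.link(), hit);  // first copy of a link wins
            }
        }
        return byLink.values().stream()
                .sorted(Comparator.comparing(Hit::published).reversed())
                .toList();
    }

    public static void main(String[] args) {
        List<Hit> first = List.of(
                new Hit("http://example.com/a", "someone", Instant.parse("2007-02-01T00:00:00Z")),
                new Hit("http://example.com/mine", "davidvc", Instant.parse("2007-02-05T00:00:00Z")));
        List<Hit> second = List.of(
                new Hit("http://example.com/a", "someone", Instant.parse("2007-02-01T00:00:00Z")),
                new Hit("http://example.com/b", "another", Instant.parse("2007-02-08T00:00:00Z")));
        run(List.of(first, second), "davidvc").forEach(h -> System.out.println(h.link()));
    }
}
```

With the sample data in `main`, my own post is dropped, the duplicate of `/a` collapses to one hit, and `/b` prints before `/a` because it is newer.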

One thing I couldn't figure out was how to parameterize the actual URL so I could pass in anyone's name, not just mine. But I'm sure somebody smarter than me can figure that out.
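Outside of Pipes, the general idea is simple enough: build the search URL from the name instead of hard-coding it. Here's a small sketch of that. It says nothing about how to do this inside the Pipes editor, and the endpoint and the `q` and `output` parameter names are hypothetical stand-ins, not any real search service's API.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class SearchUrlSketch {
    // Builds a feed URL that searches for the exact phrase `name`.
    // `endpoint`, "q" and "output" are assumptions for illustration.
    static String feedUrlFor(String endpoint, String name) {
        String phrase = "\"" + name + "\"";  // quote it for an exact-phrase search
        return endpoint
                + "?q=" + URLEncoder.encode(phrase, StandardCharsets.UTF_8)
                + "&output=rss";
    }

    public static void main(String[] args) {
        System.out.println(feedUrlFor("http://example.com/search", "David Van Couvering"));
    }
}
```

The encoding step matters: names contain spaces and quotes, which have to be escaped before they go into a query string.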

I ran into a couple of issues. In particular, I kept getting errors saying "Error processing your request" with absolutely no helpful information to tell me what was wrong. Also, even though I tried to filter out my own blogs, they still were showing up in the results (update: I finally figured out how to fix this -- but references to other members of my family are still showing up. Ah, well, good enough, and I have to move on...)

I also tried adding the resulting feed to my Google Reader using the "Add to Google" button -- oops, for some reason Google Reader says it has no items, when I know it does. When I used "Get as RSS," however, it worked great.

So, I guess there are still some bugs to fix. But the concept and, in general, the execution are excellent. I hope they are prepared for some pretty heavy load :)

What I really want to see happen is making it easy to do visual data binding from a NetBeans web app project onto a pipe like this. Just as you can today with the Visual Web Pack for databases, you should be able to visually bind this data source to a table in your web app. Now that would be cool. I guess I better get busy :)


