David Van Couvering's Blog
November 2006 Archives
Oracle Benchmarks BDB vs Apache Derby
Posted by davidvc on November 29, 2006 at 02:49 PM | Permalink | Comments (4)
I found out on the derby-dev list today about a white paper from Oracle (PDF) comparing Apache Derby to their newly acquired Sleepycat BDB Java Edition.
First of all, it's a great compliment that Oracle would consider putting the effort into running the benchmarks and writing the white paper. That must mean they're getting peppered with questions about why they should choose BDB over Derby.
The white paper claims (without any source code or details) that BDB Java Edition consistently outperforms Derby by a factor of three to ten. I actually believe their measurements are suspect -- see the end of this blog for some juicy details. It's also problematic that they don't provide a link to the code, nor do they describe their test runs in any real detail.
But let's assume for now that BDB is faster than Derby. They admit that "in some areas, this comparison is apples-to-apples, and in other cases apples-to-oranges." Well, no kidding -- BDB does not provide any SQL support, and thus doesn't have to pay for the overhead of the SQL layer. But if you want SQL, then this is kind of a problem.
If you don't want SQL, then it's worth considering BDB. But I think you need to be very clear about the path you're taking prior to jumping over to BDB:
The other interesting point they make in the white paper is that BDB JE's support for native storage of Java objects provides big performance benefits over the Java Persistence implementations provided by Hibernate and others, because you don't have to map between Java and SQL. Again, a good point on the face of it, but they forget to mention that the BDB object interface is completely non-standard and is a wide open door to vendor lock-in, whereas JPA and JDO are standards with multiple competing, and often open source, implementations.
I think there is a lot of need for a simple key/value transactional data store for Java, where SQL and querying are not needed. Right now the only real player in this game is BDB, but it is non-standard and owned by a single vendor. It would be great if we could define a Java standard for key/value storage so that a user doesn't get locked in to a particular solution. Something similar to JPA or JDO, but which is significantly simpler, where we have an EntityManager that does simple get/put operations on Java objects and provides transactional semantics. No query support, and none of the overhead that comes with that. JavaSpaces defines something very much like this, except that it is not a standard. Perhaps it can be taken down that route...
Maybe Oracle would be willing to work with the Java community to offer their expertise in this area and help define such a standard. Then other folks could provide alternate implementations, customers would have the freedom to leave, and then we'd really be comparing apples to apples...
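To make the idea of a query-free, transactional get/put API a bit more concrete, here is a minimal sketch of what such an interface might look like. Everything in it is hypothetical: the names SimpleEntityManager and InMemoryEntityManager are invented for illustration and do not come from BDB, JPA, JDO, or JavaSpaces, and the toy in-memory implementation only buffers writes until commit. It is not durable and not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch only; no name here comes from an existing product or standard. */
final class KeyValueSketch {

    /** The entire API surface: get, put, and transaction boundaries. No queries. */
    interface SimpleEntityManager<K, V> {
        void begin();
        Optional<V> get(K key);
        void put(K key, V value);
        void commit();
        void rollback();
    }

    /** Toy in-memory implementation: writes are buffered until commit. */
    static final class InMemoryEntityManager<K, V> implements SimpleEntityManager<K, V> {
        private final Map<K, V> committed = new HashMap<>();
        private Map<K, V> pending; // null when no transaction is active

        @Override public void begin() {
            if (pending != null) throw new IllegalStateException("transaction already active");
            pending = new HashMap<>();
        }

        @Override public Optional<V> get(K key) {
            // Inside a transaction, a reader sees its own uncommitted writes first.
            if (pending != null && pending.containsKey(key)) {
                return Optional.ofNullable(pending.get(key));
            }
            return Optional.ofNullable(committed.get(key));
        }

        @Override public void put(K key, V value) {
            requireActive();
            pending.put(key, value);
        }

        @Override public void commit() {
            requireActive();
            committed.putAll(pending);
            pending = null;
        }

        @Override public void rollback() {
            requireActive();
            pending = null; // buffered writes are simply discarded
        }

        private void requireActive() {
            if (pending == null) throw new IllegalStateException("no active transaction");
        }
    }

    public static void main(String[] args) {
        SimpleEntityManager<String, String> em = new InMemoryEntityManager<>();
        em.begin();
        em.put("user:1", "David");
        em.commit();

        em.begin();
        em.put("user:1", "changed");
        em.rollback(); // the second write never becomes visible

        System.out.println(em.get("user:1").orElse("<missing>")); // prints "David"
    }
}
```

The point of keeping the interface this small is that a second vendor could plausibly implement it over a different storage engine, which is what would give users the freedom to leave.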
Under the Hood...
They even say in the text "while both product are disk bound, JE is still significantly faster than Derby." What bunk! The next graph actually does show a significant difference (and actually does start at 0), but this graph is with the disk write cache enabled.
They even call them "non-durable writes." Hm, that doesn't give me a warm fuzzy. As I write in the story of the write cache and half a worm, having the write cache enabled is really exciting in terms of performance, but it does have drawbacks, such as potential loss or corruption of data. No biggie... Reading on, it turns out that all subsequent performance comparisons are done with the write cache enabled. Hm...
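For readers who haven't run into the distinction, here is a small sketch of what a "durable" versus a "non-durable" write looks like from Java. It is only an illustration, not the code Oracle or Derby uses: the class and method names are made up, and it relies on the standard FileChannel.force call. Note the caveat that matters for this benchmark: force asks the operating system to push the data to the storage device, but if the drive's own write cache is enabled the data may still be sitting in volatile memory on the drive.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch; the names are illustrative, not taken from Derby or BDB. */
final class DurableWriteSketch {

    /**
     * Appends one record to a log file. When durable is true, the method asks
     * the OS to flush the data (and file metadata) to the device before returning.
     */
    static void append(Path log, String record, boolean durable) throws IOException {
        try (FileChannel ch = FileChannel.open(log,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap((record + "\n").getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            if (durable) {
                // Without this call the bytes may live only in OS buffers for a while.
                ch.force(true);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("commit-log", ".txt");
        append(log, "txn-1 committed", false); // fast, but could be lost on power failure
        append(log, "txn-2 committed", true);  // slower, waits for the flush
        System.out.println(Files.readAllLines(log, StandardCharsets.UTF_8));
        Files.delete(log);
    }
}
```

Timing a loop of each variant on your own hardware is a cheap way to see how much of a benchmark gap comes from skipping the flush rather than from the database engine itself.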
It goes to show, again, that performance measurements are dangerous things to count on others to do, especially those with an agenda. I would recommend you do your own performance tests before you make any decisions to commit yourself to BDB, and Oracle, for what could be a very long ride...
LAJAX on IE
Posted by davidvc on November 28, 2006 at 05:30 PM | Permalink | Comments (0)
Francois Orsini provided an example of how to embed a database in a browser using Mozilla Firefox. He just let me know he received a comment from Giorgio Arata describing how he got it ported to IE. Here's some of what Giorgio says. You can follow the link to see his full comment and all the code:
First of all we need SARISSA; a cross-browser ECMAScript library that offer a unified XMLHttpRequest/XML/XSLT/XPath along with some bonus utility methods. Please visit Sarissa on SourceForge and add exactly after META tags in your index.html page. In the second place I've exchanged the deprecated APPLET tag with a crossbrowser nested OBJECT solution. Code example follow, just substitute and run the application. ... Mozilla remains the more stable platform and do not have nasty crash simulating unexpected silly user behaviours. Best regards, GA.
I love the part about "nasty crash simulating unexpected silly user behaviours." Boy, that about sums up my experience with a lot of software out there... :)
Open Source Databases Reduce TCO by 60%
Posted by davidvc on November 21, 2006 at 04:03 PM | Permalink | Comments (0)
I remember in the late 1990s when I first heard how many companies were switching over from Windows to Linux. There was story after story from my IT friends how they grabbed a cheap box gathering dust in the corner that could barely move under the new Windows releases, and they would install Linux and Apache on it, and it would just scream. They would rave about performance, stability, and cost. I couldn't believe it. Corporations running their production systems on an open-source operating system? Amazing.
Open Office has a similar story. More and more people are discovering that they can get what they need out of Open Office without having to pay the Microsoft tax.
In both these cases, companies were weighed down by premium costs from vendors who have a virtual monopoly on a key component of their infrastructure, with no options in sight. Nobody ever believed open source could deliver functionality equivalent to what they were paying for. An operating system built on open source? No way. A competitor to MS Office, where everybody is donating the code? Impossible.
The latest "impossible" story is databases. Databases are particularly problematic because not only are they core to the business, more than almost anything else, and not only are they complex -- they are also a bear to migrate from one vendor to another. There are so many different levels of lock-in, from database APIs to differences in datatypes and SQL, all the way down to key behaviors in terms of locking, caching, and so on, where an application that screams on one vendor runs like a dog on another. I experienced this first-hand when I worked at Sybase, where SAP refused to port to Sybase in part because we didn't implement row-level locking. This decision by SAP was one of the key reasons (IMHO) that Sybase lost the war with Oracle.
For these reasons, many people doubt that open source databases will ever make any kind of real dent in the database market. Sure, you can use these guys for your web apps, but they don't have the features and support required for corporate data centers, and migrating is hard, risky and expensive.
But history with Linux and Open Office tells us a different story. Also, we do need to remember that it wasn't too long ago that mainframes owned all corporate data. Mainframes are still key to businesses, but many companies put in the massive effort (sometimes taking up to ten years) to migrate from mainframes to UNIX-based relational systems. Why? Cost. Mainframes are exorbitantly expensive. For a significant reduction in cost, companies could get the same functionality out of UNIX relational databases like Oracle and Sybase. I remember that some of the best years at Sybase were when the economy was in a downturn and everybody was trying to save money. So, these companies did the math, and they made the effort.
Now the big proprietary UNIX relational vendors are the gorillas on the block, charging exorbitant fees "because they can." The upstart open source databases are these annoying little yap dogs barking at the Big Guys' heels. But even though they get a lot of press these days, they continue to have very little of the overall database market share. What this means to me is that most IT shops still do not believe it is safe or worth the effort to move their big Oracle or DB2 databases to MySQL or PostgreSQL. But it's making more and more sense for those apps on the edge, and the edge is starting to move in. I believe over time people will notice that open source databases are often good enough, they will look at the numbers, do the math, and start making the effort.
Perhaps today the open source databases don't have the features, the performance and scalability, the ISV support, and so on that the Big Guys have. Perhaps migration is difficult. But if history is any indicator, some day we may wake up to find ourselves in a Brave New World in the database market...
What? It's Not About the Money?
Posted by davidvc on November 20, 2006 at 03:07 PM | Permalink | Comments (3)
I've had a lot to say about the Web 2.0 Forum, but the most wonderful moment was during a panel with a number of folks running collective intelligence sites. Jim Buckmaster, CEO of craigslist, was on the panel. In case you don't know, craigslist is regularly listed as one of the top ten web sites, with only 20 employees. For the most part Jim was silent, quietly sitting and listening. But at one point during a discussion about "monetizing" (a horrible word if I've ever heard one), he mentioned that many people had suggested to him that craigslist place context-sensitive ads on their pages, telling him how much money they could make. Tim O'Reilly (the moderator for the panel) asked him why they hadn't done that, and Jim very calmly said "it's not something our users ever asked us for." There was this stunned silence from both Tim and the panel. After a few moments the audience broke into spontaneous applause.
It reminds me of a story I heard once about a southern fisherman sleeping on the bank of a river. A northern industrialist came by and said "hey, why aren't you out there catching fish?" The fisherman said "oh, I've caught enough fish for today." The industrialist said "but you could catch more fish!" The fisherman said, "why would I want to do that?" "Well, you could sell them and make more money!" "Why would I want to do that?" "You could buy a bigger boat!" "Why would I want to do that?" "Well, you could catch more fish!" "But why would I want to do that?" "You could sell more fish and make more money!" "But why would I want to do that?" "Why it's obvious, with more money, you could retire and then you could relax and enjoy yourself!" The fisherman looked around and said, "but I'm already doing that!"
It's something I think about a lot...
Why Use Java DB For Web Client Storage?
Posted by davidvc on November 17, 2006 at 04:28 PM | Permalink | Comments (9)
I've been wanting to write about the value of a relational database (and Java DB in particular) when implementing local storage in web clients. The announcement of Zimbra's offline support and the dialog on their blog of why they chose Derby over dojo.storage has motivated me to get these thoughts out there. Why would you want to use a relational database (and particularly Java DB) for local storage rather than other solutions, such as the WHATWG API implemented in Firefox 2.0 or the dojo.storage package, which provide a simpler key/value storage mechanism?
Zimbra uses Derby for offline storage!
Posted by davidvc on November 17, 2006 at 03:45 PM | Permalink | Comments (0)
When I was at the Web 2.0 Forum, there was a chat with a number of startups who got launched last year at the conference, and one of them was Zimbra. It was at this session that Zimbra announced their support for running the Zimbra client offline. I was impressed and was going to blog about it as a real-world example of the need for offline support. Then I bumped into Stephen O'Grady's blog, and found out that they use Derby. Wow! Given that the Derby team has never heard a peep, that I know of, from the Zimbra team, this was quite a surprise (and I believe a sign of how robust and easy to use Derby is). See the comments to the Zimbra blog for some tantalizing details about how they did it and why they chose Derby.
Very interesting to see they're using Lucene for search and indexing. A number of folks in the Derby team believe there's an opportunity for tighter integration between Derby and Lucene if only we had time to work on it...
Venture Capital: Just Say No
Posted by davidvc on November 14, 2006 at 04:25 PM | Permalink | Comments (2)
On Thursday morning at the Web 2.0 forum John Battelle had a fascinating conversation with investors Ram Shriram and Roger McNamee.
I first have to say off the bat that Ram Shriram had a wonderfully relaxed and friendly demeanor. I found myself feeling like I was his best friend, and that he had my best interests in mind. I am sure that quality serves him well in the role of investor and guide with the various startups he works with.
What surprised me about this discussion was how clear it was to all of them that starting a new Internet venture these days requires such a small amount of capital. Hardware is commodity (and now you can get incredibly cheap deals from Sun with their startup program), all the software is open source, and all the resources you need are available on the Internet. Often the founders can just work out of their homes and use the Web for communication and collaboration.
The impact on the venture capital community is significant. Traditional VC firms usually look for companies where they can invest 20 to 50 million dollars and see a return in three to five years. That's a lot of money, often much more than you need. Ram and Roger were generally recommending that you avoid taking this kind of VC money if you can help it, as it usually comes with expectations that you take the company to a certain net worth within a certain time frame so they can get their money back. As a result VCs exert a lot of pressure and control on the company. You've pimped yourself.
Ram also observed that when a startup takes too much money, it's bad for the business; the founders often don't know what to do with all the money, and you tend to lose focus. He said he likes to see his ventures running lean and mean, eating noodle soup and working out of a garage.
So, they recommended that you take as little money as possible - just enough to get by - and you'll turn out better in the long run. I found this all very invigorating. If I have a good idea (I'm not saying I do, but if I did), it means I could actually get a couple of angel investors, get some cheap hardware from Sun and do everything with open source and deliver something very cool without having to go and pay court to VC firms. Something to think about, for those of you who do have a good idea.
One last tip from Roger McNamee: starting a business is a lot of work. If you're passionate about something so much that it's all you can think about, go for it. But if you can think about something else, then maybe this is not the time...
The Great Database In The Sky
Posted by davidvc on November 13, 2006 at 05:34 PM | Permalink | Comments (1)
On Thursday morning at the Web 2.0 Summit, Marten Mickos of MySQL talked about "The Great Database In The Sky." His vision: open source structured data. Today you can search unstructured data through Google, but there is no open access to the world's structured data. He had an example:
Now, the question arises, how does this differ from the Semantic Web? Great question, and someone asked him that. I may be misquoting him as I wasn't able to write down his answer, but I am pretty sure he said that this could be thought of as a subset of the Semantic Web, because the Semantic Web can put structure around unstructured data (such as a collection of URIs). In my mind, the awkward silence that followed was because many of us were thinking "well, if it's a subset of the Semantic Web, hasn't the problem already been solved?"
As I understand it, the Semantic Web is placing structure and semantics around Internet resources, URIs, whereas relational schemas place structure and semantics around tables. But isn't a table conceptually the same as a URI? I know that the REST model argues that every concept or "object" in an application domain should have its own resource, its own URI. So if you apply this rule, isn't there a mapping between relational model concepts and the Semantic Web? A quick Google shows that Tim Berners-Lee said something very much like this in 1998.
So, there is already a web-based model out there that seems to handle relational data. So why haven't we seen The Big Database In The Sky? In my opinion, the issue is the same issue that has dogged large companies trying to integrate various acquisitions they have made, or companies trying to share data: data integration. Nobody calls the same thing the same thing - you say tomayto, I say tomahto, a rose by any other name would smell as sweet... If you look at the Semantic Web, it allows for global schemas that span multiple domains, but doesn't require it, and they tend to recommend against "boil the ocean" attempts to unify schemas. This is a hard problem.
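The mapping between relational concepts and URIs mentioned above can be sketched in a few lines. This is one common way to read it, and the code is an assumption-laden illustration rather than anyone's official scheme: each row becomes a resource with its own URI, each column name becomes a property URI, and each cell becomes the value of a subject-predicate-object statement. The class name, the URI layout, and the example.org base are all made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: turns one relational row into subject-predicate-object statements. */
final class RowToTriplesSketch {

    /** One statement about a resource. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override public String toString() {
            return "<" + subject + "> <" + predicate + "> \"" + object + "\" .";
        }
    }

    /**
     * Row -> resource URI, column -> property URI, cell -> literal value.
     * The URI layout (base/table/key and base/table#column) is an arbitrary choice.
     */
    static List<Triple> toTriples(String baseUri, String table, String key, Map<String, String> row) {
        String subject = baseUri + "/" + table + "/" + key;
        List<Triple> triples = new ArrayList<>();
        for (Map.Entry<String, String> column : row.entrySet()) {
            String predicate = baseUri + "/" + table + "#" + column.getKey();
            triples.add(new Triple(subject, predicate, column.getValue()));
        }
        return triples;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("name", "Ada");
        row.put("city", "London");
        for (Triple t : toTriples("http://example.org/db", "person", "42", row)) {
            System.out.println(t);
        }
    }
}
```

The mechanical part is clearly easy. What the sketch does not solve is the integration problem: two databases will pick different names for "city", and nothing here reconciles them.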
I can envision committees or open source communities coming together
trying to define common schemas. And where the value is there, perhaps
this will happen. But enabling global search and update across
relational/structured data is just not as easy as what we're seeing
with unstructured search (Google) and folksonomies like
del.icio.us,
flickr and YouTube.
With "real" structured data, you have to define
your structure ahead of time, and that's a problem, because things are
always changing in the World Wild Web. It reminds me of this Dilbert
I saw where Dilbert has this immensely complicated flow chart he's
presenting, and he says, "and here we are having this meeting." Somebody
says "I have a question" and Dilbert says "oh, now I have to rework
everything."
Richard Stallman loves Java
Posted by davidvc on November 13, 2006 at 11:51 AM | Permalink | Comments (0)
I remember in the mid-nineties doing engineering support at Sybase, debugging nasty memory leaks and memory corruption caused by users' incorrect use of C pointer semantics. I loved the object-oriented design paradigm that was gaining hold, but as someone who had spent hours tracking down memory errors and deciphering strange, mangled code constructs, I looked at C++ in horror and hoped desperately for something better. I encountered Eiffel and loved the concepts behind it, but I couldn't use it because the cost to license just to build stuff with it was so horrendously high.
Then along came Java, and not only was it an excellent language, solving many of the maintenance nightmares I had encountered, and supporting the new OO paradigm, but you could use the compiler absolutely free, and you could redistribute programs you wrote in Java absolutely free. This was revolutionary! Now that looks so old-fashioned, so restrictive, but back then you normally (except for g++) had to pay just to get a compiler, let alone redistribute software you built with it. I remember "borrowing" MSDN CDs from others in the company because the VC++ compiler was a hefty $500 a pop. And forget about free IDEs, that was a fantasy nobody expected to ever come true.
Well, we've come a long way. As you've probably already heard, today Java is going open source, and not just Java SE but Java ME as well. You can read more about it here. What I like the most about this announcement is the videos from folks like Brian Behlendorf, Tim O'Reilly and Richard Stallman. Hard to imagine Richard Stallman going on record praising Java, but why not, it's going out under GPL.
At various open source conferences, I've lunched and dinnered with Simon Phipps and the team who have been working with the open source community around open sourcing Java, and I've seen how hard this has been. My congratulations to everyone both inside and outside of Sun who has worked so hard on this. In particular my hat's off to the Java SE, ME, and EE teams. They have been asked to take their baby that they have cared for so carefully for so many years and open it up to the world, and to trust that all will be well. I can only imagine how difficult that must be.
Now what? Well, as an advocate for Java on the desktop, on mobile devices and in the browser, my hope is that this move will make Java even more ubiquitous on these platforms. Using GPL means that the Linux distros will be quite happy to redistribute Java. Maybe the Java plugin for Firefox will just be there and won't require a separate install. This also opens Java up to key governments like Brazil, where all technologies are required to be under an open source license, not just because it's cheap but also because it assures that these technologies will not be controlled by foreign third parties.
So, my hope is that this will finally remove the stigma around Java in the open source and government communities, and as a result you will not only have write-once, run-anywhere, but also write-once, runs-everywhere.
Wednesday at Web 2.0 Forum
Posted by davidvc on November 08, 2006 at 09:15 PM | Permalink | Comments (0)
Walking into the Web 2.0 Forum I saw some gorgeous ornate brasswork on the door.
I also thought you'd enjoy an example of the kind of stoic portraits hanging out on the walls of the hotel.
The place continues to buzz with activity. I found myself surrounded by CEOs and entrepreneurs at every turn. A real quick look around the tables during lunch (lots of deals going down) revealed Marc Andreessen, Jeff Bezos, and Marc Benioff. By the end of the day I was so high on it all, I felt that I too could start a company! YES!
I sat in on the Wednesday morning session, which was basically a series of ten minute presentations from some very high-level folks. Here are some notes from the ones I thought were most interesting.
Web 2.0 - Oh No!
Posted by davidvc on November 07, 2006 at 04:12 PM | Permalink | Comments (1)
It was my first day at the Web 2.0 Conference here in San Francisco, and things didn't work out as planned. The person who had my badge had to make an emergency trip down to Mountain View, and here I am in a Starbucks across the street from the Palace Hotel like some kind of outsider who wanted to get into the "in" club but got bounced.
The Palace Hotel is literally four steps from the exit I took out of the BART subway station. I opened the tall glass doors with ornate brasswork into an elegant lobby buzzing with the hum of eager entrepreneurs. I'm recognizing that my jeans and fluorescent yellow ApacheCon Hackathon T-Shirt don't quite fit in with this crowd. This is about slacks and dress shirts. This is about making a deal.
There were pods of folks standing around with these intense looks on their faces having impassioned discussions. The impression was that they are full of vision, full of drive, looking for the angle, making contacts. They are (or want to be) the movers and shakers of this new wave of business opportunity.
I tend to think of Web 2.0 as a set of technologies that enable a new way of building applications. Here, Web 2.0 is about a new business model, a new way to make money, a new way of thinking about markets.
It's exciting in its own right, but it's a completely different angle and approach from the one I'm used to taking in my techie white tower. My brother Antony Van Couvering, who has already built and sold one Internet business (NameEngine, now part of Verisign) and is now in the midst of building a new one (Names@Work) would grok this conference much better than me, I think. He's always got new angles, new ways of thinking about business and business models, and he's passionate about what he does. It's interesting that this is happening in the Palace Hotel. This is a small, intimate hotel with old-fashioned architecture and interior design. The central restaurant is covered with a beautiful ornate glass ceiling and rich chandeliers.
So much about Web 2.0 is new, and in many ways ever changing and disposable, while the Palace evokes a different, slower-paced time. You see all these fast people and fast-moving ideas happening under chandeliers and dark oak stains and Victorian portraits. It's an odd juxtaposition.
I finally decided to get back on the BART train and head home. I
dipped my toes in the water, but due to logistical SNAFUs
that's all I'm good for today. We'll try this again tomorrow.
I'll definitely make sure to wear my slacks and button-up shirt
and bring a stack of business cards. I'll practice shaking hands
and making visionary eye contact in front of the mirror tonight.
JDBC 4 is approved
Posted by davidvc on November 07, 2006 at 12:19 PM | Permalink | Comments (0)
I just got the news that the JSR 221 (JDBC 4 API) specification has been approved.
This is great news, and is the end result of a lot of effort. Congratulations to the expert group and to Lance Anderson, the spec lead for JSR 221.
Whoa! Web 2.0!
Posted by davidvc on November 06, 2006 at 12:36 PM | Permalink | Comments (1)
I pinged an alias here at Sun asking who was going to the Web 2.0 Conference here in San Francisco, and started discussing with Sharat Chander about getting my sample app onto the Sun booth, and was surprised to be offered a pass to go. Whoa! A nice surprise, thanks, Sharat!
I'm not sure what to expect, but I suspect a lot of marketing and hype; the money's getting behind this game big time these days. I'll try to make sure my Hype Sunglasses are on and functional.
When I'm not at sessions, I'll be at the Sun booth demonstrating the app that I wrote for ApacheCon where you can run a calendaring application offline in a browser and then synch with Google Calendar when you get back online. We'll also be showcasing some of the other great things Sun is doing around Web 2.0, such as jMaki, the Phobos project, Sun Java Studio Creator's support for AJAX, and Portal Server 7's Web 2.0 features.
I am sure there will be a lot of folks covering this event, but if I see or hear anything I find particularly compelling or interesting, I'll let you know.
The Impact of Open Source: An Analysis of the Database Market
Posted by davidvc on November 03, 2006 at 09:08 AM | Permalink | Comments (3)
On the hackers-db mail alias today (an alias for folks working in open source databases), Lukas Smith posted a little nugget saying his research paper studying the database market and the impact of open source was done. He talks about it more in this blog. From his introduction:
A bold statement on market evolution states: "Commoditization is something that happens to every successful industry eventually" (Murdock 2006, 91). Commoditization is the process whereby technology that was previously owned exclusively by a few vendors, and therefore priced at a premium, becomes widely available at comparatively low prices. Due to its open nature, open source software could likely be a critical factor in facilitating this development. This paper aims at examining to what extend this statements holds true within the software market and more specifically the RDBMS software market. But more importantly, the purpose of this paper is to study how individuals and companies can leverage open source in mature markets.
Definitely worth a look-see.
Let's Get Personal
Posted by davidvc on November 01, 2006 at 04:52 PM | Permalink | Comments (1)
I decided there are other things I'd like to talk about that have nothing to do with java.net, so you'll find them at my new personal blog. I actually really like the new Blogger Beta interface, especially how easy they make it to customize for folks like me who are not major HTML/Javascript hackers. I was able to put up my delicious links and my Flickr photostream with pretty minimal effort.
Here's hoping what I say over there doesn't reduce my job prospects or embarrass me fifteen years from now. And yes, I do like to take photos of my kids :)