Do We Need Databases on the Desktop?
Posted by joshy on July 17, 2006 at 10:28 AM | Comments (23)
Recently Simon Morris posted a blog entry called In defence of the desktop, where he asks: "If SE is truly the edition of Java aimed at the desktop, and most real desktop applications (browsers, players, word processors, video editors) are not database heavy, why is Java DB being included in the SE JDK?" I'd like to challenge the idea that real desktop applications don't need databases. They may not be database heavy (in that storing data is not their primary function) but I do think that there are a lot of desktop apps which use databases, or could be improved by doing so.
When it comes to desktop apps I don't think of a database as "table based storage for relational data accessed by SQL". I prefer to think of it as "reliable and searchable storage", since that's what it really means to desktop apps. When I wrote my two part series on using Java Persistence in desktop apps, that's what I was thinking.
So do desktop apps need a database? Rather than say yes and describe why this is so, I thought I'd simply go through the applications installed on my computer and speculate about which ones have or could have a real database inside.
Here are the contents of /Applications on my MacBook running Mac OS X 10.4:
- Address Book Well, duh. It's a big searchable list of data with well defined fields. Okay, next one.
- Adium and iChat and Skype These are instant messaging or audio chat applications. The actual storage of contacts is either done in a flat file, derived from the aforementioned Address Book, or stored on the server. Thus the contacts probably don't use a database. However, the chat logs certainly could. You've got lots of records, organized by date and participants, and fast searching is highly desirable. Certainly a good candidate for a local database.
- Adobe Photoshop CS, Adobe Reader 7.0.7. I can't see many direct uses here.
- Automator This is a Mac specific development tool for building scripts that control other programs. While the scripts themselves are stored as flat files, the list of available bindings for every application certainly needs to be stored and searched quickly, even if the underlying storage is XML files in the relevant apps. Much like other development tools, Automator (could) parse the application bindings into a local searchable storage cache. Yes on the DB.
- Bugster is a Sun app which searches the bug database. While it's normally searching a remote database, I can see how you would want to use a test database while developing the software. Still, it's not using a local database at deployment time so it doesn't count.
- Calculator. It's a calculator. It calculates. Doesn't really need a database.
- Camino, Firefox, Safari: web browsers. You might not think that they need databases but they do have two things that need to be stored and searched: history and bookmarks. Both of those could greatly benefit from using a real database underneath.
- Brood War, Chess, Snes9x, Starcraft: games. No direct need for a database.
- Cyberduck, Fugu, LimeWire, Democracy: file transfer or media downloading applications. They could use databases for searchable records of what you've transferred and when. (None of them do currently, however.)
- Dictionary : a dictionary is a database. 'Nuff said.
- Font Book : while it does search through a dataset, the data comes live from installed fonts so it probably wouldn't benefit from the database.
- LEGO Digital Designer, Google Earth: these are both interesting examples. They are both very non-traditional applications. They are both very graphics heavy. And they both store local searchable caches of networked data. I think they could both benefit from internal databases.
- DVD Player, HandBrake 0.7.0, MPlayer OS X 2, Image Capture, Image Tricks, ImageWell, Preview, Photo Booth, QuickTime Player, RealPlayer, VLC, Windows Media Player: These are all straight up media playing and manipulation applications. Some of them include a list of tasks they have done before (DVDs ripped, for example) but none of them really take advantage of databases.
- Audio Hijack Pro, Audio Hijack: this is an interesting case. These are media manipulation apps but they are used constantly and can produce a lot of files which need to be managed. A good case for a database.
- Mail: tons of searchable records going back years. Definitely has a database in it.
- MarsEdit, NetNewsWire: These are applications for posting and viewing weblogs and other RSS feeds. They both need detailed records of transactions (what you've posted and read) and keep extensive caches of network stored data. Definitely a need for a local database.
- Microsoft Office 2004, Swift Publisher, TextEdit, Stickies, RapidWeaver: word processors, page layout, and text editors. Nothing very databasey in these other than integration with external databases.
- NetBeans, jEdit 4.2: these are both text editors at their heart, but being development tools they need lots of cross references for managing classpaths and doing code completion. They typically scan the classpath at startup or install time and store the results in a structured cache. That cache could very easily be an embedded database (and would be a lot easier to maintain).
- Mori and OmniOutliner are both list managers that could very easily use a database for storage. Lots of entries to be searched, though it's not quite as structured as something like iTunes. Lucene might be a better choice than SQL, or perhaps the combination of the two.
- Parallels. OS virtualization software. No real need for a database.
- PostgreSQL and SQL-Ledger: these are actual database tools so they don't really count.
- Quicken 2006 pretty much is just a database with a very fancy front end. It stores, searches, and reports on financial data. I'd say that's a yes.
- Quicksilver, iTerm.app, VPNClient, Internet Connect, System Preferences, iSync, Missing Sync for Palm OS, Sherlock: these are all system utilities. With the exception of the syncing programs, which have large datasets to deal with, most of these don't really need databases.
- iCal, OS X's system wide calendar database. The actual storage is part of the OS rather than confined to a single app, but this only makes the case for using a database even stronger because reliability is even more important.
- the iApps: GarageBand, iDVD, iMovie HD, iWeb, iPhoto, iWork '06 These are the famous iApps that Apple sells. Most of these can use an internal database for managing the media they work with. iPhoto in particular has thousands of photos to search through.
And the granddaddy of all local database apps
- iTunes: this is the quintessential example of a local app with a database. Megs of structured data with high performance searching. I can't think of a better use for an embedded database. And many, many people use it.
Conclusion
Databases are used, or could be used, in many more desktop apps on my laptop than I expected. I further expect us to come up with new applications in the future. Back in 1999 I didn't think of either iTunes or iPhoto. What will the next seven years bring?
One place I see room for improvement is that Java Persistence isn't as useful for remote databases, because you'd have to give a locally downloaded application direct access to the database over the internet, and that application could easily be reverse engineered and hacked to manipulate anything in the database. If there were a way to proxy the persistence calls through a system that could filter and control access, then Java Persistence would be even better.
Comments
Comments are listed in date ascending order (oldest first)
-
It should be <b>Bugster</b> instead of <b>Bugster<b>. It just hurts the eyes otherwise.
Posted by: kirillcool on July 17, 2006 at 10:59 AM
-
I only had to read your conclusion, because that's what it's all about to me. There is simply no persistence mechanism or framework for Java right now, that is able to use remote data storage without JDBC.
In my everyday work I develop desktop clients, which are connected to a server. The only way not to run into serious trouble with customers is using a transport mechanism working over port 80 (HTTP). That usually saves a lot of time otherwise spent on bureaucratic battles you seldom win. So it all comes down to XML-RPC or SOAP. I myself chose XML-RPC and so I have to serialize my POJOs back and forth from/to XML-RPC and same again on the server side, but with PHP.
It is a lot of work to get a well working framework for both worlds up and running, but definitely a must in my case. These issues are the main reason why I am sceptical about the recent overwhelming jMatter blog entries. It may work for a lot of people, but once you try to get into some global players, you'll fail miserably wrestling their IT staff.
In the end I'd like to see some set of frameworks working together to create a new framework. What Rails or Grails or whatever mean to web development is needed to create some persistence mechanism. I want to configure the database access parameters, say what languages I want to use on the client or server side, push the button and get some ready-to-work code delivered.
I was pretty surprised I wasn't able to find a glimpse of a solution out there. Nada. As if I am the only one with this kind of problem which just can't be the case.
Posted by: alarenal on July 17, 2006 at 11:04 AM
-
Sorry. I missed that open b tag and it didn't show up in my preview. Fixed now.
Posted by: joshy on July 17, 2006 at 11:09 AM
-
Do you think some of the lighter db use-cases above could make do with the java.util.prefs library? For cases where you need a persistent store but only need key-value access (not a search across many fields), the ability to store arbitrary primitives, Strings, and byte arrays might be just enough to handle some of the caching and other light storage needs indicated above. The fact that these prefs may be backed up by OS-specific db-like systems is invisible to the developer.
Also, some of these have sneaky data storage requirements you might not have thought of. DVD Player remembers every disc you've ever played and where you were the last time it was in the drive, and QuickTime Player remembers the screen location, size, affine transforms, etc. of any movie you've played (though this is accomplished by storing these items in the user data of the movie itself, not in a database or other app-specific store).
Posted by: invalidname on July 17, 2006 at 12:23 PM
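As a rough illustration of the key-value case raised in the comment above, here is a minimal sketch that uses java.util.prefs to remember a playback position per disc. The node path, class name, and method names are made up for this example, and it says nothing about how DVD Player actually stores its data.

```java
import java.util.prefs.Preferences;

class PlayerPrefsSketch {
    // Hypothetical preferences node for this example only.
    private static final String NODE = "/com/example/playersketch/discs";

    // Store the last playback position (in seconds) under the disc's id.
    static void rememberPosition(String discId, long seconds) {
        Preferences.userRoot().node(NODE).putLong(discId, seconds);
    }

    // Look up the stored position, falling back to 0 for unknown discs.
    static long lastPosition(String discId) {
        return Preferences.userRoot().node(NODE).getLong(discId, 0L);
    }

    public static void main(String[] args) {
        rememberPosition("disc-1234", 3605L);
        System.out.println(lastPosition("disc-1234"));
        System.out.println(lastPosition("disc-never-played"));
    }
}
```

This stays workable while the data is small and only ever looked up by key. Once you need to search across fields, a real database is likely the better fit.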
-
Certainly you could use prefs for some of these things. You can always start with prefs and move to a real db, most likely when you hit one of a couple of boundaries: the size of your data is large and slows down the prefs system, you have to do serious searching, your data is structured and using Java Persistence would save you a ton of serialization headaches. Ultimately it's up to the developer to choose the right tool for the job. I just think that a DB is now a better tool for a lot of things that we previously would have done with properties files or XML.
Posted by: joshy on July 17, 2006 at 12:32 PM
-
The primary benefit of a "database", particularly a SQL database, is its transactional nature. The benefit of being able to commit large changes in an atomic and reliable way should not be underestimated.
None of the other persistence techniques offer this, regardless of the format. Certainly other transactional stores exist, but none are as ubiquitous and readily understood by many as a SQL database.
There are potential side benefits of using a SQL database in that many 3rd party tools can access and query that data, through JDBC or ODBC (or both, as bridges in both directions are available).
This added benefit can help in desktop integration efforts. In the Web World, we're now opening up data through the use of web services, but on the desktop, it's still interchange of file formats. So, accessibility through something like JDBC and ODBC can be quite powerful.
Finally, the machines are fast enough that the overhead of SQL processing is lost compared to the overall value such a subsystem provides.
Posted by: whartung on July 17, 2006 at 02:54 PM
-
I agree that many apps, including those that you just listed, use or could use a database. It sure would make programming easier (especially now that JPA is in SE) and would bring some runtime performance benefits.
However, one big problem I see is, once again, application startup time. Java desktop apps are already slow to launch (or am I stuck with some old data?), so I wonder how much longer the end user would be willing to wait when the app pops up the "Now starting database..." dialog.
So the question is: is the tradeoff between application launch time and runtime performance worth it? That's up to the developer to answer. The good thing is, he now has a real choice.
Posted by: gbilodeau on July 18, 2006 at 05:17 AM
-
The question is not whether desktop applications can't benefit from a database, but whether it is the business of a programming language to have an integrated database as part of its core API.
To me at least the answer to that last question is a resounding NO.
Posted by: jwenting on July 18, 2006 at 05:20 AM
-
Well, I just discovered that Photoshop Elements (at least, v4 on Windows) uses a MS Access database to store photo catalogue, tags, metadata etc.
Posted by: hopeless on July 18, 2006 at 05:28 AM
-
The JDK (as the acronym implies) is a kit for developers who use the language... not the language itself. Obviously, different developers will want different things in their kits. I don't mind if the developer kit gets bigger, but it would be nice if the runtime got smaller (more modularized) to minimize the extraneous stuff on the client. -JohnR
Posted by: johnreynolds on July 18, 2006 at 07:59 AM
-
jwenting: remember that JavaDB is being added to the JDK, *not* the JRE. It is simply a developer tool. If you get the Flash Dev Kit it comes with lots of APIs that are not included with the Flash runtime but which you can choose to add to your app. I see nothing wrong with adding more tools, libs, or examples to the JDK. The JRE is another matter, of course, and I really want to see us make it smaller and more modularized. Pack200 is a good start but there's more to do.
Posted by: joshy on July 18, 2006 at 08:07 AM
-
Actually I tend to find all these hidden proprietary databases quite annoying at times. I often find myself wanting to get at the data externally.. either for backup or more often integration with other apps, one-off scripts, indexing and the like. But everybody seems to love reinventing the same old schemas time & time again with these hidden SQLite, dbd or ms jet backends.
I'd love for itunes and windows media player music libraries to be fully integrated (wmp is the backbone of my much loved, but deeply frustrating media center, if I had the choice I'd sooner not use it). winamp too, why not? how many different music schemas do we need to do the same job? Is their secret proprietary nature a deliberate ploy to force us consumers to choose our sides? or don't client side developers tend to consider the bigger picture?
Same goes for my omea feed reader, love to share its embedded db knowledge base with other applications, would love for google desktop to be able to index it.
I think the open/semantic metadata issues are far more interesting than embedded databases which as you say are old news really even for Java. Where are all the mashups for the client side connoisseur?
Posted by: osbald on July 18, 2006 at 08:12 AM
-
"Well, I just discovered that Photoshop Elements (at least, v4 on Windows) uses a MS Access database to store photo catalogue, tags, metadata etc."
I think that most sophisticated photo manipulation / cataloging software has an embedded database. This already happened some years ago (I remember Thumbplus, which embedded Access) and still happens today (Adobe LightRoom embeds SQLite) - I would bet that Apple Aperture does too.
There is an enormous advantage in storing into a database all the settings about a very large number of photos: basically the possibility of creating complex queries.
The only thing that disturbs me is that we are still doing this with SQL databases, which is a pretty old technology. I mean - I perfectly understand that the enormous existing legacy still mandates SQL in the J2EE world. But some years ago I hoped that today we'd be using OO databases for brand new desktop applications. I find myself still choosing SQL out of laziness in learning OQL and related tools; maybe it would be more useful to put an OO database in the JDK if we want to push developers?
Posted by: fabriziogiudici on July 18, 2006 at 10:42 AM
-
A database for an application is one thing, but a database is not the same as an SQL (relational) database. Your file system is a database and, I agree, Lucene is a better fit. Not even the address book is a suitable fit for an RDBMS (even if it is often shoehorned into one and thereby becomes unusable in other countries and also less searchable than it should have been).
Posted by: tobega on July 18, 2006 at 04:08 PM
-
Why do you think SQL is bad for an address book? It's true that street addresses don't often fit well into an RDBMS, though it's certainly possible to do without locking out international addresses. Just columns for street 1, 2, 3, 4, 5, 6 would do it. More important is managing multiple email addresses, IM accounts, photos, ring tones, and lots of other things which can be attached to a person. At this point it should really be called a Person database or a Contact book, but more people recognize what an Address Book is.
Posted by: joshy on July 18, 2006 at 04:21 PM
-
Since you raised the topic of Macs and integrated databases... what do you think of Core Data?
I think if a database does get put into the SDK properly, then it should offer at least as much functionality as Core Data.
It would be nice if the Mac implementation was like "okay, just slap some Java paint on Core Data and we're done here". :D
Posted by: rickcarson on July 18, 2006 at 10:25 PM
-
Of course an SQL database can be made to work, just as the filesystem can be made to work. It is important to be able to manage multiple pieces of information about your contact, and much of that information you don't even know is possible when you create the app, so your table is basically 3 columns: contact_ID, info_item_type, info_item_data and it seems the table structure didn't add much. Alternatively, you can let the columns in the table be dynamically configured and add more columns as you need them, but I think this is regarded as a relational anti-pattern.
If you instead store the info about a contact as an XML document, you have a better fit. XML is made for creating new tags when you need them, and XML databases are made to handle that, and allowing both free and structured searches. To display your contact information, just style the xml with css and/or xslt and I'm sure you know how to display that ;-)
What you do get with a good RDBMS is a transaction handler and logging and some concurrency control, but I don't see why those things could not be provided by other stores.
Posted by: tobega on July 18, 2006 at 11:51 PM
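To make the three-column layout described in the comment above concrete, here is a small in-memory sketch in plain Java. The class, field names, and sample data are hypothetical and stand in for a real table; no database is involved.

```java
import java.util.ArrayList;
import java.util.List;

class ContactInfoSketch {
    // One row of the generic table: (contact_ID, info_item_type, info_item_data).
    static final class InfoRow {
        final int contactId;
        final String itemType;
        final String itemData;

        InfoRow(int contactId, String itemType, String itemData) {
            this.contactId = contactId;
            this.itemType = itemType;
            this.itemData = itemData;
        }
    }

    // Made-up sample rows; any new kind of information is just a new itemType.
    static List<InfoRow> sampleRows() {
        List<InfoRow> rows = new ArrayList<>();
        rows.add(new InfoRow(1, "email", "home@example.com"));
        rows.add(new InfoRow(1, "email", "work@example.com"));
        rows.add(new InfoRow(1, "ringtone", "chime.mp3"));
        rows.add(new InfoRow(2, "im", "example_im_handle"));
        return rows;
    }

    // Roughly what a query filtering on contact_ID and info_item_type would return.
    static List<String> valuesFor(List<InfoRow> rows, int contactId, String itemType) {
        List<String> result = new ArrayList<>();
        for (InfoRow row : rows) {
            if (row.contactId == contactId && row.itemType.equals(itemType)) {
                result.add(row.itemData);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<InfoRow> rows = sampleRows();
        System.out.println(valuesFor(rows, 1, "email"));
        System.out.println(valuesFor(rows, 2, "email"));
    }
}
```

Every new kind of information is just another itemType value, which is the flexibility the comment points at. The cost is that the table structure itself no longer says anything about what a contact looks like.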
-
If it's added to the JDK and not the JRE it will lead to massive confusion, especially among newbies who get all kinds of errors because now their programs no longer work on other people's machines...
And they'll be pretty much the only ones using it as everyone else already has access to full blown RDBMSs or 3rd party embedded databases.
In other words it's just another trinket to please the marketing people that has no real value at all as without a version on the JRE it would be utterly useless for deployment and thus only have potential for those people who never deploy anything AND have no access to real database systems during development.
Posted by: jwenting on July 19, 2006 at 01:55 AM
-
jwenting: I'm confused. Why wouldn't a developer who builds a program using the JavaDB that came with their JDK simply bundle the javadb.jar with their app, just like any other jar they decide to include? Surely someone who is building a deployable app would actually test it a few times on other people's machines and discover if they happened to forget a jar.
Posted by: joshy on July 19, 2006 at 08:08 AM
-
rickcarson: I haven't used core data but hopefully I'll get to learn more about it at WWDC next month.
Posted by: joshy on July 19, 2006 at 08:09 AM
-
Do We Need Databases on the Desktop?
YES
Do We Need Databases in the SDK or JRE?
NO
The distribution format of Java's runtime environment & SDK is at fault here. John Reynolds had an excellent idea in his blog - modularize the JRE. It is my belief that a module based type installation should apply to the SDK as well.
There doesn't seem to be any rhyme or reason as to what APIs go in the core SDK bundle. Web service? Popular, but no. Sound? Sure! Even though 95% of Java applications don't use sound. 99% of Java applications won't use the database API to be bundled in the SDK. Why include extra clutter of APIs, extra space, extra documentation, extra confusion to the developer community, to... make 1% of the community happy? Doesn't make sense.
What is J2EE? What makes something enterprise? I can't use javamail in a non-enterprise setting? Why isn't web services part of the J2EE bundle? What if I want to use JAXB in a Swing application, why is it bundled with Web Services?
This is why modularization like John says makes sense. One big fat distribution, or maybe a net installer. As developers, we can handle understanding that type of installation. You could pick and choose what you need. No extra bloat. Heck, forget net installer, a modular installer could work offline and would require every API under the sun for installation - we could make smaller installation software distributions, custom distributions. Modularization is Rome.
Posted by: phlogistic on July 19, 2006 at 05:44 PM
-
errr correction - a modular installer could work offline and would NOT require every API under the sun for installation
Posted by: phlogistic on July 19, 2006 at 05:46 PM
-
Is coredata anything like JCR (Java Content Repository)? http://www.jcp.org/en/jsr/detail?id=170 JCR is another alternative to databases for some apps interested in opening up their data access.
Posted by: osbald on July 20, 2006 at 02:40 AM