
Spell Checking algorithms

Posted by daniel on September 23, 2004 at 12:19 PM EDT

What word did you mean to use?

Remember when spell checkers first started appearing in consumer applications? We all had a lot of fun with the wacky suggestions that we would get. You would leave out the space in "a business" and the suggestion would come up: "Did you mean 'abysmal'?"

How would you implement a spell checker? How would you create lists of words that the author may have meant to use? In Can't beat Jazzy, Tom White discusses two types of algorithms: phonetic-based and string-similarity-based. He describes how the Aspell algorithm combines the two techniques into one that is used in Jazzy.

These algorithms are based around western alphabets and, in the cases given in the article, on English pronunciations. Nevertheless, it's really interesting to see how you might tag words as candidates for sounding alike. The second method, string similarity, is based on how many changes you have to make to turn one word into another. This provides another measure of distance. Combining the two techniques and tuning the result allows you to locate more likely replacement words.
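To make the two ideas concrete, here is a minimal sketch in Java. It is not the Aspell or Jazzy implementation: the class name SpellSketch, the methods phoneticKey, editDistance, and suggest, and the tiny word list are all hypothetical, chosen only for illustration. The phonetic key is a simplified Soundex-style code (an assumption on my part; the article's phonetic algorithm may differ), and the string similarity measure is the classic Levenshtein edit distance. The suggest method shows just one plausible way to combine them: accept a candidate if it shares the phonetic key or is within a small edit distance, then rank by edit distance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not the Aspell or Jazzy code.
public class SpellSketch {

    // Simplified Soundex-style key: keep the first letter, map later
    // consonants to digit classes, skip vowels and H/W/Y, collapse
    // adjacent repeats of the same digit, and pad or cut to 4 characters.
    static String phoneticKey(String word) {
        String codes = "01230120022455012623010202"; // classes for A..Z, 0 = skipped
        StringBuilder key = new StringBuilder();
        char last = '0';
        for (char c : word.toUpperCase(Locale.ROOT).toCharArray()) {
            if (c < 'A' || c > 'Z') continue;        // ignore non-letters
            char code = codes.charAt(c - 'A');
            if (key.length() == 0) {
                key.append(c);                       // first letter kept as-is
            } else if (key.length() < 4 && code != '0' && code != last) {
                key.append(code);
            }
            last = code;
        }
        while (key.length() > 0 && key.length() < 4) key.append('0');
        return key.toString();
    }

    // Levenshtein distance: the minimum number of single-character
    // insertions, deletions, and substitutions that turn a into b.
    // The comparison is case-sensitive.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp; // reuse the two rows
        }
        return prev[b.length()];
    }

    // One possible combination (an assumption, not the article's tuning):
    // keep candidates that sound alike OR are within maxDistance edits,
    // then order them by edit distance, closest first.
    static List<String> suggest(String word, List<String> dictionary, int maxDistance) {
        String key = phoneticKey(word);
        List<String> out = new ArrayList<>();
        for (String candidate : dictionary) {
            if (phoneticKey(candidate).equals(key)
                    || editDistance(word, candidate) <= maxDistance) {
                out.add(candidate);
            }
        }
        out.sort(Comparator.comparingInt((String c) -> editDistance(word, c)));
        return out;
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList("business", "abysmal", "banana");
        System.out.println(suggest("abusiness", dictionary, 2)); // prints [business, abysmal]
    }
}
```

Under this simplified key, "abusiness" and "abysmal" happen to collapse to the same code (A125), which hints at how a purely phonetic match could produce the suggestion above, while the edit distance ranks "business" first because it is only one deletion away. A real checker would tune how the two scores are weighted rather than use a fixed threshold like this.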

Also in Java Today, Leon Messerschmidt's JavaWorld article on Coefficient describes "an extensible Java platform for online collaboration software [that] can be run either in an EJB (Enterprise JavaBeans) server or as a standalone servlet." After describing how to get started with Coefficient, the author shows you how to add a custom wiki module. More useful modules will need to be tied to a database. Coefficient handles the database communication through Hibernate. Coefficient is targeted at those who want to roll their own collaboration tools within a framework.


Jack Shirazi writes that he has been a Java developer "for 9 years, and written dozens of personal Java apps. And enjoy it all the time." In today's Weblogs, Jack reports on Java and coolness and recounts a discussion with a guy who says that he only writes in Java "to pay the bills. Never under any circumstances have I written a personal application in Java. I feel I fall into the category of one who thinks Java is woefully uncool, and knows intimately why."

Joshua Marinacci continues his exploration of tiny applications in New MiniApp:Storm Drain. He reports "While playing around some more with this miniapp idea, I came across geographer Tyler Mitchell's weblog post about hurricane tracking using Web Map Service urls. I thought this would make an interesting MiniApp and give me a good opportunity to play with a few webservices. Starting from his base (and with some greatly appreciated clarification emails from Tyler), I've created StormDrain, a simple program that loads WMS data and displays it graphically."


In Projects and Communities, the JavaPedia page Applications has recently been refactored. Add your production-quality Java application. Sub-sections include desktop, internet, multimedia, development, and GUI apps.

The Java Communications community project OpenIM is developing "a fast, simple, and highly efficient instant messager server" using the Jabber protocol. It works with multiple Jabber clients, including GAIM.


The discussion of Hackers and Painters returns in today's Forums. John Mitchell asks "How much does taste really matter in the software industry? Do you evaluate your code in terms of taste (or "smell" or "style")? [..] Graham ended with: "The recipe for great work is: very exacting taste, plus the ability to gratify it." Is there any great work being done in software?"

Yishai adds to the discussion on forking and releasing the TCK, saying "What you can do is put a 'You cannot use the Java brand or name in your marketing without express permission from Sun' and then tie the permission to passing the TCK and whatever other non-free conditions Sun would like. So you could fork it, but you couldn't call it Java, or claim that it runs like Java which would immediately give it a marketing problem."


In today's java.net News Headlines:

Registered users can submit news items for the java.net News Page using our news submission form. All submissions go through an editorial review before being posted to the site. You can also subscribe to the java.net News RSS feed.


Current and upcoming Java Events:

Registered users can submit event listings for the java.net Events Page using our events submission form. All submissions go through an editorial review before being posted to the site.


Archives and Subscriptions: This blog is delivered weekdays as the Java Today RSS feed. Also, once this page is no longer featured as the front page of java.net it will be archived along with other past issues in the java.net Archive.