Tom White's Blog

Tom White is a committer on the Apache Hadoop project and a member of the Lucene Project Management Committee. He works as an independent consultant specializing in Hadoop and distributed computing. He has been writing Java full time since 1996, and writing about Java since 2003 for O'Reilly, java.net, and IBM's developerWorks. Outside programming, Tom enjoys making his daughters laugh and watching 1930s Hollywood films.



Recent Entries

"Disks have become tapes"

Consistent Hashing

Hadoop + EC2 + S3

Articles

Introduction to Nutch, Part 2: Searching
In the second part of this look at the Nutch web indexing and search engine, Tom White looks at how to perform searches on the index generated in part one's crawl, and shows how to integrate Nutch's search capabilities with your applications through direct Java calls to its API or via the OpenSearch API. Feb. 16, 2006

Introduction to Nutch, Part 1: Crawling
Do you need your own search engine, when the world already has Google? Quite possibly so: you may belong to an organization with enough of its own content that you want to manage and run your own search engine--and know how it works. Nutch is an open source search engine written in Java. In this article, Tom White shows how it crawls pages to build its index. Jan. 10, 2006

Did You Mean: Lucene?
All modern search engines attempt to detect and correct spelling errors in users' search queries. This article shows you one way of adding a "did you mean" suggestion facility to your own search applications using the Lucene Spell Checker. Aug. 9, 2005
