Article:
Introduction to Nutch, Part 1: Crawling
Subject: help please
Date: 2007-10-08 09:02:10
From: kayosoufiane

During the crawling process I get the error below in the log.
Have I forgotten something? I followed the instructions on the Apache Nutch site.
Here is the log:
2007-10-08 15:37:21,828 INFO indexer.Indexer - Optimizing index.
2007-10-08 15:37:22,828 INFO indexer.Indexer - Indexer: done
2007-10-08 15:37:22,828 INFO indexer.DeleteDuplicates - Dedup: starting
2007-10-08 15:37:22,843 INFO indexer.DeleteDuplicates - Dedup: adding indexes in: crawl/indexes
2007-10-08 15:37:23,406 WARN mapred.LocalJobRunner - job_e9fzvu
java.lang.ArrayIndexOutOfBoundsException: -1
    at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java:113)
    at org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next(DeleteDuplicates.java:176)
    at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:126)