Solr 5.5.0 integration with Nutch 1.13 fails: 'Connection pool shut down'


When I tried to integrate Solr with Nutch, I ran into a problem.

The error is:

Active IndexWriters :
SOLRIndexWriter
solr.server.url : URL of the SOLR instance
solr.zookeeper.hosts : URL of the Zookeeper quorum
solr.commit.size : buffer size when sending to SOLR (default 1000)
solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
solr.auth : use authentication (default false)
solr.auth.username : username for authentication
solr.auth.password : password for authentication


Indexer: number of documents indexed, deleted, or skipped:
Indexer: finished at 2017-11-30 01:34:49, elapsed: 00:00:01
Cleaning up index if possible
apache-nutch-1.13/bin/nutch clean -Dsolr.server.url=http://localhost:8983/solr/nutch crawling_dir/crawldb
SolrIndexer: deleting 1/1 documents
ERROR CleaningJob: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
at org.apache.nutch.indexer.CleaningJob.delete(CleaningJob.java:174)
at org.apache.nutch.indexer.CleaningJob.run(CleaningJob.java:197)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.indexer.CleaningJob.main(CleaningJob.java:208)

Error running:
apache-nutch-1.13/bin/nutch clean -Dsolr.server.url=http://localhost:8983/solr/nutch crawling_dir/crawldb
Failed with exit value 255.

In the log file:

2017-11-30 01:34:50,851 WARN  output.FileOutputCommitter - Output Path is null in cleanupJob()
2017-11-30 01:34:50,851 WARN  mapred.LocalJobRunner - job_local531807742_0001
java.lang.Exception: java.lang.IllegalStateException: Connection pool shut down
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: java.lang.IllegalStateException: Connection pool shut down
    at org.apache.http.util.Asserts.check(Asserts.java:34)
    at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:169)
    at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:202)
    at org.apache.http.impl.conn.PoolingClientConnectionManager.requestConnection(PoolingClientConnectionManager.java:184)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:415)
    at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:481)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:240)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:229)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
    at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:482)
    at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:463)
    at org.apache.nutch.indexwriter.solr.SolrIndexWriter.commit(SolrIndexWriter.java:191)
    at org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:179)
    at org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:117)
    at org.apache.nutch.indexer.CleaningJob$DeleterReducer.close(CleaningJob.java:122)
    at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2017-11-30 01:34:51,458 ERROR indexer.CleaningJob - CleaningJob: java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
    at org.apache.nutch.indexer.CleaningJob.delete(CleaningJob.java:174)
    at org.apache.nutch.indexer.CleaningJob.run(CleaningJob.java:197)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.nutch.indexer.CleaningJob.main(CleaningJob.java:208)

Do you have any ideas?

Answer

I had the same problem, and yours is likely caused by the same bug: https://issues.apache.org/jira/browse/NUTCH-2269

Try applying the patch; the error should go away.
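The stack trace shows commit() being invoked from SolrIndexWriter.close() (via CleaningJob's DeleterReducer and IOUtils.cleanup) after the underlying HttpClient connection pool has already been shut down, which is exactly what an idempotent close() would prevent. The sketch below is a self-contained illustration of that double-close guard pattern, assuming this is the failure mode; all class names here are hypothetical stand-ins, not the actual Nutch or SolrJ source:

```java
import java.io.Closeable;
import java.io.IOException;

// Stand-in for an HTTP-backed Solr client: once its pool is shut
// down, any further request fails fast, mirroring the error in the log.
class FakeSolrClient implements Closeable {
    private boolean poolShutDown = false;

    void commit() {
        if (poolShutDown) {
            throw new IllegalStateException("Connection pool shut down");
        }
    }

    @Override
    public void close() {
        poolShutDown = true;
    }
}

// Stand-in for an index writer whose close() both commits and
// releases the client, guarded so a second close() is a no-op.
public class GuardedIndexWriter implements Closeable {
    private final FakeSolrClient client = new FakeSolrClient();
    private boolean closed = false;

    @Override
    public void close() throws IOException {
        if (closed) {
            return; // a repeated close() must not touch the dead pool
        }
        closed = true;
        client.commit(); // commit while the pool is still open...
        client.close();  // ...then shut it down exactly once
    }

    public static void main(String[] args) throws IOException {
        GuardedIndexWriter writer = new GuardedIndexWriter();
        writer.close();
        writer.close(); // without the guard, this path would hit the closed pool
        System.out.println("closed cleanly");
    }
}
```

Without the `closed` flag, the second close() would call commit() on a client whose pool is already shut down and raise the same IllegalStateException seen in the log.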

Another answer

From what I found, it appears to be a bug. Here is a blog post that explains it well: https://reformatcode.com/code/apache-configuration/apache-nutch-112-with-apache-solr-621-give-an-error
