Basics | Solr Quick Start

Posted 5ithink


        Solr is highly reliable, scalable and fault tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration and more. Solr powers the search and navigation features of many of the world's largest internet sites.

Index Techproducts Example Data


1.Start Solr as a two-node cluster

a.Download

wget https://mirrors.tuna.tsinghua.edu.cn/apache/lucene/solr/7.4.0/solr-7.4.0.tgz

b.Unpack solr-7.4.0

tar xvf solr-7.4.0.tgz

c.Start

cd solr-7.4.0

./bin/solr start -force -e cloud 

d.Open the Solr Admin UI

http://192.168.1.202:8983/solr/#/

2.Index the Techproducts Data

a.Index the example documents into the techproducts collection

bin/post -c techproducts example/exampledocs/*

b.Basic Searching

1)Search for all documents

curl "http://localhost:8983/solr/techproducts/select?indent=on&q=*:*"

*:* requests all documents

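2)Search for a Single Term

The command below still queries q=*:* and only restricts the returned fields to id through fl; replace *:* with a term such as foundation to search for a single term. The URL is quoted so that the shell does not interpret the & character.

curl "http://192.168.1.202:8983/solr/techproducts/select?fl=id&q=*:*"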

3)Field Searches

http://192.168.1.202:8983/solr/techproducts/select?q=cat:electronics

4)Phrase Search

  • search for a multi-term phrase

curl "http://localhost:8983/solr/techproducts/select?q=\"CAS+latency\""

  • Combining Searches(+/-)

curl "http://localhost:8983/solr/techproducts/select?q=%2Belectronics%20%2Bmusic"
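
The %2B in the URL above is the URL-encoded form of the + operator; a raw + in a query string would be decoded as a space. As a minimal sketch (the class and method names are illustrative, and it assumes Java 10 or later for the Charset overload of URLEncoder.encode), the encoding can be produced like this:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class QueryEncodingSketch {

    // Encodes one name/value pair for use in a URL query string.
    static String encodeParam(String name, String value) {
        return URLEncoder.encode(name, StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String select = "http://localhost:8983/solr/techproducts/select";
        // Both terms required: +electronics +music
        System.out.println(select + "?" + encodeParam("q", "+electronics +music"));
        // Phrase query: "CAS latency"
        System.out.println(select + "?" + encodeParam("q", "\"CAS latency\""));
    }
}
```

URLEncoder writes a space as + rather than %20; inside a query string both forms decode to a space, so the first printed URL ends with q=%2Belectronics+%2Bmusic.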

Modify the Schema and Index Films Data


1.Create collection

bin/solr create -force -c films -s 2 -rf 2

2.Preparing Schemaless for the Films Data

Create the "names" Field
Create a "catchall" Copy Field

To keep indexing from failing because of field-type guessing, create the fields before importing the data [see example/films/README.txt].
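
The two schema changes named above are normally sent to the collection's Schema API as JSON commands (add-field and add-copy-field). The sketch below only builds those payloads; the class name is illustrative, and the field settings (text_general, single-valued, stored, copying * into _text_) follow the Solr tutorial as far as I recall, so check example/films/README.txt for the exact values.

```java
public class SchemaPayloadSketch {

    // Builds an add-field command; assumes names need no JSON escaping.
    static String addField(String name, String type, boolean multiValued) {
        return "{\"add-field\":{\"name\":\"" + name + "\",\"type\":\"" + type
                + "\",\"multiValued\":" + multiValued + ",\"stored\":true}}";
    }

    // Builds an add-copy-field command that copies source into dest.
    static String addCopyField(String source, String dest) {
        return "{\"add-copy-field\":{\"source\":\"" + source
                + "\",\"dest\":\"" + dest + "\"}}";
    }

    public static void main(String[] args) {
        // Each payload would be POSTed to http://localhost:8983/solr/films/schema
        System.out.println(addField("name", "text_general", false));
        System.out.println(addCopyField("*", "_text_"));
    }
}
```

Each printed line can be sent with curl -X POST -H 'Content-type:application/json' --data-binary '<payload>' against the films schema endpoint.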

3.Import the data

bin/post -c films example/films/films.json

4.Faceting

        One of Solr’s most popular features is faceting. Faceting allows the search results to be arranged into subsets (or buckets, or categories), providing a count for each subset. There are several types of faceting: field values, numeric and date ranges, pivots (decision tree), and arbitrary query faceting.

http://192.168.1.202:8983/solr/films/browse?facet.field=genre

a.Field Facets

        In addition to providing search results, a Solr query can return the number of documents that contain each unique value in the whole result set.

curl "http://192.168.1.202:8983/solr/films/select?q=*:*&rows=0&facet=true&facet.field=genre_str"
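
To make the idea of a field facet concrete, here is a small local sketch of the counting it performs. It is not Solr's implementation; the class name and the sample genre values are made up for illustration, and it assumes Java 9 or later for List.of.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FieldFacetSketch {

    // Counts how many documents carry each distinct value of a multi-valued field.
    static Map<String, Integer> facetCounts(List<List<String>> fieldValuesPerDoc) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> values : fieldValuesPerDoc) {
            // A document counts once per value, even if the value is repeated.
            for (String value : new HashSet<>(values)) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> genres = List.of(
                List.of("Drama", "Comedy"),
                List.of("Drama"),
                List.of("Action"));
        System.out.println(facetCounts(genres)); // {Action=1, Comedy=1, Drama=2}
    }
}
```

The sketch orders buckets by value because it uses a TreeMap; Solr orders field facets by count unless facet.sort says otherwise.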

b.Range Facets

http://192.168.1.202:8983/solr/techproducts/select?facet.field=price&facet=on&q=*:*&rows=0

http://192.168.1.202:8983/solr/films/select?facet.field=initial_release_date&facet=on&q=*:*&rows=0

curl 'http://192.168.1.202:8983/solr/films/select?q=*:*&rows=0&facet=true&facet.range=initial_release_date&facet.range.start=NOW-20YEAR&facet.range.end=NOW&facet.range.gap=%2B1YEAR'
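
The facet.range.start, facet.range.end and facet.range.gap parameters split the value range into equal buckets and count the documents in each. The following is a local illustration of that bucketing using plain integers for years, not Solr's code; the class name and sample years are invented, and it assumes the default behaviour where each bucket includes its lower bound and excludes its upper bound.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RangeFacetSketch {

    // Returns bucket lower bound -> count for buckets [lower, lower + gap); gap must be > 0.
    static Map<Integer, Integer> rangeCounts(List<Integer> values, int start, int end, int gap) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int lower = start; lower < end; lower += gap) {
            counts.put(lower, 0); // empty buckets are reported too
        }
        for (int value : values) {
            if (value < start || value >= end) {
                continue; // outside the faceted range
            }
            int lower = start + ((value - start) / gap) * gap;
            counts.merge(lower, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Integer> years = List.of(2001, 2001, 2003, 1999, 2005);
        System.out.println(rangeCounts(years, 2000, 2004, 1));
        // {2000=0, 2001=2, 2002=0, 2003=1}
    }
}
```

In the curl command the gap is written as %2B1YEAR because +1YEAR is date math and the plus sign has to be URL-encoded.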

c.Pivot Facets

        Another faceting type is pivot facets, also known as "decision trees", which nest two or more fields to count every combination of their values. Using the films data, pivot facets can show how many of the films in the "Drama" category (the genre_str field) were directed by each director. Here's how to get at the raw data for this scenario:

curl "http://192.168.1.202:8983/solr/films/select?q=*:*&rows=0&facet=on&facet.pivot=genre_str,directed_by_str"
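
A pivot over genre_str,directed_by_str amounts to a nested count: first by genre, then by director within each genre. The sketch below shows that nesting on a few invented rows; the class name, genres and director labels are hypothetical sample data, and this is not how Solr computes pivots internally.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PivotFacetSketch {

    // Each row is {genre, director}; the result maps genre -> director -> count.
    static Map<String, Map<String, Integer>> pivot(List<String[]> rows) {
        Map<String, Map<String, Integer>> result = new TreeMap<>();
        for (String[] row : rows) {
            result.computeIfAbsent(row[0], genre -> new TreeMap<>())
                  .merge(row[1], 1, Integer::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
                new String[] {"Drama", "Director A"},
                new String[] {"Drama", "Director A"},
                new String[] {"Drama", "Director B"},
                new String[] {"Comedy", "Director B"});
        System.out.println(pivot(rows));
        // {Comedy={Director B=1}, Drama={Director A=2, Director B=1}}
    }
}
```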

DIH[data import handler]


1.Create collection

bin/solr create -force -c ring -s 2 -rf 2

2.DataImportHandler

a.Download mysql-connector-java-5.0.4.jar and place it in solr-7.4.0/server/lib/

Copy (from solr-7.4.0/dist/):

solr-dataimporthandler-7.4.0.jar

solr-dataimporthandler-extras-7.4.0.jar

into:

solr-7.4.0/server/lib

b.vi solr-7.4.0/server/solr/configsets/_default/conf/solrconfig.xml

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">

    <lst name="defaults">

        <str name="config">db-data-config.xml</str>

    </lst>

</requestHandler>

c.vi solr-7.4.0/server/solr/configsets/_default/conf/managed-schema

d.vi solr-7.4.0/server/solr/configsets/_default/conf/db-data-config.xml

e.index

f.query


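[Parameter reference]

  • Request-Handler (qt): the request handler that processes the query

  • q: the main query

  • fq: filter query

  • sort: sort field and direction

  • start, rows: offset of the first result and number of records to return

  • fl: the fields to return for each record

  • wt: response output format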
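
As a rough illustration of how these parameters fit together in one request, the sketch below assembles a /select URL from a parameter map. The class and method names are illustrative, the parameter values are examples rather than output from the commands above, and it assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class SelectUrlSketch {

    // Joins URL-encoded parameters onto <baseUrl>/<collection>/select.
    static String selectUrl(String baseUrl, String collection, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            query.add(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
        }
        return baseUrl + "/" + collection + "/select?" + query;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("q", "name:Amazon");        // main query
        params.put("fq", "cat:electronics");   // filter query
        params.put("sort", "id asc");          // sort field and direction
        params.put("start", "0");              // offset of the first result
        params.put("rows", "10");              // number of records
        params.put("fl", "id,name");           // returned fields
        params.put("wt", "json");              // response format
        System.out.println(selectUrl("http://localhost:8983/solr", "techproducts", params));
    }
}
```

A LinkedHashMap is used only so that the parameters appear in the URL in the order they were added; Solr itself does not depend on parameter order.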

3.Running Solr

a).delete collection

bin/solr delete -c techproducts

b).create collection:

bin/solr create -c <yourCollection> -s 2 -rf 2

c).stop solr nodes

bin/solr stop -all

d).restart solr

./bin/solr start -force -c -p 8983 -s example/cloud/node1/solr

./bin/solr start -force -c -p 7574 -s example/cloud/node2/solr -z localhost:9983

Using SolrJ


1.Add the dependency jars

<dependency>
  <groupId>junit</groupId>
  <artifactId>junit</artifactId>
  <version>4.12</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-solrj</artifactId>
  <version>7.4.0</version>
</dependency>

2.Create SolrClient

3.Querying in SolrJ

4.Indexing in SolrJ

5.Updating in SolrJ

6.Deleting in SolrJ

7.TestSolrHandle.java

package com.think.hsd.portal.search;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.beans.Field;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.common.params.MapSolrParams;
import org.junit.Test;

import java.io.IOException;
import java.util.*;

/**
 * Created by think on 2018/9/5.
 */
public class TestSolrHandle {

    /** Bean mapped to Solr documents through SolrJ's @Field annotations. */
    public static class TechProduct {
        @Field
        public String id;

        @Field
        public List<String> name;

        public TechProduct(String id, List<String> name) {
            this.id = id;
            this.name = name;
        }

        public TechProduct() {
        }
    }

    private final String solrUrl = "http://192.168.1.202:8983/solr";
    private final String collectionName = "techproducts";
    private final SolrClient httpSolrClient = new HttpSolrClient.Builder(solrUrl)
            .withConnectionTimeout(10000)
            .withSocketTimeout(60000)
            .build();

    /**
     * Adds a document to the index with add().
     *
     * @throws IOException
     * @throws SolrServerException
     */
    @Test
    public void createIndex() throws IOException, SolrServerException {
        final SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", UUID.randomUUID().toString());
        doc.addField("name", "Amazon Kindle Paperwhite");

        UpdateResponse updateResponse = httpSolrClient.add(collectionName, doc);
        // Documents become searchable only after a commit.
        httpSolrClient.commit(collectionName);
    }

    /**
     * Java Object Binding: adds a bean to the index.
     *
     * @throws IOException
     * @throws SolrServerException
     */
    @Test
    public void addIndex() throws IOException, SolrServerException {
        List<String> name = new ArrayList<String>();
        name.add("book Amazon Kindle Paperwhite");

        TechProduct techProduct = new TechProduct("kindle-id-4", name);
        UpdateResponse response = httpSolrClient.addBean(collectionName, techProduct);
        httpSolrClient.commit(collectionName);
    }

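    /**
     * Update works like saveOrUpdate: adding a document whose id already
     * exists overwrites the stored document; a new id creates a new one.
     *
     * @throws IOException
     * @throws SolrServerException
     */
    @Test
    public void updateIndex() throws IOException, SolrServerException {
        SolrInputDocument doc = new SolrInputDocument();
        // Reuse the id written by addIndex() so this call replaces that document;
        // the original used a random UUID, which always creates a new document.
        doc.addField("id", "kindle-id-4");
        doc.addField("name", "Amazon Kindle Paperwhite2");

        UpdateResponse updateResponse = httpSolrClient.add(collectionName, doc);
        httpSolrClient.commit(collectionName);
    }
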
    /**
     * Java Object Binding: reads query results back as beans.
     *
     * @throws IOException
     * @throws SolrServerException
     */
    @Test
    public void queryIndex1() throws IOException, SolrServerException {
        SolrQuery query = new SolrQuery("*:*");
        query.addField("id");
        query.addField("name");
        query.setSort("id", SolrQuery.ORDER.asc);
        query.setRows(10);

        QueryResponse queryResponse = httpSolrClient.query(collectionName, query);
        List<TechProduct> products = queryResponse.getBeans(TechProduct.class);
        for (TechProduct techProduct : products) {
            System.out.println("id: " + techProduct.id + "; name: " + techProduct.name);
        }
    }

    @Test
    public void queryIndex2() throws IOException, SolrServerException {
        SolrQuery query = new SolrQuery("*:*");
        query.addField("id");
        query.addField("name");
        query.setSort("id", SolrQuery.ORDER.asc);
        query.setRows(20);

        QueryResponse queryResponse = httpSolrClient.query(collectionName, query);
        SolrDocumentList documents = queryResponse.getResults();
        System.out.println("Found " + documents.getNumFound() + " documents");
        for (SolrDocument document : documents) {
            final String id = (String) document.getFirstValue("id");
            final String name = (String) document.getFirstValue("name");
            System.out.println("id: " + id + "; name: " + name);
        }
    }

    /**
     * Highlighting: wraps matched terms in the given pre/post markup.
     *
     * @throws Exception
     */
    @Test
    public void queryByHighlight() throws Exception {
        SolrQuery solrQuery = new SolrQuery();
        // Only the last setQuery() call takes effect; switch the comment to try the other query.
        // solrQuery.setQuery("name:iPod");
        solrQuery.setQuery("name:Amazon");
        solrQuery.setStart(0);
        solrQuery.setRows(10);
        solrQuery.setHighlight(true);
        solrQuery.addHighlightField("name");
        solrQuery.setHighlightSimplePre("<span style='color:red'>");
        solrQuery.setHighlightSimplePost("</span>");

        QueryResponse query = httpSolrClient.query(collectionName, solrQuery);
        SolrDocumentList results = query.getResults();
        // Highlighting is keyed by document id, then by field name.
        Map<String, Map<String, List<String>>> highlighting = query.getHighlighting();
        for (SolrDocument solrDocument : results) {
            String idStr = (String) solrDocument.get("id");
            String nameStr = (String) solrDocument.getFirstValue("name");
            System.out.println("id:" + idStr + "=======" + "name:" + nameStr);

            Map<String, List<String>> map = highlighting.get(idStr);
            if (map != null) {
                List<String> list = map.get("name");
                if (list != null) {
                    String resultString = list.get(0);
                    System.out.println("highlight:-----" + resultString);
                }
            }
        }
    }

    @Test
    public void queryIndex3() throws IOException, SolrServerException {
        Map<String, String> queryParamMap = new HashMap<String, String>();
        // A map holds one value per key, so only one "q" can be active at a time.
        // queryParamMap.put("q", "*:*");
        // queryParamMap.put("q", "id:kindle-id-4");
        queryParamMap.put("q", "name:Amazon");
        queryParamMap.put("fl", "id, name");
        queryParamMap.put("sort", "id asc");
        queryParamMap.put("rows", "20");
        MapSolrParams queryParams = new MapSolrParams(queryParamMap);

        QueryResponse response = httpSolrClient.query(collectionName, queryParams);
        SolrDocumentList documents = response.getResults();

        System.out.println("==Found [" + documents.getNumFound() + "] documents==");
        for (SolrDocument document : documents) {
            final String id = (String) document.getFirstValue("id");
            final String name = (String) document.getFirstValue("name");
            System.out.println("id: " + id + " name: " + name);
        }
    }

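    /**
     * Deletes a document from the index.
     *
     * @throws SolrServerException
     * @throws IOException
     */
    @Test
    public void deleteIndex() throws Exception {
        String id = "kindle-id-4";
        // The third argument is commitWithinMs: the delete is committed within 1000 ms.
        httpSolrClient.deleteByQuery(collectionName, "id:" + id, 1000);
    }
}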

References


  • http://lucene.apache.org/solr/guide/7_4/about-this-guide.html

  • http://lucene.apache.org/solr/

  • http://mirror.bit.edu.cn/apache/lucene/solr/ref-guide/apache-solr-ref-guide-7.4.pdf

  • https://mirrors.tuna.tsinghua.edu.cn/apache/lucene/solr/7.4.0/

