Using Elasticsearch with RestHighLevelClient
Posted by cakeng
Overview
About this document
- Everything below is based on Elasticsearch 7.x.
- All examples are built on the index_learn_test index.
- All code examples are written in Java with RestHighLevelClient.
- The index DSL (mapping) is:
{
  "index_learn_test" : {
    "mappings" : {
      "properties" : {
        "age" : {
          "type" : "keyword",
          "fields" : {
            "number" : {
              "type" : "integer"
            }
          }
        },
        "departmentId" : {
          "type" : "keyword"
        },
        "departmentIdLeve1" : {
          "type" : "keyword"
        },
        "departmentIdLeve2" : {
          "type" : "keyword"
        },
        "departmentIdLeve3" : {
          "type" : "keyword"
        },
        "departmentIdLeve4" : {
          "type" : "keyword"
        },
        "departmentIdLeve5" : {
          "type" : "keyword"
        },
        "departmentIdLeve6" : {
          "type" : "keyword"
        },
        "departmentIdLeve7" : {
          "type" : "keyword"
        },
        "departmentIds" : {
          "type" : "keyword"
        },
        "departmentJoin" : {
          "type" : "join",
          "eager_global_ordinals" : true,
          "relations" : {
            "department" : "user"
          }
        },
        "id" : {
          "type" : "long"
        },
        "name" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword"
            }
          }
        },
        "resume" : {
          "type" : "wildcard"
        },
        "sex" : {
          "type" : "keyword"
        }
      }
    }
  }
}
Field types
Indices
List all indices and view their disk usage
GET /_cat/indices?v
View an index's settings (including defaults)
GET /index_learn_test/_settings?include_defaults=true
Create an index
PUT /index_learn_test
{
  "mappings": {
    "properties": {
      "id": {
        "type": "long"
      },
      "name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      },
      "sex": {
        "type": "keyword"
      },
      "age": {
        "type": "keyword",
        "fields": {
          "number": {
            "type": "integer"
          }
        }
      }
    }
  }
}
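The fields entry here gives a field additional types (multi-fields). To use one, reference it by its sub-field name, for example name.keyword for the name field.
View the index structure
GET /index_learn_test/_mapping
Delete an index
DELETE /index_learn_test
Add a field to an index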
PUT /index_learn_test/_mapping
{
  "properties": {
    "departmentIds": {
      "type": "keyword"
    }
  }
}
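Copy index data
POST /_reindex
{
  "dest": {
    "index": "index_learn_test2"
  },
  "source": {
    "query": {
      "bool": {
        "must": [
          {
            "term": {
              "name": {
                "value": "正"
              }
            }
          }
        ]
      }
    },
    "index": "index_learn_test"
  },
  "max_docs": 1
}
- dest: the target index
- source: the data source
- query: filter selecting which documents to copy
- max_docs: maximum number of documents to copy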
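The same copy can also be built from Java. The following is a minimal sketch, assuming the ReindexRequest class shipped with the Elasticsearch 7.x high-level client and its setMaxDocs setter (present in later 7.x releases). The class name ReindexSketch and the method buildReindexRequest are illustrative names, not part of the original article.

```java
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.reindex.ReindexRequest;

class ReindexSketch {
    // Builds the Java counterpart of the reindex body shown above.
    static ReindexRequest buildReindexRequest() {
        ReindexRequest request = new ReindexRequest();
        request.setSourceIndices("index_learn_test");          // source.index
        request.setSourceQuery(QueryBuilders.boolQuery()        // source.query
                .must(QueryBuilders.termQuery("name", "正")));
        request.setDestIndex("index_learn_test2");              // dest.index
        request.setMaxDocs(1);                                  // max_docs
        return request;
    }

    public static void main(String[] args) {
        ReindexRequest request = buildReindexRequest();
        System.out.println(request.getDestination().index());
        // With a connected client (assumed to exist), the request would be sent as:
        // restHighLevelClient.reindex(request, RequestOptions.DEFAULT);
    }
}
```

Building the request does not contact the cluster; only the commented-out reindex call does.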
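Insert, update, delete
Making updates visible immediately
In ES, writes do not become searchable immediately. They become visible at the next refresh, which by default runs every 1s. If a change must be visible right after the update, see the Java code below.
To check the delay, look at the refresh_interval setting returned by GET /index_learn_test/_settings?include_defaults=true.
Java:
UpdateRequest updateRequest = new UpdateRequest(getIndex(), userData.getId().toString())
        // Changes only become visible after the next refresh (see refresh_interval), so force an immediate refresh here
        .setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE);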
Insert
ES:
PUT /index_learn_test/_doc/$id
PUT /index_learn_test/_doc/12
{
  "id": 12,
  "age": 42,
  "sex": "女",
  "name": "厍振",
  "resume": "我是厍振,大家好!",
  "departmentId": "A"
}
Java code:
public void insert() throws IOException {
    List<DepartmentData> list = DepartmentUtil.getDepartment(DOC_PARENT_NAME);
    String[] sex = new String[]{"男", "女"};
    UserData userData = new UserData();
    userData.setId(12L);
    userData.setAge(RandomUtil.randomInt(19, 60));
    userData.setSex(sex[RandomUtil.randomInt(2)]);
    userData.setName(RandNameUtil.randName());
    userData.setResume("我是" + userData.getName() + ",大家好!");
    userData.setDepartmentId(list.get(RandomUtil.randomInt(list.size())).getDepartmentId());
    // Index the document under its own id, sending the body as JSON
    IndexRequest indexRequest = new IndexRequest(getIndex())
            .id(userData.getId().toString()).source(JsonUtils.toJsonString(userData), XContentType.JSON);
    restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
}
Update
You can use the insert request above to replace the whole document, or use the following to update only part of it.
ES:
POST /index_learn_test/_update/12?retry_on_conflict=10
{
  "doc": {
    "age": 12
  }
}
retry_on_conflict is the number of retries allowed. Under concurrent updates, different threads may have read different versions of the document, which makes the update fail with a version conflict.
Java code:
public void update() throws IOException {
    UserData userData = new UserData();
    userData.setId(12L);
    userData.setAge(RandomUtil.randomInt(19, 60));
    UpdateRequest updateRequest = new UpdateRequest(getIndex(), userData.getId().toString())
            // Changes only become visible after the next refresh (see refresh_interval), so force an immediate refresh here
            .setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE)
            .retryOnConflict(10)
            .doc(JsonUtils.toJsonString(userData), XContentType.JSON);
    restHighLevelClient.update(updateRequest, RequestOptions.DEFAULT);
}
Delete
DELETE /index_learn_test/_doc/$id
ES:
DELETE /index_learn_test/_doc/12
Java code:
public void delete() throws IOException {
    UserData userData = new UserData();
    userData.setId(12L);
    DeleteRequest request = new DeleteRequest(getIndex(), userData.getId().toString())
            .setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE);
    restHighLevelClient.delete(request, RequestOptions.DEFAULT);
}
Batch processing (bulk)
ES:
PUT /index_learn_test/_bulk
{"delete":{"_index":"index_learn_test","_id":"12"}}
{"update":{"_index":"index_learn_test","_id":"20"}}
{"doc":{"age":23}}
{"create":{"_index":"index_learn_test","_id":"30"}}
{"age":34,"id":30}
{"index":{"_index":"index_learn_test","_id":"40"}}
{"age":34,"id":30}
Java code:
public void bulk() throws IOException {
    BulkRequest bulkRequest = new BulkRequest();
    DeleteRequest deleteRequest = new DeleteRequest(getIndex()).id("12");
    bulkRequest.add(deleteRequest);
    UpdateRequest updateRequest = new UpdateRequest(getIndex(), "20").doc(Collections.singletonMap("age", 30));
    bulkRequest.add(updateRequest);
    // ... the remaining operations are omitted
    restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
}
Query
Java code for query conditions
Almost every query below can reuse the following code.
Java code:
SearchRequest searchRequest = new SearchRequest(getIndex());
// Almost every query condition can be built with the QueryBuilders class
QueryBuilder wildcardQueryBuilder = QueryBuilders.wildcardQuery("resume", "*我是王*,大家*");
searchRequest.source(SearchSourceBuilder
        .searchSource()
        .query(wildcardQueryBuilder));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("Query result: {}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
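Scoring
The score is the relevance of a document to the query and is used mainly for sorting.
Elapsed time
The took field in the search response, in ms.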
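Returning the total hit count (track_total_hits)
Do not enable this unless necessary; the reasons are explained below.
In a test with a single shard holding 5 GB of data, the impact on ordinary queries was small, about 20ms. The impact on parent-child document queries was much larger, about 800ms.
- When the value is true, the exact total is returned, which requires visiting every matching document. This is the least efficient option.
- When the value is an integer >= 0, the total is counted only up to that value; if more documents match, the configured value is returned as the total. The maximum is 2147483647. Only as many documents as the configured value need to be counted, so the cost depends on the value chosen.
- When the value is -1, no total is returned. This is the most efficient option.
ES:
GET /index_learn_test/_search
{
  "track_total_hits": true
}
Java code:
SearchSourceBuilder searchSourceBuilder = SearchSourceBuilder.searchSource();
searchSourceBuilder.trackTotalHits(true);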
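When a numeric limit is used instead of true, the response reports whether the total is exact or only a lower bound. The sketch below assumes the trackTotalHitsUpTo setter on SearchSourceBuilder and the Lucene TotalHits class that the 7.x client returns from getHits().getTotalHits(). The names TotalHitsSketch, cappedCount and describe are illustrative.

```java
import org.apache.lucene.search.TotalHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;

class TotalHitsSketch {
    // Ask ES to count matches only up to `limit` instead of counting all of them.
    static SearchSourceBuilder cappedCount(int limit) {
        return SearchSourceBuilder.searchSource().trackTotalHitsUpTo(limit);
    }

    // Turn the total from a response into text; it is null when tracking is disabled.
    static String describe(TotalHits total) {
        if (total == null) {
            return "total not tracked";
        }
        if (total.relation == TotalHits.Relation.EQUAL_TO) {
            return "exactly " + total.value;
        }
        return "at least " + total.value;
    }

    public static void main(String[] args) {
        // Simulated response total for a query that matched more than the limit of 1000.
        TotalHits capped = new TotalHits(1000, TotalHits.Relation.GREATER_THAN_OR_EQUAL_TO);
        System.out.println(describe(capped)); // prints "at least 1000"
    }
}
```

With a live client, the argument to describe would come from searchResponse.getHits().getTotalHits().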
Returning only some fields
includes returns only the listed fields; excludes returns every field except the listed ones. When both are used together they combine as AND: a field is returned only if it matches includes and does not match excludes.
ES:
GET /index_learn_test/_search
{
  "_source": {
    "includes": [
      "name",
      "age"
    ],
    "excludes": [
      "name"
    ]
  }
}
Java code:
@Test
public void _source() throws IOException {
    SearchRequest searchRequest = new SearchRequest(getIndex());
    // First array: includes; second array: excludes
    searchRequest.source(SearchSourceBuilder
            .searchSource().fetchSource(new String[]{"name", "age"}, new String[]{"name"}));
    SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
    log.info("Query result: {}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
}
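To make the AND relationship concrete, here is a small plain-Java model of the filtering rule. It is only an illustration of the logic, not Elasticsearch code: it handles exact field names only (the real feature also accepts wildcard patterns), and the names SourceFilterSketch and returnedFields are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class SourceFilterSketch {
    // A field is returned only if it is included (or no includes are given) AND not excluded.
    static List<String> returnedFields(List<String> docFields, Set<String> includes, Set<String> excludes) {
        List<String> result = new ArrayList<>();
        for (String field : docFields) {
            boolean included = includes.isEmpty() || includes.contains(field);
            boolean excluded = excludes.contains(field);
            if (included && !excluded) {
                result.add(field);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> fields = List.of("id", "name", "age", "sex");
        // Same includes/excludes as the request above: name is included but also excluded.
        System.out.println(returnedFields(fields, Set.of("name", "age"), Set.of("name"))); // prints [age]
    }
}
```

Under this rule the example request returns only age, because name is removed by excludes even though includes lists it.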
Sorting
Using SQL as the reference: sort by age descending first, then by sex ascending.
SQL:
select * from index_learn_test order by age desc, sex asc
ES:
GET /index_learn_test/_search
{
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    },
    {
      "sex": {
        "order": "asc"
      }
    }
  ]
}
Java code:
SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
        .searchSource()
        .sort("age", SortOrder.DESC)
        .sort("sex", SortOrder.ASC));
Exact search
Fetch a single document by id
GET /index_learn_test/_doc/$id
GET /index_learn_test/_doc/20
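The article gives no Java counterpart for this lookup. A minimal sketch, assuming the GetRequest and GetResponse classes of the 7.x high-level client and an already connected RestHighLevelClient; the names GetByIdSketch, buildGetRequest and fetchSource are illustrative.

```java
import java.io.IOException;

import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;

class GetByIdSketch {
    // Equivalent of GET /index_learn_test/_doc/{id}
    static GetRequest buildGetRequest(String id) {
        return new GetRequest("index_learn_test", id);
    }

    // Returns the document source as JSON, or null when the id does not exist.
    static String fetchSource(RestHighLevelClient client, String id) throws IOException {
        GetResponse response = client.get(buildGetRequest(id), RequestOptions.DEFAULT);
        return response.isExists() ? response.getSourceAsString() : null;
    }
}
```

A caller would invoke fetchSource(restHighLevelClient, "20") to mirror the request above.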
Fetch multiple documents by id
ES:
GET /index_learn_test/_search
{
  "query": {
    "ids": {
      "values": [
        "35",
        "333"
      ]
    }
  }
}
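In Java, the ids query can plausibly be built the same way as the other conditions. This sketch assumes QueryBuilders.idsQuery() and IdsQueryBuilder.addIds from the 7.x client; IdsQuerySketch and idsSearch are illustrative names.

```java
import org.elasticsearch.index.query.IdsQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

class IdsQuerySketch {
    // Builds the search body for an ids query over the given document ids.
    static SearchSourceBuilder idsSearch(String... ids) {
        IdsQueryBuilder query = QueryBuilders.idsQuery().addIds(ids);
        return SearchSourceBuilder.searchSource().query(query);
    }

    public static void main(String[] args) {
        // Printing the builder shows the generated query DSL as JSON.
        System.out.println(idsSearch("35", "333"));
    }
}
```

The resulting SearchSourceBuilder is passed to searchRequest.source(...) as in the general query template.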
Single-value exact match: term (scored)
Similar to = in MySQL.
ES:
GET /index_learn_test/_search
{
  "query": {
    "term": {
      "name.keyword": {
        "value": "王正年2"
      }
    }
  }
}
Multi-value exact match: terms (scored)
Similar to in in MySQL.
ES:
GET /index_learn_test/_search
{
  "query": {
    "terms": {
      "age": [
        26,
        27
      ]
    }
  }
}
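The term and terms queries above map onto QueryBuilders in a direct way. A minimal sketch, assuming QueryBuilders.termQuery(String, String) and QueryBuilders.termsQuery(String, int...) from the 7.x client; TermQuerySketch, nameEquals and ageIn are illustrative names.

```java
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.index.query.TermsQueryBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;

class TermQuerySketch {
    // Like "name = ?" in SQL: exact match on the keyword sub-field.
    static TermQueryBuilder nameEquals(String name) {
        return QueryBuilders.termQuery("name.keyword", name);
    }

    // Like "age in (?, ?)" in SQL: matches any of the given values.
    static TermsQueryBuilder ageIn(int... ages) {
        return QueryBuilders.termsQuery("age", ages);
    }

    public static void main(String[] args) {
        // Printing the builders shows the generated query DSL as JSON.
        System.out.println(SearchSourceBuilder.searchSource().query(nameEquals("王正年2")));
        System.out.println(SearchSourceBuilder.searchSource().query(ageIn(26, 27)));
    }
}
```

The term query targets name.keyword rather than name because the name field itself is analyzed text, so an exact comparison needs the keyword sub-field.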
Fuzzy queries
wildcard (scored)
Similar to like in MySQL.
This example relies on the field being mapped as the wildcard type, which is optimized for this kind of pattern (the wildcard query itself can also run against keyword fields).
ES:
GET /index_learn_test/_search
{
  "query": {
    "wildcard": {
      "resume": {
        "wildcard": "*我是王*,大家*"
      }
    }
  }
}
Java code:
public void wildcard() throws IOException {
    SearchRequest searchRequest = new SearchRequest(getIndex());
    searchRequest.source(SearchSourceBuilder
            .searchSource()
            .query(QueryBuilders.wildcardQuery("resume", "*我是王*,大家*")));
    SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
    log.info("Query result: {}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
}
match (scored)
A query based on analyzed (tokenized) text. To search for a whole phrase, use match_phrase instead.
ES:
GET /index_learn_test/_search
{
  "query": {
    "match": {
      "introduce": "齐,今年"
    }
  }
}
Java code:
@Test
public void match
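A minimal sketch of how a match query and its match_phrase counterpart might be built, assuming QueryBuilders.matchQuery and QueryBuilders.matchPhraseQuery from the 7.x client and the introduce field used in the request above. MatchQuerySketch, matchIntroduce and phraseIntroduce are illustrative names, and this is not the article's own method body.

```java
import org.elasticsearch.index.query.MatchPhraseQueryBuilder;
import org.elasticsearch.index.query.MatchQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

class MatchQuerySketch {
    // The text is analyzed into terms; a document matching any of the terms can be returned.
    static MatchQueryBuilder matchIntroduce(String text) {
        return QueryBuilders.matchQuery("introduce", text);
    }

    // The analyzed terms must appear together, in order, as a phrase.
    static MatchPhraseQueryBuilder phraseIntroduce(String text) {
        return QueryBuilders.matchPhraseQuery("introduce", text);
    }

    public static void main(String[] args) {
        // Printing the builders shows the generated query DSL as JSON.
        System.out.println(SearchSourceBuilder.searchSource().query(matchIntroduce("齐,今年")));
        System.out.println(SearchSourceBuilder.searchSource().query(phraseIntroduce("齐,今年")));
    }
}
```

Either builder can be dropped into the general query template in place of the wildcard query.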