Elasticsearch Operations
Posted by 骑台风走
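1. Introduction to the inverted index
An inverted index is built by tokenizing each article and creating an index entry for every term.
Pointing every term straight at full articles would make the index explode in size, so each term is related to an article title, and the title in turn is related to the article, as follows:
| Term (index key) | Postings: (article, <positions>, occurrence count) |
| 今天 (today) | (article 1, <2,10>, 2) (article 3, <8>, 1) |
| 星期天 (Sunday) | (article 2, <12,25,100>, 3) |
| 出去玩 (go out and play) | (article 5, <11,24,89>, 3) (article 1, <8,19>, 2) |
Each entry records which articles a term such as 今天 appears in, the positions at which it appears, and how many times it appears.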
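To make the postings table concrete, here is a minimal Python sketch of a positional inverted index. It is a toy model and not how Elasticsearch or Lucene actually store data: the sample documents, the pre-tokenized English tokens, and the function names `build_inverted_index` and `postings` are all assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: [positions]} for already-tokenized documents."""
    index = defaultdict(dict)
    for doc_id, tokens in docs.items():
        for position, term in enumerate(tokens):
            index[term].setdefault(doc_id, []).append(position)
    return index


def postings(index, term):
    """Return (doc_id, positions, count) tuples, the same shape as the table rows."""
    return [(doc_id, pos, len(pos)) for doc_id, pos in sorted(index.get(term, {}).items())]


# Hypothetical sample documents, already split into tokens.
docs = {
    1: ["today", "is", "sunday", "today", "we", "go", "out"],
    2: ["sunday", "is", "quiet"],
    3: ["go", "out", "today"],
}
index = build_inverted_index(docs)
print(postings(index, "today"))  # [(1, [0, 3], 2), (3, [2], 1)]
```

Looking up a term is a single dictionary access, which is why a search does not need to scan every article.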
2. Index operations (an index is comparable to a database)
2.1 Create an index
PUT ymq
{
  "settings": {
    "index": {
      "number_of_shards": 5,
      "number_of_replicas": 1
    }
  }
}
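2.2 View indices
# View the settings of a single index
GET ymq/_settings
# View the settings of all indices
GET _all/_settings
# View the settings of specific indices
GET ymq,ymq2/_settings
# View the settings of all indices (same result as _all)
GET _settings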
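2.3 Update an index (rarely needed; mostly used to change the number of replicas)
# Change the number of replicas to 2. The number of shards must be decided when the index is created.
# The number of replicas can be changed later, although the request may fail.
PUT ymq/_settings
{
  "number_of_replicas": 2
}
# If the update is rejected because an index has been made read-only, clear the block on all indices first
PUT _all/_settings
{
  "index": {
    "blocks": {"read_only_allow_delete": false}
  }
}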
2.4 Delete an index
DELETE ymq
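3. Mapping management (a mapping type is comparable to a table)
3.1 Introduction
Indices created in Elasticsearch 6.0.0 or later contain only one mapping type.
Indices created in 5.x with multiple mapping types continue to work as before in Elasticsearch 6.x. Mapping types are deprecated in Elasticsearch 7.x and removed completely in 8.0.
If an index has not been created yet, inserting a document creates it automatically.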
3.2 Create a mapping (comparable to defining a type or a table)
PUT books
{
  "mappings": {
    "properties": {
      "title": {"type": "text"},
      "price": {"type": "integer"},
      "addr": {"type": "keyword"},
      "company": {
        "properties": {
          "name": {"type": "text"},
          "company_addr": {"type": "text"},
          "employee_count": {"type": "integer"}
        }
      },
      "publish_date": {"type": "date", "format": "yyyy-MM-dd"}
    }
  }
}
3.3 View mappings
GET books/_mapping
GET _all/_mapping
3.4 Note: a document can be inserted even when neither the index nor the mapping exists
PUT ymq2/_doc/1
{
  "title": "白雪公主和十个小矮人",
  "price": "99",
  "addr": "黑暗森里",
  "publish_date": "2018-05-19",
  "name": "ymq"
}
4. Basic document CRUD (a document is comparable to a row)
4.1 Insert documents
PUT books/_doc/1
{
  "title": "大头儿子小偷爸爸",
  "price": 100,
  "addr": "北京天安门",
  "company": {
    "name": "我爱北京天安门",
    "company_addr": "我的家在东北松花江傻姑娘",
    "employee_count": 10
  },
  "publish_date": "2019-08-19"
}
PUT books/_doc/2
{
  "title": "白雪公主和十个小矮人",
  "price": "99",
  "addr": "黑暗森里",
  "publish_date": "2018-05-19"
}
PUT books/_doc/3
{
  "title": "白雪公主和十个小矮人",
  "price": "99",
  "addr": "黑暗森里",
  "publish_date": "2018-05-19",
  "name": "lqz"
}
4.2 View a document
# Format: index name/default type name (_doc)/document ID
GET books/_doc/1
4.3 Two ways to update a document
4.3.1 Full replacement (not recommended: the whole document is overwritten)
PUT lqz/_doc/1
{
  "name": "顾老二",
  "age": 30,
  "from": "gu",
  "desc": "皮肤黑、武器长、性格直",
  "tags": ["黑", "长", "直"]
}
4.3.2 Partial update (only the listed fields change)
POST lqz/_update/1
{
  "doc": {
    "desc": "皮肤很黄,武器很长,性格很直",
    "tags": ["很黄", "很长", "很直"]
  }
}
4.4 Delete a document
DELETE lqz/_doc/4
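Looking back at the two update styles in 4.3, the difference can be modelled with plain dictionaries. This is only a sketch under stated assumptions: `store`, `full_replace`, and `partial_update` are hypothetical names, and the merge rule (nested objects merged, scalars and arrays overwritten) is a simplified model of how a partial `doc` update behaves.

```python
import copy


def full_replace(store, doc_id, body):
    """Model of PUT index/_doc/id: the stored document becomes exactly `body`."""
    store[doc_id] = copy.deepcopy(body)


def partial_update(store, doc_id, partial):
    """Model of an update with a "doc" body: merge `partial` into the stored document."""
    def merge(target, patch):
        for key, value in patch.items():
            if isinstance(value, dict) and isinstance(target.get(key), dict):
                merge(target[key], value)           # nested objects are merged
            else:
                target[key] = copy.deepcopy(value)  # scalars and arrays are overwritten
    merge(store[doc_id], partial)


store = {}
full_replace(store, 1, {"name": "gu", "age": 30, "tags": ["a", "b"]})

partial_update(store, 1, {"tags": ["c"]})
after_partial = copy.deepcopy(store[1])

full_replace(store, 1, {"tags": ["c"]})
after_replace = copy.deepcopy(store[1])

print(after_partial)  # {'name': 'gu', 'age': 30, 'tags': ['c']}
print(after_replace)  # {'tags': ['c']}
```

The second result shows why the full replacement is discouraged for small changes: every field that is not resent is lost.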
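5. Querying documents
5.1 Differences between term and match
5.1.1 Introduction
term: an exact-match query. The search value is not analyzed (tokenized) before searching, so it must be exactly one of the tokens produced when the document field was analyzed.
match: the search text is analyzed first, and each resulting token is then matched in turn. Compared with the exact lookup of term, match is an analyzed, full-text search.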
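The distinction can be illustrated with a small Python sketch. It assumes a toy analyzer (`analyze`, which only lowercases and splits on whitespace) standing in for the real analysis chain; `term_query` and `match_query` are hypothetical helper names, not Elasticsearch APIs.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def term_query(field_tokens, value):
    """term: the search value is NOT analyzed; it must equal one stored token."""
    return value in field_tokens


def match_query(field_tokens, text):
    """match: the search text IS analyzed; any resulting token may match."""
    return any(token in field_tokens for token in analyze(text))


stored = analyze("Beautiful cat")  # tokens that end up in the index

print(term_query(stored, "beautiful"))       # True
print(term_query(stored, "Beautiful cat"))   # False
print(match_query(stored, "Beautiful dog"))  # True
```

The second lookup fails because the unanalyzed string "Beautiful cat" is not itself a token, while the third succeeds because one of its analyzed tokens is.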
5.1.2 Create the index and mapping (without the ik analyzer) and insert data
# Create the index and its mapping
PUT lqz
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 2
  },
  "mappings": {
    "properties": {
      "title": {"type": "text"},
      "desc": {"type": "text"},
      "price": {"type": "integer"},
      "addr": {"type": "keyword"},
      "company": {
        "properties": {
          "name": {"type": "text"},
          "company_addr": {"type": "text"},
          "employee_count": {"type": "integer"}
        }
      },
      "publish_date": {"type": "date", "format": "yyyy-MM-dd"}
    }
  }
}
# Insert data
PUT lqz/_doc/1
{
  "title": "so beautiful zero",
  "price": 100,
  "addr": "北京天安门",
  "desc": "beautiful cat",
  "company": {
    "name": "我爱北京天安门",
    "company_addr": "我的家在东北松花江傻姑娘",
    "employee_count": 10
  },
  "publish_date": "2019-08-19"
}
PUT lqz/_doc/2
{
  "title": "so beautiful one",
  "price": 200,
  "addr": "北京天安门",
  "desc": "beautiful dog",
  "company": {
    "name": "我爱北京天安门",
    "company_addr": "我的家在东北松花江傻姑娘",
    "employee_count": 10
  },
  "publish_date": "2019-08-19"
}
PUT lqz/_doc/3
{
  "title": "so beautiful tow",
  "price": 698,
  "addr": "北京天安门",
  "desc": "dog",
  "company": {
    "name": "我爱北京天安门",
    "company_addr": "我的家在东北松花江傻姑娘",
    "employee_count": 10
  },
  "publish_date": "2019-08-19"
}
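5.2 term
5.2.1 term and terms
term: the search value is not analyzed; the query looks up exactly the term given.
terms: looks up several exact terms at once and matches documents containing any of them.
# term: the search value is not analyzed
GET lqz/_search
{
  "query": {
    "term": {"desc": "beautiful"}
  }
}
# terms: because the input is not analyzed, list each term to look up several at once
GET lqz/_search
{
  "query": {
    "terms": {"title": ["beautiful", "so"]}
  }
}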
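5.3 match
5.3.1 match, match_all, and match_phrase
match: a full-text match; a document matches if it contains any of the analyzed keywords.
match_all: matches every document in the index.
match_phrase: a phrase match; all terms must be present and in the same order as the given phrase.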
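The contrast between match and match_phrase can be sketched in a few lines of Python. The sketch assumes a toy whitespace analyzer and the default behaviour of a phrase query without slop (tokens must be adjacent and in order); `analyze`, `match`, and `match_phrase` here are illustrative helpers, not library functions. The titles are the sample titles from section 5.1.2.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(doc_text, query_text):
    """True if the document contains ANY analyzed query token."""
    doc_tokens = set(analyze(doc_text))
    return any(token in doc_tokens for token in analyze(query_text))


def match_phrase(doc_text, query_text):
    """True if the analyzed query tokens occur consecutively and in order."""
    doc_tokens = analyze(doc_text)
    phrase = analyze(query_text)
    n = len(phrase)
    return n > 0 and any(
        doc_tokens[i:i + n] == phrase for i in range(len(doc_tokens) - n + 1)
    )


titles = ["so beautiful zero", "so beautiful one", "so beautiful tow"]

matched = [t for t in titles if match(t, "beautiful tow")]
phrase_matched = [t for t in titles if match_phrase(t, "beautiful tow")]

print(matched)         # all three titles: each contains "beautiful"
print(phrase_matched)  # ['so beautiful tow']
```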
# match_all: return every document
GET lqz/_search
{
  "query": {"match_all": {}}
}
# match: the search text is analyzed before matching
GET lqz/_search
{
  "query": {
    "match": {"title": "beautiful tow"}
  }
}
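5.4 Sorted queries
Not every field supports sorting: numeric, date, and keyword fields can be sorted, whereas analyzed text fields cannot be sorted by default.
# Sorted queries
# 1. Plain query without sorting
GET lqz/_search
{
  "query": {
    "match": {"addr": "北京天安门"}
  }
}
# 2. Descending order
GET lqz/_search
{
  "query": {
    "match": {"addr": "北京天安门"}
  },
  "sort": [
    {"price": {"order": "desc"}}
  ]
}
# 3. Ascending order
GET lqz/_search
{
  "query": {
    "match": {"addr": "北京天安门"}
  },
  "sort": [
    {"price": {"order": "asc"}}
  ]
}
# 4. match_all with ascending order
GET lqz/_search
{
  "query": {"match_all": {}},
  "sort": [
    {"price": {"order": "asc"}}
  ]
}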
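5.5 Paginated queries
All clauses of a search body are pluggable: each one is optional, and they are separated from one another by commas.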
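The `from` and `size` parameters used in this section behave like a slice over the sorted hits: `from` is a zero-based offset and `size` is the maximum number of hits returned. A minimal Python sketch, in which `paginate` and `page_to_from` are hypothetical helper names and the price list stands for the three sample documents sorted by price in descending order:

```python
def paginate(hits, from_=0, size=10):
    """Return the hits selected by a zero-based offset and a page length."""
    return hits[from_:from_ + size]


def page_to_from(page, size):
    """Convert a 1-based page number into the corresponding offset."""
    return (page - 1) * size


prices_desc = [698, 200, 100]

print(paginate(prices_desc, from_=2, size=2))                   # [100]
print(paginate(prices_desc, from_=page_to_from(1, 2), size=2))  # [698, 200]
```

With only three hits, an offset of 2 leaves a single hit even though two were requested.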
# Pagination
# Skip the first two hits (from is a zero-based offset) and return at most two
GET lqz/_search
{
  "query": {"match_all": {}},
  "sort": [
    {"price": {"order": "desc"}}
  ],
  "from": 2,
  "size": 2
}
# Note: because the clauses are pluggable, sort can simply be left out
GET lqz/_search
{
  "query": {"match_all": {}},
  "from": 2,
  "size": 2
}
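5.6 Boolean queries
- must: an AND relationship, equivalent to and in a relational database.
- should: an OR relationship, equivalent to or in a relational database.
- must_not: a NOT relationship, equivalent to not in a relational database.
- filter: filter conditions.
- range: a range condition.
- gt: greater than, equivalent to > in a relational database.
- gte: greater than or equal to, equivalent to >= in a relational database.
- lt: less than, equivalent to < in a relational database.
- lte: less than or equal to, equivalent to <= in a relational database.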
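How these clauses combine can be sketched as a small in-memory evaluator. This is a simplified model under stated assumptions: scoring is ignored, each clause is a plain Python predicate, and `should` is treated as required only when no `must` or `filter` clause is present. The names `bool_match`, `in_range`, and `search` are hypothetical, and the documents are simplified copies of the three samples from section 5.1.2.

```python
def in_range(value, gt=None, gte=None, lt=None, lte=None):
    """Toy version of a range clause; every bound is optional."""
    return ((gt is None or value > gt) and (gte is None or value >= gte)
            and (lt is None or value < lt) and (lte is None or value <= lte))


def bool_match(doc, must=(), should=(), must_not=(), filter_=()):
    """Decide whether doc matches; each clause is a predicate doc -> bool."""
    if not all(clause(doc) for clause in must):      # AND
        return False
    if not all(clause(doc) for clause in filter_):   # AND, without scoring
        return False
    if any(clause(doc) for clause in must_not):      # NOT
        return False
    if should and not must and not filter_:
        # Without must/filter, at least one should clause has to match (OR).
        return any(clause(doc) for clause in should)
    return True


docs = [
    {"id": 1, "addr": "北京天安门", "desc": "beautiful cat", "price": 100},
    {"id": 2, "addr": "北京天安门", "desc": "beautiful dog", "price": 200},
    {"id": 3, "addr": "北京天安门", "desc": "dog", "price": 698},
]


def addr_matches(doc):
    return doc["addr"] == "北京天安门"


def desc_has_beautiful(doc):
    return "beautiful" in doc["desc"].split()


def price_below_200(doc):
    return in_range(doc["price"], lt=200)


def search(**clauses):
    """Return the ids of all documents that satisfy the bool clauses."""
    return [doc["id"] for doc in docs if bool_match(doc, **clauses)]


print(search(must=[addr_matches], filter_=[price_below_200]))  # [1]
print(search(should=[desc_has_beautiful]))                     # [1, 2]
print(search(must_not=[desc_has_beautiful]))                   # [3]
```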
# bool query with should: OR condition
GET lqz/_search
{
  "query": {
    "bool": {
      "should": [
        {"match": {"addr": "北京天安门"}},
        {"match": {"desc": "beautiful"}}
      ]
    }
  }
}
# must_not: none of the conditions may match
GET lqz/_search
{
  "query": {
    "bool": {
      "must_not": [
        {"match": {"addr": "北京天安门"}},
        {"match": {"desc": "beautiful"}},
        {"match": {"price": 698}}
      ]
    }
  }
}
# filter with a range condition (gt, lt, gte, lte)
GET lqz/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"addr": "北京天安门"}}
      ],
      "filter": {
        "range": {
          "price": {"lt": 200}
        }
      }
    }
  }
}
# Range query with a lower and an upper bound
GET lqz/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"addr": "北京天安门"}}
      ],
      "filter": {
        "range": {
          "price": {"gte": 100, "lte": 150}
        }
      }
    }
  }
}
5.7 Filtering the returned fields
# Basic usage
GET lqz/_search
{
  "query": {"match_all": {}},
  "_source": ["name", "age"]
}
# _source sits at the same level as query
GET lqz/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {"from": "gu"}
      },
      "filter": {
        "range": {
          "age": {"lte": 25}
        }
      }
    }
  },
  "_source": ["name", "age"]
}
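5.8 Highlight query (highlighting did not work)
# Likely cause: the query matches on price while the highlighted field is from; by default only fields that the query matched are highlighted
GET lqz/_search
{
  "query": {
    "match": {"price": "698"}
  },
  "highlight": {
    "pre_tags": "<b class='key' style='color:red'>",
    "post_tags": "</b>",
    "fields": {
      "from": {}
    }
  }
}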
5.9 Aggregations
# sum, avg, max, min
# SQL equivalent: select avg(age) as my_avg from table where from = 'gu';
GET lqz/_search
{
  "query": {
    "match": {"from": "gu"}
  },
  "aggs": {
    "my_avg": {
      "avg": {"field": "age"}
    }
  },
  "_source": ["name", "age"]
}
# Maximum age
GET lqz/_search
{
  "query": {
    "match": {"from": "gu"}
  },
  "aggs": {
    "my_max": {
      "max": {"field": "age"}
    }
  },
  "_source": ["name", "age"]
}
# Minimum age
GET lqz/_search
{
  "query": {
    "match": {"from": "gu"}
  },
  "aggs": {
    "my_min": {
      "min": {"field": "age"}
    }
  },
  "_source": ["name", "age"]
}
# Sum of ages
GET lqz/_search
{
  "query": {
    "match": {"from": "gu"}
  },
  "aggs": {
    "my_sum": {
      "sum": {"field": "age"}
    }
  },
  "_source": ["name", "age"]
}
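# Grouping
# Bucket everyone by age into the ranges 15-20, 20-25, and 25-30, and compute the average age of each bucket.
GET lqz/_search
{
  "size": 0,
  "query": {"match_all": {}},
  "aggs": {
    "age_group": {
      "range": {
        "field": "age",
        "ranges": [
          {"from": 15, "to": 20},
          {"from": 20, "to": 25},
          {"from": 25, "to": 30}
        ]
      },
      "aggs": {
        "my_avg": {
          "avg": {"field": "age"}
        }
      }
    }
  }
}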
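The grouping described above (age ranges 15-20, 20-25, 25-30 with an average per bucket) can be mimicked in plain Python to see which bucket each value falls into. In this sketch each range includes its lower bound and excludes its upper bound, which is how the range aggregation treats `from` and `to`. The `ages` list, the bucket key format, and the function name `range_buckets` are assumptions made for illustration.

```python
def range_buckets(values, ranges):
    """Group values into [low, high) ranges and average each group."""
    buckets = []
    for low, high in ranges:
        members = [v for v in values if low <= v < high]
        avg = sum(members) / len(members) if members else None
        buckets.append({"key": f"{low}-{high}", "doc_count": len(members), "avg": avg})
    return buckets


ages = [18, 20, 22, 25, 29, 30]  # hypothetical ages
buckets = range_buckets(ages, [(15, 20), (20, 25), (25, 30)])

for bucket in buckets:
    print(bucket)
```

Note that the value 20 lands in the 20-25 bucket rather than 15-20, and 30 falls outside every range.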