Elasticsearch Data Organization

Posted 2022-11-17 wodeboke-y

1. mapping

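1.1. Introduction

mapping: a mapping relationship, and here specifically the way data is organized. In this context it can be read as the data structure: table structure, table constraints, data types, and so on. (Tough going for a non-native speaker. The term is thoroughly opaque, and it took me half an hour to get my head around it.)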

1.2. mapping type

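Each index has one mapping type, which determines how documents are indexed.

A mapping type is made up of two kinds of fields:

Meta-fields: _index, _type, _id, _source
Fields or properties:

Each field has a data type, comparable to a column data type in MySQL.

The types include text, keyword, date, boolean, object, nested, geo_point, and others.

See the other documents for details.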

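1.3. Mapping limits

Defining too many fields in an index can cause out-of-memory errors, and this is not as rare as you might think.

The following settings limit the size of a mapping:

index.mapping.total_fields.limit

The maximum number of fields in an index. Fields, object mappings, and field aliases all count towards this limit. Defaults to 1000.

index.mapping.depth.limit

The maximum depth of a field, measured as the number of inner objects. Defaults to 20.

index.mapping.nested_fields.limit

The maximum number of distinct nested mappings in an index. Defaults to 50.

index.mapping.nested_objects.limit

The maximum number of nested JSON objects that a single document can contain. Defaults to 10000.

index.mapping.field_name_length.limit

The maximum length of a field name. There is no limit by default.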
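
As a rough sketch, the limits above could be collected into a settings body like the one below. The helper name build_limit_settings and the override value 2000 are assumptions for illustration; the defaults simply restate the values listed above, and the resulting JSON would be sent to Elasticsearch as the body of an index-creation or settings-update request.

```python
import json

# Defaults as listed above; field_name_length is omitted because it has no limit by default.
MAPPING_LIMIT_DEFAULTS = {
    "index.mapping.total_fields.limit": 1000,
    "index.mapping.depth.limit": 20,
    "index.mapping.nested_fields.limit": 50,
    "index.mapping.nested_objects.limit": 10000,
}


def build_limit_settings(**overrides):
    """Return a settings body with the default limits, plus any overrides.

    Each keyword is the middle part of the setting name,
    e.g. total_fields=2000 sets index.mapping.total_fields.limit.
    """
    settings = dict(MAPPING_LIMIT_DEFAULTS)
    for short_name, value in overrides.items():
        key = f"index.mapping.{short_name}.limit"
        if key not in settings:
            raise KeyError(f"unknown mapping limit: {short_name}")
        settings[key] = value
    return {"settings": settings}


# Hypothetical override: raise the field limit for an index with a wide schema.
body = build_limit_settings(total_fields=2000)
print(json.dumps(body, indent=2))
```

Raising a limit only moves the ceiling; if the field count keeps growing, the mapping itself probably needs restructuring.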

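1.4. Dynamic mapping

Fields and mapping types do not need to be defined in advance. They are created automatically when a document is indexed.

This applies to the top-level mapping, to inner objects, and to nested fields.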
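
To make the idea concrete, here is a simplified sketch of how dynamic mapping chooses a type from a JSON value. It is an approximation written for this post, not Elasticsearch code: the function name infer_dynamic_type is made up, and date detection inside strings (enabled by default in Elasticsearch) is deliberately not modelled.

```python
def infer_dynamic_type(value):
    """Approximate the field type dynamic mapping would pick for a JSON value.

    Simplified: date and numeric detection inside strings is ignored.
    Returns None when no field would be added.
    """
    if value is None:
        return None
    if isinstance(value, bool):  # check before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"  # by default created together with a keyword sub-field
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # Arrays take the type of their first non-null element.
        for item in value:
            if item is not None:
                return infer_dynamic_type(item)
        return None
    raise TypeError(f"not a JSON value: {value!r}")


# Hypothetical document, for illustration only.
doc = {"name": "Alice", "age": 30, "active": True, "address": {"city": "Paris"}}
inferred = {field: infer_dynamic_type(v) for field, v in doc.items()}
print(inferred)
```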

1.5. Explicit mapping

Command syntax:

PUT /<index>/_mapping

Create an index and specify its mapping:

curl -X PUT "localhost:9200/my-index?pretty" -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "properties": {
      "age":   { "type": "integer" },
      "email": { "type": "keyword" },
      "name":  { "type": "text" }
    }
  }
}
'

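Add a field to an existing mapping

Example: add a field named employee-id of type keyword. Setting index to false means the field is not indexed, so it cannot be searched.

curl -X PUT "localhost:9200/my-index/_mapping?pretty" -H 'Content-Type: application/json' -d'
{
  "properties": {
    "employee-id": {
      "type": "keyword",
      "index": false
    }
  }
}
'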

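Note:

The mapping of an existing field cannot be changed, with the following exceptions:

Properties can be added to an object field.
The value of ignore_above can be changed.

Changing the mapping of an existing field would invalidate data that has already been indexed. To change a field's mapping, create a new index with the new mapping and reindex the data into it. If you only want to rename a field, add an alias field instead.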
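
The two workarounds can be sketched as request bodies. This is a minimal illustration under assumptions: the target index my-index-v2 and the alias name employee_no are hypothetical, and the helper functions are made up for this post. The bodies follow the shape of the _reindex API and the alias field type.

```python
import json


def reindex_body(source_index, dest_index):
    """Body for POST _reindex: copy all documents from source to dest."""
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}


def alias_field_mapping(alias_name, target_field):
    """Body for PUT /<index>/_mapping that adds an alias for an existing field."""
    return {"properties": {alias_name: {"type": "alias", "path": target_field}}}


# Hypothetical names, for illustration only.
print("POST _reindex")
print(json.dumps(reindex_body("my-index", "my-index-v2"), indent=2))
print("PUT /my-index/_mapping")
print(json.dumps(alias_field_mapping("employee_no", "employee-id"), indent=2))
```

The new index has to be created with the desired mapping before reindexing; otherwise dynamic mapping will choose the field types.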

1.6. Related commands

View the mapping of an index

GET /my-index/_mapping

# Python client equivalent; es is assumed to be an elasticsearch.Elasticsearch instance
rv = es.indices.get_mapping(index=index_name)
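
The response is a nested dictionary keyed by index name. The sketch below flattens it into a map from field path to type. The sample response is hand-written in the shape a 7.x cluster is assumed to return (index name, then mappings, then properties), and flatten_properties is a helper made up for this post.

```python
def flatten_properties(properties, prefix=""):
    """Map dotted field paths to their types, descending into sub-properties."""
    fields = {}
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        # Object fields may omit "type"; treat them as object.
        fields[path] = spec.get("type", "object")
        if "properties" in spec:
            fields.update(flatten_properties(spec["properties"], prefix=f"{path}."))
    return fields


# Hand-written sample in the assumed shape of a get-mapping response.
sample_response = {
    "my-index": {
        "mappings": {
            "properties": {
                "age": {"type": "integer"},
                "email": {"type": "keyword"},
                "name": {"type": "text"},
                "address": {"properties": {"city": {"type": "keyword"}}},
            }
        }
    }
}

field_types = flatten_properties(sample_response["my-index"]["mappings"]["properties"])
print(field_types)
```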

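1.7. Notes

The _doc issue

Versions 7.x and later have no type parameter, but in 7.x some commands still require _doc in the position where the type name used to go.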

2. field datatypes

Elasticsearch supports a number of different datatypes for the fields in a document:

-----Core datatypes

2.1. string

text and keyword

2.1.1. text

A text field is analyzed, that is, broken into individual terms, before being indexed.

2.1.2. keyword

A field for indexing structured content such as email addresses, hostnames, and status codes.

Example

PUT my_index
{
  "mappings": {
    "properties": {
      "tags": {
        "type": "keyword"
      }
    }
  }
}

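The difference between the two string types can be sketched in a few lines. This is only a crude approximation of analysis (split on non-word characters, then lowercase), and index_terms is a helper made up for this post, not an Elasticsearch API.

```python
import re


def index_terms(value, field_type):
    """Return the terms a value would roughly contribute to the index."""
    if field_type == "keyword":
        return [value]  # the whole value is one term, unchanged
    if field_type == "text":
        # Crude approximation of analysis: split on non-word characters, lowercase.
        return [token.lower() for token in re.findall(r"\w+", value)]
    raise ValueError(f"unsupported field type: {field_type}")


value = "Quick Brown Fox"
print(index_terms(value, "keyword"))
print(index_terms(value, "text"))
```

A keyword field therefore only matches the exact original value, while a text field can match on any of its individual terms.
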
2.2. Numeric

long, integer, short, byte, double, float, half_float, scaled_float

2.3. Date

date

2.4. Date nanoseconds

date_nanos

2.5. Boolean

boolean

2.6. Binary

binary

2.7. Range

integer_range, float_range, long_range, double_range, date_range

------Complex datatypes

2.8. Object

object for single JSON objects

2.9. Nested

nested for arrays of JSON objects

3. meta-fields

Every document has its own meta-fields.

3.1. identity meta-fields

_index

_type

_id

3.2. document source meta-fields

_source: the original JSON document

_size: the size of _source in bytes, provided by the mapper-size plugin.

3.3. indexing meta-fields

_field_names

_ignored

3.4. routing meta-field

_routing

3.5. other meta-field

_meta
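
As a small illustration, the sketch below separates the meta-fields of a search hit from the document body stored in _source. The hit is a hand-written sample with hypothetical values, and split_hit is a helper made up for this post.

```python
META_FIELDS = {"_index", "_type", "_id", "_source", "_size",
               "_field_names", "_ignored", "_routing", "_meta"}


def split_hit(hit):
    """Separate a hit's meta-fields from the document body held in _source."""
    meta = {k: v for k, v in hit.items() if k in META_FIELDS and k != "_source"}
    return meta, hit.get("_source", {})


# Hand-written sample hit; values are hypothetical.
hit = {"_index": "my_index", "_type": "_doc", "_id": "1",
       "_score": 1.0, "_source": {"title": "The Quick Brown Fox"}}
meta, source = split_hit(hit)
print(meta)
print(source)
```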

4. analyzer

document: https://www.elastic.co/guide/en/elasticsearch/reference/current/analyzer.html

The following example defines custom analyzers in the index settings and applies them to a field:

PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase"
          ]
        },
        "my_stop_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "english_stop"
          ]
        }
      },
      "filter": {
        "english_stop": {
          "type": "stop",
          "stopwords": "_english_"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "my_analyzer",
        "search_analyzer": "my_stop_analyzer",
        "search_quote_analyzer": "my_analyzer"
      }
    }
  }
}

PUT my_index/_doc/1
{
  "title": "The Quick Brown Fox"
}

PUT my_index/_doc/2
{
  "title": "A Quick Brown Fox"
}

GET my_index/_search
{
  "query": {
    "query_string": {
      "query": "\"the quick brown fox\""
    }
  }
}

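Why the example sets search_quote_analyzer can be shown with a rough simulation. The sketch below is not how Elasticsearch is implemented: the two functions are crude stand-ins for the analyzers defined above, the stop word list is assumed to match the built-in _english_ list, and token positions are ignored when matching phrases.

```python
import re

# Assumed to match the default _english_ stop word list.
ENGLISH_STOP = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that",
    "the", "their", "then", "there", "these", "they", "this", "to", "was",
    "will", "with",
}


def my_analyzer(text):
    """Rough stand-in for my_analyzer: tokenize, then lowercase."""
    return [t.lower() for t in re.findall(r"\w+", text)]


def my_stop_analyzer(text):
    """Rough stand-in for my_stop_analyzer: additionally drop stop words."""
    return [t for t in my_analyzer(text) if t not in ENGLISH_STOP]


def phrase_matches(doc_tokens, phrase_tokens):
    """True if phrase_tokens occur consecutively in doc_tokens."""
    n = len(phrase_tokens)
    return any(doc_tokens[i:i + n] == phrase_tokens
               for i in range(len(doc_tokens) - n + 1))


docs = {1: "The Quick Brown Fox", 2: "A Quick Brown Fox"}
indexed = {doc_id: my_analyzer(title) for doc_id, title in docs.items()}
phrase = "the quick brown fox"

# Phrase analyzed with my_analyzer (the search_quote_analyzer): stop words kept.
exact = [d for d, toks in indexed.items() if phrase_matches(toks, my_analyzer(phrase))]
# Phrase analyzed with my_stop_analyzer instead: "the" is dropped.
loose = [d for d, toks in indexed.items() if phrase_matches(toks, my_stop_analyzer(phrase))]
print(exact, loose)
```

With stop words kept, the quoted phrase matches only the first document. If the stop-word analyzer were applied to the phrase, "the" would be dropped and both documents would match.
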