Elastic Search: indexing dates field that has null values
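Posted: 2018-10-29 06:39:53
Question: I'm using the Elasticsearch client in Python to create an index for the following fields, but I'm stuck on creating a date field whose values can be null.
I'm having trouble understanding why the field is not mapped as date rather than string when null values are present in the data.
From research online and the ES documentation, it seems that null values cannot be indexed.
So I followed the documentation at https://www.elastic.co/guide/en/elasticsearch/reference/current/null-value.html and tried "null_value": "NULL", without success.
I also tried changing the actual dates to formats such as "yyyy-MM-dd" and "MM/dd/yyyy", along with many other combinations.
In the JSON mapping, I also tried "type": "strict_date" and "type": "strict_date": "MM/dd/yyyy".
Is there any way to solve this?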
Data:
id_name,team_name,team_members,date_info,date_sub
123,"Biology, Neurobiology ","Ali Smith, Jon Doe",5/1/2015,5/1/2015
234,Mathematics,Jane Smith ,8/12/2016,
345,"Statistics, Probability","Matt P, Albert Shaw",5/15/2015,5/15/2015
456,Chemistry,"Andrew M, Matt Shaw, Ali Smith",4/12/2017,
678,Physics,"Joe Doe, Jane Smith, Ali Smith ",5/12/2017,5/12/2017
JSON/Python mapping:
request_body = '''
{
    "settings": {
        "number_of_shards": 2,
        "number_of_replicas": 1
    },
    "mappings": {
        "team": {
            "properties": {
                "id_name": {"type": "text"},
                "team_name": {"type": "text"},
                "team_members": {"type": "text"},
                "date_info": {"type": "date", "null_value": "NULL"},
                "date_sub": {"type": "date", "null_value": "NULL"}
            }
        }
    }
}
'''
res = self.es.indices.create(index=your_index_name, ignore = 400, body=request_body)
Error:
raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.RequestError: TransportError(400, 'mapper_parsing_exception', 'failed to parse [date_info]')
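Comments:
Could you post your indexing request?
Answer 1: Your mapping does not specify a format for the date fields, so Elasticsearch falls back to the built-in default, "strict_date_optional_time||epoch_millis". A value must therefore be either a long number of milliseconds since the epoch or a date in strict_date_optional_time, which is a strict format.
Strict means that a date such as 5/12/2017 must be zero-padded to the full number of digits, so the padded form is 05/12/2017. Note that strict_date_optional_time is an ISO 8601 style format (for example 2017-05-12), so a slash-separated value like 05/12/2017 is only accepted if the mapping also declares a matching format such as "MM/dd/yyyy".
More on date formats: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-date-format.html#built-in-date-formats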
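One way to sidestep the format question is to normalize the dates on the client before indexing. The following is a minimal sketch, assuming the CSV dates are month/day/year as in the sample data; the helper name `to_iso_date` is hypothetical and not part of the Elasticsearch client.

```python
from datetime import datetime
from typing import Optional


def to_iso_date(raw: str, source_format: str = "%m/%d/%Y") -> Optional[str]:
    """Convert a date such as '5/12/2017' to ISO form ('2017-05-12').

    Empty or whitespace-only input returns None, which serializes to JSON null.
    The month/day/year order is an assumption based on the sample data.
    """
    text = raw.strip()
    if not text:
        return None
    # strptime accepts month and day values without a leading zero.
    return datetime.strptime(text, source_format).date().isoformat()


print(to_iso_date("5/12/2017"))
print(to_iso_date(""))
```

The ISO output matches the default strict_date_optional_time format, so the mapping would not need an explicit "format" entry for these fields.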
Answer 2: First, the mapping for your date fields must not contain "null_value": "NULL", because "NULL" cannot be parsed as a date.
I tried this in Kibana:
PUT ***
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  },
  "mappings": {
    "team": {
      "properties": {
        "id_name": {
          "type": "text"
        },
        "team_name": {
          "type": "text"
        },
        "team_members": {
          "type": "text"
        },
        "date_info": {
          "type": "date"
        },
        "date_sub": {
          "type": "date"
        }
      }
    }
  }
}
Then I inserted a document with a null date_info:
POST ***/team
{
  "id_name": 341,
  "team_name": "Gogologi",
  "team_members": "Wayern",
  "date_info": null,
  "date_sub": "2014-02-01"
}
To verify, I ran a GET request:
GET ***/team/_search
{
  "_index": "***",
  "_type": "team",
  "_id": "AWOCTEhoVu_LbUvfNt6J",
  "_score": 1,
  "_source": {
    "id_name": 341,
    "team_name": "Gogologi",
    "team_members": "Wayern",
    "date_info": null,
    "date_sub": "2014-02-01"
  }
}
Hope this helps!
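Applied to the CSV from the question, the same idea means turning empty date cells into a real null before the document is sent. Here is a rough sketch using only the standard library; `rows_to_docs` and `DATE_FIELDS` are hypothetical names, and non-empty dates are passed through unchanged, so they still have to match whatever format the mapping expects.

```python
import csv
import io
import json

# First two rows of the sample data from the question.
SAMPLE_CSV = """id_name,team_name,team_members,date_info,date_sub
123,"Biology, Neurobiology ","Ali Smith, Jon Doe",5/1/2015,5/1/2015
234,Mathematics,Jane Smith ,8/12/2016,
"""

DATE_FIELDS = ("date_info", "date_sub")


def rows_to_docs(csv_text, date_fields=DATE_FIELDS):
    """Parse CSV text into dicts, replacing empty date cells with None."""
    docs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        doc = dict(row)
        for field in date_fields:
            value = (doc.get(field) or "").strip()
            # An empty string is not a valid date; None serializes to JSON null.
            doc[field] = value or None
        docs.append(doc)
    return docs


docs = rows_to_docs(SAMPLE_CSV)
print(json.dumps(docs[1]))
```

Each resulting dict can then be used as the document body of an index request.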
Answer 3: null_value must have the same data type as the field (see null_value | Elastic).
I set null_value to a value that can be parsed by the specified format.
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "date": {
        "type": "date",
        "null_value": "01/01/0001",
        "format": "dd/MM/yyyy"
      }
    }
  }
}
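When the mapping is built from Python, the sentinel can be checked locally before the index is created. This is only a sketch: `date_property` is a hypothetical helper, and the caller has to supply a Python strptime pattern that corresponds to the Elasticsearch format string, since the two pattern syntaxes differ.

```python
from datetime import datetime


def date_property(es_format, null_value, strptime_format):
    """Build a date field mapping after checking that null_value is a date.

    strptime_format is assumed to be the Python equivalent of es_format;
    nothing here translates between the two pattern syntaxes.
    """
    # Raises ValueError for a sentinel such as "NULL" that is not a date.
    datetime.strptime(null_value, strptime_format)
    return {"type": "date", "format": es_format, "null_value": null_value}


mapping = {
    "mappings": {
        "properties": {
            "date": date_property("dd/MM/yyyy", "01/01/0001", "%d/%m/%Y"),
        }
    }
}
print(mapping)
```

Passing "NULL" as the sentinel fails here with a ValueError instead of surfacing later as a mapping error from the server.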
Then we can insert some documents.
POST my-index-000001/_doc
{
  "date": null
}
POST my-index-000001/_doc
{
  "date": "01/01/0001"
}
POST my-index-000001/_doc
{
  "date": "31/10/2021"
}
Now we can search for the null_value.
GET my-index-000001/_search
{
  "query": {
    "match": {
      "date": "01/01/0001"
    }
  }
}
### Response ###
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "rY203nwBSf_8E_MJ7pyJ",
        "_score" : 1.0,
        "_source" : {
          "date" : null
        }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "ro203nwBSf_8E_MJ9Jzy",
        "_score" : 1.0,
        "_source" : {
          "date" : "01/01/0001"
        }
      }
    ]
  }
}
Note, however, that documents holding the null_value are still matched by range queries.
GET my-index-000001/_search
{
  "query": {
    "range": {
      "date": {
        "lt": "01/01/2021"
      }
    }
  }
}
### Response ###
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "rY203nwBSf_8E_MJ7pyJ",
        "_score" : 1.0,
        "_source" : {
          "date" : null
        }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "ro203nwBSf_8E_MJ9Jzy",
        "_score" : 1.0,
        "_source" : {
          "date" : "01/01/0001"
        }
      }
    ]
  }
}
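If the sentinel documents should not show up in range results, one option is to add a lower bound just above the sentinel. The sketch below only builds the request body as a Python dict; `range_excluding_sentinel` is a hypothetical helper, and it assumes the sentinel is the smallest date in the index, as with 01/01/0001 above.

```python
def range_excluding_sentinel(field, upper, sentinel):
    """Build a range query body that skips documents at the sentinel date.

    Assumes the sentinel is earlier than every real date, so a strict
    "gt" bound is enough to exclude documents indexed through null_value.
    """
    return {
        "query": {
            "range": {
                field: {
                    "gt": sentinel,  # strictly after the sentinel
                    "lt": upper,     # strictly before the upper bound
                }
            }
        }
    }


body = range_excluding_sentinel("date", "01/01/2021", "01/01/0001")
print(body)
```

The trade-off is that a document whose real date equals the sentinel is excluded as well, which is one reason to pick a sentinel that cannot occur in the data.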