Update a string parameter in Elasticsearch _mapping
Posted: 2021-02-23 19:41:13
Question:
I have the following _mapping in Elasticsearch 6.8:
{
  "grch38_test__wes__grch38__variants__20210222" : {
    "mappings" : {
      "variant" : {
        "_meta" : {
          "gencodeVersion" : "25",
          "hail_version" : "0.2.20",
          "genomeVersion" : "38",
          "sampleType" : "WES",
          "sourceFilePath" : "s3://my_folder/my_vcf.vcf"
        },
        ...
My goal is to issue a query in Kibana that modifies variant._meta.sourceFilePath. Following the thread "Elastic search mapping for nested json objects", I came up with this query:
PUT /grch38_test__wes__grch38__variants__20210222/_mapping/variant
{
  "properties": {
    "variant": {
      "type": "nested",
      "properties": {
        "_meta": {
          "type": "nested",
          "properties": {
            "type": "text",
            "sourceFilePath": "s3://my_folder/my_vcf.vcf"
          }
        }
      }
    }
  }
}
But it gives me an error:
elasticsearch mapping Expected map for property [fields] on field [name] but got a class java.lang.String
The full error message:
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Expected map for property [fields] on field [type] but got a class java.lang.String"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Expected map for property [fields] on field [type] but got a class java.lang.String"
  },
  "status": 400
}
I also tried:
PUT /grch38_test__wes__grch38__variants__20210222/_mapping/variant
{
  "properties": {
    "variant": {
      "type": "nested",
      "properties": {
        "_meta": {
          "type": "nested",
          "properties": {
            "sourceFilePath": {
              "type": "text",
              "value": "s3://my_folder/my_vcf.vcf"
            }
          }
        }
      }
    }
  }
}
But it tells me that value is not supported:
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Mapping definition for [sourceFilePath] has unsupported parameters: [value : s3://seqr-dp-data--prod/vcf/dev/grch38_test_contracted.vcf]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Mapping definition for [sourceFilePath] has unsupported parameters: [value : s3://seqr-dp-data--prod/vcf/dev/grch38_test_contracted.vcf]"
  },
  "status": 400
}
What am I doing wrong? How can I modify the field?
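Answer 1:
_meta is a reserved field for storing application-specific metadata. It is not searchable and can only be retrieved through the GET Mapping API.
This means that if your _meta content is meant to serve the purpose the _meta field was designed for, you cannot apply any mapping to it. It is a "final" hash map of concrete values, and it needs to be defined at the top level of the update-mapping payload: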
PUT /grch38_test__wes__grch38__variants__20210222/_mapping/variant
{
  "_meta": {
    "variant": {                 <-- shared index-level metadata
      "gencodeVersion": "25",
      "hail_version": "0.2.20",
      "genomeVersion": "38",
      "sampleType": "WES",
      "sourceFilePath": "s3://my_folder/my_vcf.vcf"
    }
  },
  "properties": {
    "some_text_field": {         <-- actual document properties
      "type": "text"
    }
  }
}
If, on the other hand, your _meta field is just an unfortunate naming coincidence, you can declare its mapping like this:
PUT /grch38_test__wes__grch38__variants__20210222/_mapping/variant
{
  "properties": {
    "_meta": {
      "properties": {
        "variant": {
          "properties": {
            "gencodeVersion": {
              "type": "text"
            },
            "genomeVersion": {
              "type": "text"
            },
            "hail_version": {
              "type": "text"
            },
            "sampleType": {
              "type": "text"
            },
            "sourceFilePath": {
              "type": "text"
            }
          }
        }
      }
    }
  }
}
and ingest documents of the form:
POST grch38_test__wes__grch38__variants__20210222/variant/_doc
{
  "_meta": {
    "variant": {
      "gencodeVersion": "25",
      "hail_version": "0.2.20",
      "genomeVersion": "38",
      "sampleType": "WES",
      "sourceFilePath": "s3://my_folder/my_vcf.vcf"
    }
  }
}
But again, the _meta content will then be document-specific, not index-wide!
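As an aside on the first error in the question: it is consistent with the rule that every entry under "properties" must itself be an object describing a field. Putting "type": "text" directly under "properties" makes Elasticsearch read it as a field named type whose definition is a string. The following is only a client-side sketch of that shape rule, not Elasticsearch's own validation; the helper name find_bad_properties is made up for illustration, and it does not detect unsupported parameters such as value.

```python
def find_bad_properties(properties, path="properties"):
    """Yield paths of entries under "properties" that are not objects.

    Hypothetical helper: it checks only the shape rule "each field
    definition must be a map", nothing else about the mapping.
    """
    for name, definition in properties.items():
        here = f"{path}.{name}"
        if not isinstance(definition, dict):
            yield here
            continue
        sub_properties = definition.get("properties", {})
        if isinstance(sub_properties, dict):
            # Recurse into sub-fields of object or nested mappings.
            yield from find_bad_properties(sub_properties, f"{here}.properties")


# The "properties" section of the first request from the question.
first_attempt = {
    "variant": {
        "type": "nested",
        "properties": {
            "_meta": {
                "type": "nested",
                "properties": {
                    "type": "text",
                    "sourceFilePath": "s3://my_folder/my_vcf.vcf",
                },
            }
        },
    }
}

bad = list(find_bad_properties(first_attempt))
for entry in bad:
    print("not an object:", entry)
```

Both entries under the innermost "properties" are reported, because each is a plain string rather than a field definition.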
By the way, a nested mapping only makes sense when you are dealing with arrays of objects, not objects of objects.
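To illustrate why arrays of objects are the case where nested matters, here is a simplified model in Python. It assumes that a plain object field stores an array of objects grouped per sub-field, so the pairing between values of one object is lost, while nested keeps each object separate. The field name runs and the sample values WGS and 37 are invented for the example, and the two match functions are toy stand-ins, not Elasticsearch query semantics.

```python
def flatten_object_array(field, objects):
    """Group values per sub-field, the way a non-nested object field
    is assumed to store an array of objects (simplified model)."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


def flat_match(flat, field, wanted):
    """True if every wanted value occurs somewhere in its sub-field."""
    return all(v in flat.get(f"{field}.{k}", []) for k, v in wanted.items())


def nested_match(objects, wanted):
    """True only if a single object carries all wanted values."""
    return any(all(o.get(k) == v for k, v in wanted.items()) for o in objects)


samples = [
    {"sampleType": "WES", "genomeVersion": "38"},
    {"sampleType": "WGS", "genomeVersion": "37"},
]
flat = flatten_object_array("runs", samples)

# No single object is WES *and* 37, yet the flattened view matches.
query = {"sampleType": "WES", "genomeVersion": "37"}
print(flat_match(flat, "runs", query))
print(nested_match(samples, query))
```

With a single object per field, as in the mapping discussed here, there is no cross-object pairing to lose, so a plain object mapping behaves the same.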
But if you insist on having it, you would do this:
PUT /grch38_test__wes__grch38__variants__20210222/_mapping/variant?include_type_name
{
  "properties": {
    "_meta": {
      "type": "nested",          <---
      "properties": {
        "variant": {
          "type": "nested",      <---
          "properties": {
            "gencodeVersion": {
              "type": "text"
            },
            "genomeVersion": {
              "type": "text"
            },
            "hail_version": {
              "type": "text"
            },
            "sampleType": {
              "type": "text"
            },
            "sourceFilePath": {
              "type": "text"
            }
          }
        }
      }
    }
  }
}
Comments:
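So does this mean that (in the first case, where _meta is what it is supposed to be) sourceFilePath is defined when the index is created, cannot be modified at all, and there is no way around that?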
No, you can modify the shared _meta properties with my first snippet. Just make sure to remove the properties part.