Learning Elasticsearch 4: Modifying Data

Posted Huazie


Modifying Data

For the original text, see “Modifying Your Data”.

Elasticsearch provides data manipulation and search capabilities in near real time. By default, you can expect a one-second delay (the refresh interval) between the time you index, update, or delete data and the time the change appears in search results. This is an important distinction from other platforms, such as SQL databases, where data is available immediately after a transaction completes.
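To make the refresh interval concrete, here is a toy Python model of near-real-time visibility. It is only a sketch: the ToyIndex class and its methods are hypothetical names invented for illustration, and the two dictionaries merely stand in for "accepted but not yet searchable" versus "searchable" documents. It is not how Elasticsearch is implemented internally.

```python
class ToyIndex:
    """Toy model of near-real-time search (illustrative only)."""

    def __init__(self):
        self._pending = {}     # writes accepted but not yet searchable
        self._searchable = {}  # documents visible to search

    def index(self, doc_id, doc):
        # The write is accepted immediately but lands in the pending buffer.
        self._pending[doc_id] = doc

    def refresh(self):
        # Elasticsearch refreshes automatically, once per second by default;
        # here the caller triggers it by hand.
        self._searchable.update(self._pending)
        self._pending.clear()

    def search(self, field, value):
        # Only refreshed documents can match.
        return [d for d in self._searchable.values() if d.get(field) == value]


index = ToyIndex()
index.index("1", {"name": "John Doe"})
before = index.search("name", "John Doe")  # no refresh has run yet
index.refresh()
after = index.search("name", "John Doe")   # the document is now searchable
print(before, after)
```

The search that runs before the refresh finds nothing, even though the write had already succeeded; that gap is the delay described above.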

Indexing/Replacing Documents

We have already seen how to index a single document. Let’s recall that command:

PUT /customer/doc/1?pretty
{
  "name": "John Doe"
}

As before, this indexes the specified document into the customer index with an ID of 1. If we run the command again with a different (or the same) document, Elasticsearch replaces (that is, reindexes) the existing document with ID 1 with the new one:

PUT /customer/doc/1?pretty
{
  "name": "Jane Doe"
}

This changes the name of the document with ID 1 from “John Doe” to “Jane Doe”. If, on the other hand, we use a different ID, a new document is indexed and the documents already in the index remain untouched:

PUT /customer/doc/2?pretty
{
  "name": "Jane Doe"
}

This indexes a new document with an ID of 2.
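The replace-versus-create behaviour described above can be sketched with a small in-memory model. This is a simplification under stated assumptions: index_document is a hypothetical helper, a plain dictionary stands in for the customer index, and only the _version and result fields of the real index response are imitated.

```python
def index_document(store, doc_id, doc):
    """Toy stand-in for PUT /customer/doc/<id>: a whole-document replace."""
    previous = store.get(doc_id)
    version = 1 if previous is None else previous["_version"] + 1
    # The old source is discarded entirely; nothing is merged.
    store[doc_id] = {"_version": version, "_source": dict(doc)}
    return {
        "_id": doc_id,
        "_version": version,
        "result": "created" if previous is None else "updated",
    }


customer = {}  # hypothetical in-memory "index"
first = index_document(customer, "1", {"name": "John Doe"})
second = index_document(customer, "1", {"name": "Jane Doe"})  # same ID: replace
third = index_document(customer, "2", {"name": "Jane Doe"})   # new ID: create
print(first["result"], second["result"], third["result"])
```

Reusing ID 1 overwrites the stored source and bumps the version, while ID 2 creates a separate document and leaves ID 1 alone.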

When indexing, the ID is optional. If none is specified, Elasticsearch generates a random ID and uses it to index the document. The ID actually used (generated by Elasticsearch, or specified explicitly as in the previous examples) is returned as part of the index API response.

This example shows how to index a document without an explicit ID:

POST /customer/doc?pretty
{
  "name": "Jane Doe"
}

Note that this case uses the POST verb instead of PUT because no ID is specified.
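The choice between PUT and POST can be captured in a small helper that builds the HTTP request without sending it. This is a sketch, not an official client: build_index_request is a hypothetical function, and BASE_URL assumes a cluster listening on localhost:9200, which the original text does not specify.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: a local cluster


def build_index_request(index, doc, doc_id=None, doc_type="doc"):
    """Build (but do not send) an index request for /<index>/<type>[/<id>]."""
    path = f"/{index}/{doc_type}"
    if doc_id is not None:
        path += f"/{doc_id}"
    # PUT when the caller chooses the ID, POST when Elasticsearch generates it.
    method = "PUT" if doc_id is not None else "POST"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


with_id = build_index_request("customer", {"name": "Jane Doe"}, doc_id=2)
without_id = build_index_request("customer", {"name": "Jane Doe"})
print(with_id.get_method(), with_id.full_url)
print(without_id.get_method(), without_id.full_url)
```

Against a live cluster, either request could be sent with urllib.request.urlopen; the response to the POST variant would carry the generated ID.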

Updating Documents

In addition to indexing and replacing documents, we can also update them. Note, though, that Elasticsearch does not actually perform in-place updates under the hood. Whenever we do an update, Elasticsearch deletes the old document and then indexes a new document with the update applied to it, all in one shot.

This example shows how to update our previous document (ID of 1) by changing the name field to “Jane Doe”:

POST /customer/doc/1/_update?pretty
{
  "doc": { "name": "Jane Doe" }
}

This example shows how to update our previous document (ID of 1) by changing the name field to “Jane Doe” and, at the same time, adding an age field to it:

POST /customer/doc/1/_update?pretty
{
  "doc": { "name": "Jane Doe", "age": 20 }
}

Updates can also be performed with simple scripts. This example uses a script to increment the age by 5:

POST /customer/doc/1/_update?pretty
{
  "script" : "ctx._source.age += 5"
}

In the above example, ctx._source refers to the current source document that is about to be updated.
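The three updates in this section can be traced with a toy model in which every update produces a new source rather than editing the old one in place. The helpers below are hypothetical; the merge is shallow, which is enough for these flat documents, and a Python callable stands in for the script, with a dictionary playing the role of ctx.

```python
import copy


def update_with_doc(source, partial):
    """Return a NEW source with the partial document merged in (shallow merge)."""
    new_source = copy.deepcopy(source)
    new_source.update(partial)
    return new_source


def update_with_script(source, script):
    """Return a NEW source after the callable mutates a copy exposed as ctx."""
    ctx = {"_source": copy.deepcopy(source)}
    script(ctx)
    return ctx["_source"]


def add_five_years(ctx):
    # Plays the role of the script "ctx._source.age += 5".
    ctx["_source"]["age"] += 5


original = {"name": "John Doe"}
step1 = update_with_doc(original, {"name": "Jane Doe"})
step2 = update_with_doc(step1, {"name": "Jane Doe", "age": 20})
step3 = update_with_script(step2, add_five_years)
print(step3)  # {'name': 'Jane Doe', 'age': 25}
```

Each step leaves its input untouched, mirroring the delete-then-reindex behaviour described earlier in this section.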

Elasticsearch can also update multiple documents that match a query condition (like an SQL UPDATE-WHERE statement). See the docs-update-by-query API.

Deleting Documents

Deleting a document is fairly straightforward. This example shows how to delete our previous customer with the ID of 2:

DELETE /customer/doc/2?pretty

See the _delete_by_query API to delete all documents matching a specific query. Note that deleting a whole index is much more efficient than deleting all of its documents with the Delete By Query API.
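For comparison, the two alternatives mentioned above could be built as follows. This is a hedged sketch that constructs the requests without sending them: the helper names are hypothetical, BASE_URL assumes a cluster on localhost:9200, and the match query on the name field is only an illustrative condition.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: a local cluster


def build_delete_by_query_request(index, query):
    """Build a POST /<index>/_delete_by_query request for the given query."""
    return urllib.request.Request(
        f"{BASE_URL}/{index}/_delete_by_query",
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_delete_index_request(index):
    """Build a DELETE /<index> request, which removes the whole index."""
    return urllib.request.Request(f"{BASE_URL}/{index}", method="DELETE")


by_query = build_delete_by_query_request("customer", {"match": {"name": "Jane Doe"}})
whole_index = build_delete_index_request("customer")
print(by_query.get_method(), by_query.full_url)
print(whole_index.get_method(), whole_index.full_url)
```

The first request removes only the matching documents; the second drops the index itself, which is the cheaper option when every document has to go.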

Batch Processing

In addition to indexing, updating, and deleting individual documents, Elasticsearch can perform any of these operations in batches using the _bulk API. This matters because it provides a very efficient mechanism for carrying out multiple operations as fast as possible with as few network round trips as possible.

As a quick example, the following call indexes two documents (ID 1 - John Doe and ID 2 - Jane Doe) in one bulk operation:

POST /customer/doc/_bulk?pretty
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }

This example updates the first document (ID of 1) and then deletes the second document (ID of 2) in one bulk operation:

POST /customer/doc/_bulk?pretty
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}

Note that the delete action has no corresponding source document after it, since a delete only requires the ID of the document to be deleted.
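Because a bulk body is newline-delimited JSON, it is easy to get wrong when assembled by hand. The sketch below builds the two bodies used above; build_bulk_body is a hypothetical helper, and each action is passed as an (action, metadata, source) tuple with None as the source for deletes.

```python
import json


def build_bulk_body(actions):
    """Serialize (action, metadata, source) tuples into a _bulk request body."""
    lines = []
    for action, meta, source in actions:
        lines.append(json.dumps({action: meta}))  # action/metadata line
        if source is not None:                    # deletes carry no source line
            lines.append(json.dumps(source))
    # One JSON object per line, and the body must end with a newline.
    return "\n".join(lines) + "\n"


index_body = build_bulk_body([
    ("index", {"_id": "1"}, {"name": "John Doe"}),
    ("index", {"_id": "2"}, {"name": "Jane Doe"}),
])
mixed_body = build_bulk_body([
    ("update", {"_id": "1"}, {"doc": {"name": "John Doe becomes Jane Doe"}}),
    ("delete", {"_id": "2"}, None),
])
print(index_body, end="")
print(mixed_body, end="")
```

The first body has four lines and the second has three, because the delete action contributes only its metadata line.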

The Bulk API does not fail as a whole when one of its actions fails. If a single action fails for whatever reason, processing continues with the remaining actions after it. When the bulk call returns, it provides a status for each action, in the same order the actions were sent, so that you can check whether a specific action failed.
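Checking the per-action statuses might look like the sketch below. The failed_actions helper is hypothetical, and sample_response is a hand-written, abbreviated stand-in for a bulk response (not captured from a real cluster): it keeps only the items list, where each entry is keyed by its action name and failed entries carry an error object.

```python
def failed_actions(bulk_response):
    """Return (position, action, _id, status) for every action that failed."""
    failures = []
    for position, item in enumerate(bulk_response["items"]):
        # Each item holds exactly one key: the action name.
        action, result = next(iter(item.items()))
        if "error" in result:
            failures.append((position, action, result["_id"], result["status"]))
    return failures


# Hypothetical, abbreviated response: the update targets a missing document.
sample_response = {
    "errors": True,
    "items": [
        {"index": {"_id": "1", "status": 201}},
        {"update": {"_id": "3", "status": 404,
                    "error": {"type": "document_missing_exception"}}},
        {"delete": {"_id": "2", "status": 200}},
    ],
}

failures = failed_actions(sample_response)
print(failures)  # [(1, 'update', '3', 404)]
```

Since the items come back in the order the actions were sent, the position in each tuple identifies the failing action in the original request.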
