Python: generating Elasticsearch bulk data

# coding:utf-8

import codecs
import json
import uuid
import random
import collections as cl
 
def main():
    firstname = [
        "adam",
        "beth",
        "curl",
        "denym",
        "elizabeth",
        "flora",
        "genome",
        "hyne",
        "iila",
        "john",
        "kate",
        "lamberd",
        "mike"
    ]
    lastname = [
        "aliandra",
        "bonum",
        "cata",
        "delis",
        "endo",
        "fobus",
        "goldbaum",
        "hinelich",
        "ilunums",
        "joshua",
        "kasperski",
        "lindbirg",
        "mizur"
    ]
    gender = [
        "M", "F"
    ]

    # Write newline-delimited JSON: one action line, then one source line, per document.
    # The with-block closes the file even if an error occurs.
    with codecs.open('./test.json', 'w', 'utf-8') as fw:
        for i in range(50):
            # Action metadata. Elasticsearch index names must be lowercase, and each
            # document needs its own _id or every entry overwrites the same document.
            predata = cl.OrderedDict()
            predata["_index"] = "testindex"
            # Mapping types are deprecated in Elasticsearch 7.x and removed in 8.x.
            predata["_type"] = "testType"
            predata["_id"] = str(i)
            com = cl.OrderedDict()
            com["index"] = predata

            # Document source with randomly chosen field values.
            data = cl.OrderedDict()
            data["account_number"] = str(uuid.uuid4())
            data["firstname"] = random.choice(firstname)
            data["lastname"] = random.choice(lastname)
            data["age"] = random.randint(15, 100)
            data["gender"] = random.choice(gender)

            fw.write(json.dumps(com) + '\n' + json.dumps(data) + '\n')

 
if __name__=='__main__':
    main()
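
One way to sanity-check the generated file before sending it anywhere is to read it back and confirm that it alternates action lines and source lines. The sketch below assumes Python 3 and the two-lines-per-document layout that the script above writes to ./test.json. The name read_bulk_pairs is a hypothetical helper, not part of any Elasticsearch client.

```python
import json


def read_bulk_pairs(path):
    """Yield (metadata, source) tuples from a newline-delimited bulk file.

    Assumption: every action is an "index" action followed by exactly one
    source line, which is the layout the generator script produces.
    """
    with open(path, encoding="utf-8") as f:
        # Ignore blank lines so a trailing newline does not count as a record.
        lines = [line for line in f if line.strip()]

    if len(lines) % 2 != 0:
        raise ValueError("expected an even number of lines (action + source)")

    # Even positions hold action lines, odd positions hold source lines.
    for action_line, source_line in zip(lines[0::2], lines[1::2]):
        action = json.loads(action_line)
        if "index" not in action:
            raise ValueError("unexpected action line: %r" % action_line)
        yield action["index"], json.loads(source_line)
```

Calling list(read_bulk_pairs("./test.json")) after running the generator should give one tuple per generated document, and a ValueError points at a malformed or truncated file.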

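The file uses the newline-delimited layout that the Elasticsearch Bulk API (POST /_bulk) expects, sent with the Content-Type: application/x-ndjson header. A minimal sketch of building such a request with the Python 3 standard library follows. The address http://localhost:9200 and the function name build_bulk_request are assumptions for illustration, and a real cluster may also require authentication or TLS.

```python
import urllib.request


def build_bulk_request(path, base_url="http://localhost:9200"):
    """Build (but do not send) a POST request for the _bulk endpoint.

    base_url is an assumed local node address; adjust it for your cluster.
    """
    with open(path, "rb") as f:
        body = f.read()

    # The bulk body must end with a newline character.
    if not body.endswith(b"\n"):
        body += b"\n"

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_bulk",
        data=body,
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it to a reachable node. The JSON response contains an errors flag and a per-document items list, which is worth checking because a bulk request can partly fail.
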