Python & JSON - Appending to String Object


【Posted】: 2021-09-26 01:35:32

【Question】:

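I'm relatively new to working with JSON and Python, and I'm having an issue when trying to append to an existing JSON object. First, some pseudocode:

    I'd like to use a JSON "template" for each append, so I created one
    The first item in the new list will always be this template (index 0)
    When a new record is found, the template is appended to the object and then populated

I've tried json.update and dict.append, and simply concatenating to a string, but I always get either "list index out of range" (the current error, code below) or append errors.

Please help!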

def parse(self, response): 
    # Create Store JSON template
    unit_JSON_template = {
        "Store": [
        {
            "ID": "",
            "Seller":
            {
                "Name": ""
            },
            "Detail":
            {
                "StoreURL": "",
                "Title": "",
                "Stock": "",
                "Other": "",
                "Images": [
                {
                    "Url": "",
                    "Encode": "",
                    "Title": ""
                }
                ],
                "Request":
                {
                    "DateTime": "",
                    "RequestHeaders": "",
                    "ResponseHeaders": ""
                }
            }
        }
        ],
    }

    # Convert template string to JSON
    unit_JSON_str = json.dumps(unit_JSON_template, indent = 4, separators = (", ", ": "), sort_keys = False)
    print(unit_JSON_str)
    unit_JSON_obj = json.loads(unit_JSON_str)
    unit_JSON = unit_JSON_obj

    # Create identifying information 
    record = response.url.split("/")[2] + "-" + response.url.split("/")[-2]
    record_timestamp = datetime.now().strftime("%m%d%Y%-H%M%S")
    page_filename = f'{record}-{record_timestamp}.html'
    screenshot_filename = f'{record}-{record_timestamp}.png'

    # Parse data, load to JSON object for Insert to SQL
    data_units = response.xpath("//candy-stores")
    print("Units Found: " + str(len(data_units)))
    
    # Loop over each object and insert into JSON object (index 0 always template above)
    for i, data_unit in enumerate(data_units):
        i_1 = i + 1 #do this since template is always index 0
        unit_JSON.update(unit_JSON_obj)
        print(json.dumps(unit_JSON, indent=4))

        unit_JSON['Store'][i_1]['Detail']['Title'] = "Store Name"
        unit_JSON['Store'][i_1]['Detail']['StoreURL'] = "Unit"
        unit_JSON['Store'][i_1]['Detail']['Request']['DateTime'] = "12:00pm"
        unit_JSON['Store'][i_1]['Detail']['Other'] = "Additional Data"
        unit_JSON['Store'][i_1]['Seller']['Name'] = "ABC Candy"

【Comments on the question】:

Can you add a printout of data_units?

【Answer 1】:

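You are overcomplicating this. In your Python code, don't work with JSON; work with a list of dicts, and convert to JSON only when you actually need to provide the JSON.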
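
A minimal sketch of that idea, using made-up store values: the records live in an ordinary list of dicts, and `json.dumps` is called only once, at the point where a JSON string is actually needed. The `stores` name and its contents are illustrative assumptions, not part of the original code.

```python
import json

# Hypothetical records; in the real spider these would come from the scraped page.
stores = []
stores.append({"ID": "1", "Seller": {"Name": "ABC Candy"}})
stores.append({"ID": "2", "Seller": {"Name": "XYZ Sweets"}})

# While the data is being built it is plain Python: lists and dicts.
stores[0]["Seller"]["Name"] = "ABC Candy Co."

# Serialize only at the boundary, when a JSON string is actually required.
json_text = json.dumps(stores, indent=4)

# json.dumps returns a str; json.loads turns it back into lists and dicts.
round_tripped = json.loads(json_text)
print(type(json_text).__name__, type(round_tripped).__name__)
```

Seen this way, a `json.dumps` followed directly by `json.loads` just produces a copy of the data that was already there, so it adds nothing while the records are still being built.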

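Here is how I would approach it: a simple helper function that builds a new dict item to add to a list of dict items. I've included some examples of printing, assigning a JSON string, and saving a JSON file.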

You will still need to handle the images in not_a_json_object; I didn't include that part.
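
One possible way to fill in the images part, sketched under assumptions: the `make_image` helper and the example URLs are hypothetical, and the three keys simply mirror the `Images` entry of the question's template.

```python
import json


def make_image(url="", encode="", title=""):
    """Build one entry for the "Images" list (keys taken from the question's template)."""
    return {"Url": url, "Encode": encode, "Title": title}


# Hypothetical scraped values; replace with whatever the page actually provides.
images = [
    make_image(url="https://example.com/a.png", title="Front"),
    make_image(url="https://example.com/b.png", title="Back"),
]

# The list can then be placed under Detail -> Images of a store record.
detail = {"Title": "Store Name", "Images": images}
print(json.dumps(detail, indent=4))
```

A list built like this could be passed through the `images` parameter of `new_dict_from_template`.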

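As for the IndexError: you compute i_1 = i + 1 and then try to access the list element at that index. The Store list in your template only ever contains one element (index 0), and unit_JSON.update(unit_JSON_obj) does not append anything to it, so unit_JSON['Store'][1] is out of range on the very first iteration.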
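
The failure can be reproduced in a few lines. This is a trimmed-down sketch with a shortened template, not the original spider: `dict.update` with the same dict adds nothing to the `Store` list, so the list keeps its single element and index 1 does not exist. Appending a `copy.deepcopy` of the template entry first is one way to make the index valid.

```python
import copy

# Shortened stand-in for the question's template (assumption: one entry in "Store").
template = {"Store": [{"ID": "", "Detail": {"Title": ""}}]}

unit = template            # same object, not a copy
unit.update(template)      # updating a dict with itself changes nothing
print(len(unit["Store"]))  # the list still has exactly one element

try:
    unit["Store"][1]["Detail"]["Title"] = "Store Name"
except IndexError as exc:
    print("IndexError:", exc)

# One possible fix: append an independent copy of the template entry, then fill it.
new_entry = copy.deepcopy(template["Store"][0])
unit["Store"].append(new_entry)
unit["Store"][1]["Detail"]["Title"] = "Store Name"

# The template entry at index 0 is untouched because deepcopy made a separate object.
print(unit["Store"][0]["Detail"]["Title"] == "")
```

`copy.deepcopy` matters here because the template contains nested dicts; a shallow copy would still share the inner `Detail` dict with the template.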

import json
from datetime import datetime


def new_dict_from_template(
    storeid="",
    sellername="",
    storeurl="",
    title="",
    stock="",
    other="",
    images=None,  # None instead of [] so calls never share one mutable default list
    req_datetime="",  # named to prevent name collision
    requestheaders="",
    responseheaders="",
) -> dict:
    """Return a new, independent store record filled from the arguments."""
    return {
        "Store": [
            {
                "ID": storeid,
                "Seller": {"Name": sellername},
                "Detail": {
                    "StoreURL": storeurl,
                    "Title": title,
                    "Stock": stock,
                    "Other": other,
                    "Images": images if images is not None else [],
                    "Request": {
                        "DateTime": req_datetime,
                        "RequestHeaders": requestheaders,
                        "ResponseHeaders": responseheaders,
                    },
                },
            }
        ],
    }


def parse(self, response):
    # Create identifying information
    record = response.url.split("/")[2] + "-" + response.url.split("/")[-2]
    record_timestamp = datetime.now().strftime("%m%d%Y%-H%M%S")
    page_filename = f"{record}-{record_timestamp}.html"
    screenshot_filename = f"{record}-{record_timestamp}.png"

    # Parse data, load to JSON object for Insert to SQL
    data_units = response.xpath("//candy-stores")
    print("Units Found: " + str(len(data_units)))

    # create a list
    not_a_json_object = []

    # Loop over each object and append to list
    for i, data_unit in enumerate(data_units):
        # Indexing is not really needed unless you use it in the JSON
        # you need to change the method call below
        # to match your data_unit
        not_a_json_object.append(
            new_dict_from_template(
                storeid="no idea",  # the helper's parameter is storeid, not id
                sellername="ABC Candy",
                other="Additional Data",
                req_datetime="12:00pm",
            )
        )


# note the [ and ] to make this a list of dicts
not_a_json_object = [new_dict_from_template()]
print(json.dumps(not_a_json_object, indent=4))
print("----------------------------------------")
not_a_json_object.append(
    new_dict_from_template(
        storeid="no idea",
        sellername="ABC Candy",
        other="Additional Data",
        req_datetime="12:00pm",
    )
)
print(json.dumps(not_a_json_object, indent=4))
print("----------------------------------------")
# just taking one element of the list is still a valid JSON
json_str = json.dumps(not_a_json_object[1], indent=4) 
print(json_str)

with open("somejsonfile.json", "w") as outfile:
    json.dump(not_a_json_object, outfile, indent=4)
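
To check what was written, the file can be read back with `json.load`. The sketch below is self-contained and writes to a temporary directory rather than `somejsonfile.json`; the `records` list is a made-up stand-in for `not_a_json_object`.

```python
import json
import os
import tempfile

# Stand-in for not_a_json_object (assumed shape: a list of dicts).
records = [{"Store": [{"ID": "no idea", "Seller": {"Name": "ABC Candy"}}]}]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "stores.json")

    # json.dump writes straight to the open file object.
    with open(path, "w", encoding="utf-8") as outfile:
        json.dump(records, outfile, indent=4)

    # json.load parses the file back into Python lists and dicts.
    with open(path, encoding="utf-8") as infile:
        loaded = json.load(infile)

print(loaded == records)
print(loaded[0]["Store"][0]["Seller"]["Name"])
```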

【Comments】:
