How to convert each row/cell values from a DataFrame to a list of dictionaries in pandas?
[Posted]: 2022-01-19 02:04:14
[Question]: I have a pandas DataFrame below:
df_input = pd.DataFrame({
    'Domain': ['www.google.com', 'www.apple.com', 'www.amazon.com'],
    'Description': ['The company’s product portfolio includes Googl...', 'Apple is a multinational corporation that desi...', 'Amazon is an international e-commerce website ...'],
    'About Us': ['Google is a multinational corporation that spe...', 'Apple is a multinational corporation that desi...', 'Amazon is an e-commerce website for consumers,...'],
    'Founded': [1998, 1976, 1994],
    'Country': ['United States', 'United States', 'United States']})
I want to transform it as follows:
[{"properties": [{"name": "Description", "value": "The company’s product portfolio includes Google Search, which provides users with access to information online; Knowledge Graph that allows users to search for things, people, or places as well as builds systems recognizing speech and understanding"},
                 {"name": "Domain", "value": "www.google.com"},
                 {"name": "About Us", "value": "Google is a multinational corporation that specializes in Internet-related services and products."},
                 {"name": "Founded", "value": 1998},
                 {"name": "Country", "value": "United States"}]},
 {"properties": [{"name": "Description", "value": "Apple is a multinational corporation that designs, manufactures, and markets mobile communication and media devices, personal computers, portable digital music players, and sells a variety of related software, services, peripherals, networking solutions, and third-party digital content and applications."},
                 {"name": "Domain", "value": "www.apple.com"},
                 {"name": "About Us", "value": "Apple is a multinational corporation that designs, manufactures, and markets consumer electronics, personal computers, and software."},
                 {"name": "Founded", "value": 1976},
                 {"name": "Country", "value": "United States"}]},
 {"properties": [{"name": "Description", "value": "Amazon is an international e-commerce website for consumers, sellers, and content creators. It offers users merchandise and content purchased for resale from vendors and those offered by third-party sellers."},
                 {"name": "Domain", "value": "www.amazon.com"},
                 {"name": "About Us", "value": "Amazon is an e-commerce website for consumers, sellers, and content creators."},
                 {"name": "Founded", "value": 1994},
                 {"name": "Country", "value": "United States"}]}]
How can I write a loop to do this?
[Answer 1]: You just need to iterate over the rows.
import pandas as pd
from pprint import pprint

df_input = pd.DataFrame({
    'Domain': ['www.google.com', 'www.apple.com', 'www.amazon.com'],
    'Description': ['The companys product portfolio includes Googl...', 'Apple is a multinational corporation that desi...', 'Amazon is an international e-commerce website ...'],
    'About Us': ['Google is a multinational corporation that spe...', 'Apple is a multinational corporation that desi...', 'Amazon is an e-commerce website for consumers,...'],
    'Founded': [1998, 1976, 1994],
    'Country': ['United States', 'United States', 'United States']})

data = []
# iterrows() yields (index, Series) pairs, one per row.
for _, row in df_input.iterrows():
    props = []
    # Pair each column name with the row's value for that column.
    for key, val in zip(df_input.columns, row.values):
        props.append({'name': key, 'value': val})
    data.append({'properties': props})

pprint(data)
Output:
[{'properties': [{'name': 'Domain', 'value': 'www.google.com'},
                 {'name': 'Description',
                  'value': 'The companys product portfolio includes Googl...'},
                 {'name': 'About Us',
                  'value': 'Google is a multinational corporation that spe...'},
                 {'name': 'Founded', 'value': 1998},
                 {'name': 'Country', 'value': 'United States'}]},
 {'properties': [{'name': 'Domain', 'value': 'www.apple.com'},
                 {'name': 'Description',
                  'value': 'Apple is a multinational corporation that desi...'},
                 {'name': 'About Us',
                  'value': 'Apple is a multinational corporation that desi...'},
                 {'name': 'Founded', 'value': 1976},
                 {'name': 'Country', 'value': 'United States'}]},
 {'properties': [{'name': 'Domain', 'value': 'www.amazon.com'},
                 {'name': 'Description',
                  'value': 'Amazon is an international e-commerce website ...'},
                 {'name': 'About Us',
                  'value': 'Amazon is an e-commerce website for consumers,...'},
                 {'name': 'Founded', 'value': 1994},
                 {'name': 'Country', 'value': 'United States'}]}]
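[Comments]:
Thanks Tim! This is exactly what I was hoping for. Good to know iterrows() does the trick.
I do need to give you the obligatory warning that iterating over rows or columns in pandas is slow. If possible, it is better to find a bulk operation. In this case, I don't think there is an alternative.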
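On the speed warning above: one variation that may be worth trying is to let pandas build the per-row dictionaries with DataFrame.to_dict(orient='records') and then reshape each one. This is only a sketch, not the answerer's method. It still loops in Python, so it is not a true bulk operation, but it avoids constructing a Series for every row the way iterrows() does. The frame below is a trimmed copy of the question's df_input (two rows, three columns) used purely for illustration.

```python
import pandas as pd

# Trimmed copy of the question's df_input, for illustration only.
df_input = pd.DataFrame({
    'Domain': ['www.google.com', 'www.apple.com'],
    'Founded': [1998, 1976],
    'Country': ['United States', 'United States'],
})

# to_dict(orient='records') returns one {column: value} dict per row.
# Each of those dicts is reshaped into the name/value list from the question.
data = [
    {'properties': [{'name': key, 'value': val} for key, val in record.items()]}
    for record in df_input.to_dict(orient='records')
]

print(data[0])
```

The entries come out in column order, so reordering the columns first (for example df_input[['Founded', 'Domain', 'Country']]) changes the order within each 'properties' list.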