Convert a Python Pandas DataFrame to JSON format and save it to a MongoDB database, adding its column names, using Python
【Posted】: 2017-10-24 08:48:10
【Question】: Convert a DataFrame to JSON, add the field names skill and suggestions as shown in the desired output, and then save the result in a MongoDB collection.
Input: a Python Pandas DataFrame
0 1 2 3 4 5 6 7
java hadoop java hdfs c c++ php python html
c c c++ hdfs python hadoop java php html
c++ c++ c python hdfs hadoop java php html
hadoop hadoop java hdfs c c++ php python html
hdfs hdfs hadoop java c c++ python php html
python python c++ html c php hdfs hadoop java
Desired output, saved in a MongoDB collection
"_id" : ObjectId("5922a781205a763b55e2e90e"), "skill" : "java", "suggestions" : [ "hadoop", "java", "hdfs", "c", "c++", "php ", "python", "html" ]
"_id" : ObjectId("5922a781205a763b55e2e91e"), "skill" : "c", "suggestions" : [ "c", "c++", "hdfs", "python", "hadoop", "java ", "php", "html" ]
"_id" : ObjectId("5922a781205a763b55e2e92e"), "skill" : "c++", "suggestions" : [ "c++", "c", "python", "hdfs", "hadoop", "java ", "php", "html" ]
"_id" : ObjectId("5922a781205a763b55e2e93e"), "skill" : "hadoop", "suggestions" : [ "hadoop", "java", "hdfs", "c", "c++", "php ", "python", "html" ]
【Answer 1】: First, convert the data into the required format.
import pandas as pd

strlist = [['java','hadoop','java','hdfs','c','c++','php','python','html'],
           ['c','c','c++','hdfs','python','hadoop','java','php','html'],
           ['c++','c++','c','python','hdfs','hadoop','java','php','html'],
           ['hadoop','hadoop','java','hdfs','c','c++','php','python','html'],
           ['hdfs','hdfs','hadoop','java','c','c++','python','php','html'],
           ['python','python','c++','html','c','php','hdfs','hadoop','java']]
df = pd.DataFrame(strlist)
# I guess you need the following code.
# Build the list column first: once 'skill' is added, df.columns[1:] would include it as well.
df['suggestions'] = df[df.columns[1:]].values.tolist()
# Column 0 holds the skill name.
df['skill'] = df[0]
df = df[['skill', 'suggestions']]
print(df)
    skill                                      suggestions
0    java  [hadoop, java, hdfs, c, c++, php, python, html]
1       c  [c, c++, hdfs, python, hadoop, java, php, html]
2     c++  [c++, c, python, hdfs, hadoop, java, php, html]
3  hadoop  [hadoop, java, hdfs, c, c++, php, python, html]
4    hdfs  [hdfs, hadoop, java, c, c++, python, php, html]
5  python  [python, c++, html, c, php, hdfs, hadoop, java]
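The table in the question has only eight column labels (0 to 7) for nine values per row, which suggests the skill names may actually be the DataFrame index rather than column 0. If that is the case, the reshaping could look like the sketch below. The helper name `index_to_documents` and the two sample rows are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Assumption: skill names are the index, columns 0-7 hold the suggestions.
rows = {
    "java": ["hadoop", "java", "hdfs", "c", "c++", "php", "python", "html"],
    "c": ["c", "c++", "hdfs", "python", "hadoop", "java", "php", "html"],
}
df = pd.DataFrame.from_dict(rows, orient="index")


def index_to_documents(frame):
    """Return one {'skill', 'suggestions'} dict per row of `frame` (hypothetical helper)."""
    return [
        {"skill": skill, "suggestions": values}
        for skill, values in zip(frame.index, frame.values.tolist())
    ]


documents = index_to_documents(df)
print(documents[0])
```

Each resulting dict already has the shape of the desired MongoDB document, minus the `_id` field that MongoDB generates on insert.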
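Then insert the DataFrame into the MongoDB database.
import json

# `collection` is assumed to be an existing pymongo Collection object.
# Transposing makes each row one top-level JSON entry; list() turns the dict view into a list of documents.
records = list(json.loads(df.T.to_json()).values())
collection.insert_many(records)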
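The answer does not show how `collection` is created. As a rough end-to-end sketch, the conversion can also be done with `DataFrame.to_dict("records")`, and the collection obtained through `pymongo.MongoClient`. The connection URI, the database name `test`, the collection name `skills`, and both helper names are placeholders chosen for illustration.

```python
import pandas as pd


def frame_to_records(frame):
    """Convert a DataFrame into a list of plain dicts, one per row."""
    return frame.to_dict("records")


def save_records(records, uri="mongodb://localhost:27017",
                 db_name="test", coll_name="skills"):
    """Insert the records into MongoDB; URI and names are placeholder assumptions."""
    # Imported here so the conversion step can run without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    try:
        result = client[db_name][coll_name].insert_many(records)
        return result.inserted_ids
    finally:
        client.close()


df = pd.DataFrame({
    "skill": ["java", "c"],
    "suggestions": [["hadoop", "java", "hdfs"], ["c", "c++", "hdfs"]],
})
records = frame_to_records(df)
print(records)
# save_records(records)  # requires a running MongoDB server
```

`insert_many` adds an `_id` to every document that lacks one, which is where the `ObjectId` values in the desired output come from.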