Dataframe column value mapping
【Title】: Dataframe column value mapping 【Posted】: 2021-09-29 03:12:23 【Question】: I am new to Python and pandas and want to use as much of pandas' built-in functionality as possible.
import numpy as np
import pandas as pd

data = {'source': ['Iowa', 'New York', 'San Jose', 'Houston', 'Houston'],
        'target': ['New York', 'San Jose', 'Iowa', 'San Jose', 'Arizona']}
print(np.arange(10).reshape((10, 1)))
data = [['Iowa', 'New York', 1], ['New York', 'San Jose', 1], ['San Jose', 'Iowa', 1], ['Houston', 'San Jose', 1], ['Houston', 'Arizona', 1]]
dataDf = pd.DataFrame(data, columns=['Source', 'Target', 'value'])
print(dataDf)
# I created a list of unique names
nameIndex = {'name': ['Iowa', 'New York', 'San Jose', 'Houston', 'Arizona'],
             'index': [0, 1, 2, 3, 4]}
# Now I want to replace the source and target values (names) with the matching index from nameIndex (0, 1, 2, 3, 4)
# I could use a for loop but want to avoid it, so I am not including a loop solution here
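Here I want to replace the names in the "source" and "target" columns with their indices. How can I achieve this with DataFrame functionality? My expected data is: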
data = {'source': ['0', '1', '2', '3', '3'],
        'target': ['1', '2', '0', '2', '4']}
【Answer 1】: You can convert the nameIndex lists into a dictionary and use .map:
nameIndex = {k: v for k, v in zip(nameIndex["name"], nameIndex["index"])}
dataDf["Source"] = dataDf["Source"].map(nameIndex)
dataDf["Target"] = dataDf["Target"].map(nameIndex)
print(dataDf)
Prints:
   Source  Target  value
0       0       1      1
1       1       2      1
2       2       0      1
3       3       2      1
4       3       4      1
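For reference, here is a self-contained sketch of the same idea that maps both columns in one step. The names `edges`, `name_index`, `name_to_index`, `name_cols` and `mapped` are illustrative choices and do not come from the question. It assumes every name in the two columns appears in the lookup, because `Series.map` returns NaN for keys missing from the dictionary.

```python
import pandas as pd

# Edge list from the question: each row is (source name, target name, value).
edges = pd.DataFrame(
    [
        ["Iowa", "New York", 1],
        ["New York", "San Jose", 1],
        ["San Jose", "Iowa", 1],
        ["Houston", "San Jose", 1],
        ["Houston", "Arizona", 1],
    ],
    columns=["Source", "Target", "value"],
)

# Lookup table from the question, kept as two parallel lists.
name_index = {
    "name": ["Iowa", "New York", "San Jose", "Houston", "Arizona"],
    "index": [0, 1, 2, 3, 4],
}

# Build a plain name -> index dictionary (hypothetical variable name).
name_to_index = dict(zip(name_index["name"], name_index["index"]))

# Map both name columns at once; names missing from the dictionary
# would come back as NaN.
mapped = edges.copy()
name_cols = ["Source", "Target"]
mapped[name_cols] = mapped[name_cols].apply(lambda col: col.map(name_to_index))

print(mapped)
```

If the indices are needed as strings, as in the expected data in the question, `mapped[name_cols].astype(str)` should convert them after mapping.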