ValueError: Cannot use name of an existing column for indicator column
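【Title】: ValueError: Cannot use name of an existing column for indicator column 【Posted】: 2018-07-18 01:38:39 【Question】: I have a DataFrame, say df, with names and ages. Inside a for loop I generate another DataFrame with names and genders, and I need to merge each DataFrame generated in the loop with df so that df ends up with the gender. Before tackling the real problem, I tried the code below.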
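import pandas as pd
# base frame: names and ages
d = {'Age': [45, 38], 'Name': ['John', 'Emily']}
df = pd.DataFrame(data=d)
# first gender frame (as it would be produced inside the loop)
d1 = {'Gender': ['M'], 'Name': ['John']}
df1 = pd.DataFrame(data=d1)
# first merge: adds a '_merge' indicator column
df3 = df.merge(df1, on=['Name'], how='left', indicator=True)
df3
# second gender frame
d2 = {'Gender': ['F'], 'Name': ['Emily']}
df4 = pd.DataFrame(data=d2)
# second merge: this line raises the ValueError
df5 = df3.merge(df4, on=['Name'], how='left', indicator=True)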
I get the following error when running the last line.
"Cannot use name of an existing column for indicator column")
ValueError: Cannot use name of an existing column for indicator column
Can you suggest how I can solve this in Python 3.x?
【Comments】:
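【Answer 1】: There are better ways to accomplish what you are trying to do (as the other answer shows). But to understand why you are getting the error, read on.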
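Because you have already merged once, df3 now contains a column named _merge. When you merge again with indicator=True, pandas cannot create a second column with the same _merge name, so it raises the error.
By the way, for future reference: you currently pass indicator=True, but you can also pass a string, for example indicator='exists'. The new column that "indicates" how each row was joined will then be called exists, and you can select it with df5['exists'].
Look at this simple example and step through it in a REPL: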
>>> df1
  col1 col2
0    a    b
1    b    c
2    d    e
>>> df2
  col1 col2
0    a    b
1    b    c
>>> df1.merge(df2, on='col1', how='left', indicator=True)
  col1 col2_x col2_y     _merge
0    a      b      b       both
1    b      c      c       both
2    d      e    NaN  left_only
>>> df3 = df1.merge(df2, on='col1', how='left', indicator=True)
>>> df4 = pd.DataFrame([['d', 'e']], columns=['col1', 'col2'])
>>> df3.merge(df4, on='col1', how='left', indicator=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/devtools/uat/anaconda4321/lib/python3.6/site-packages/pandas/core/frame.py", line 4722, in merge
    copy=copy, indicator=indicator)
  File "/usr/local/devtools/uat/anaconda4321/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 54, in merge
    return op.get_result()
  File "/usr/local/devtools/uat/anaconda4321/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 567, in get_result
    self.left, self.right)
  File "/usr/local/devtools/uat/anaconda4321/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 605, in _indicator_pre_merge
    "Cannot use name of an existing column for indicator column")
ValueError: Cannot use name of an existing column for indicator column
>>> df3.merge(df4, on='col1', how='left', indicator='exists')
  col1 col2_x col2_y     _merge col2     exists
0    a      b      b       both  NaN  left_only
1    b      c      c       both  NaN  left_only
2    d      e    NaN  left_only    e       both
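For the data in the question, the same failure and two possible ways around it can be sketched as a standalone script. This is only an illustration under stated assumptions: the variable names error_message, merged_a and merged_b are made up here, and dropping the old _merge column is one option among several, not something the question or this answer prescribes.

```python
import pandas as pd

# Data from the question.
df = pd.DataFrame({'Age': [45, 38], 'Name': ['John', 'Emily']})
df1 = pd.DataFrame({'Gender': ['M'], 'Name': ['John']})
df4 = pd.DataFrame({'Gender': ['F'], 'Name': ['Emily']})

# The first merge adds an indicator column named '_merge'.
df3 = df.merge(df1, on='Name', how='left', indicator=True)

# A second merge with indicator=True fails, because '_merge' already exists.
try:
    df3.merge(df4, on='Name', how='left', indicator=True)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Option A (assumption: the old indicator is no longer needed):
# drop '_merge' before merging again.
merged_a = df3.drop(columns='_merge').merge(
    df4, on='Name', how='left', indicator=True
)

# Option B: keep the old indicator and give the new one its own name.
merged_b = df3.merge(df4, on='Name', how='left', indicator='exists')

print(error_message)
print(merged_b[['Name', '_merge', 'exists']])
```

In a loop, option B would need a different indicator name on every iteration (for example one built from the loop counter), whereas option A keeps a single _merge column that always describes the most recent merge.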
【Comments】:
【Answer 2】: I would take a different route from the one others have in mind. I would use map():
# merging both gender dataframes together for convenience
gender = pd.concat([df1,df4])
# creating a column the same as 'Name' but calling it gender
df['Gender'] = df['Name']
# creating a dictionary with the name as the key, and gender as value
gender_dict = gender.set_index('Name')['Gender'].to_dict()
# gender_dict is now {'John': 'M', 'Emily': 'F'}
# remapping the name in place of the gender
df['Gender'] = df['Gender'].map(gender_dict)
   Age   Name Gender
0   45   John      M
1   38  Emily      F
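A more compact variant of the same idea, written as a self-contained sketch: collect the per-iteration gender frames in a list, concatenate them once, and map the names through a Series instead of a dict. The list name gender_frames and the variable lookup are hypothetical, and the sketch assumes each name appears only once across the gender frames, since a lookup with duplicate index entries would not map cleanly.

```python
import pandas as pd

df = pd.DataFrame({'Age': [45, 38], 'Name': ['John', 'Emily']})

# Stand-in for the frames produced inside the for loop (hypothetical list).
gender_frames = [
    pd.DataFrame({'Gender': ['M'], 'Name': ['John']}),
    pd.DataFrame({'Gender': ['F'], 'Name': ['Emily']}),
]

# Combine all gender frames once, after the loop has finished.
gender = pd.concat(gender_frames, ignore_index=True)

# Series indexed by name; Series.map looks each name up in this index.
lookup = gender.set_index('Name')['Gender']

# Names missing from the lookup would end up as NaN.
df['Gender'] = df['Name'].map(lookup)

print(df)
```

Because no merge is involved, no indicator column is created and the ValueError cannot occur.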
【Comments】: