Error when merging columns of a pandas dataframe
【Title】:Error when merging columns of a pandas dataframe 【Posted】:2018-02-16 18:58:27 【Question】:I have this dataframe:
Telefone1 Telefone2
CNPJ
44167450000149 1332385314 1332385314
56095862000108 2125439090 2125439090
59664391000191 1143990005 1143990005
I want to merge "Telefone1" and "Telefone2" into a single column. It should look like this:
Telefone
CNPJ
44167450000149 1332385314,1332385314
56095862000108 2125439090,2125439090
59664391000191 1143990005,1143990005
To do this, I am using:
df['Telefone']=df.Telefone1.astype(str)+","+df.Telefone2.astype(str)
I get this traceback:
Traceback (most recent call last):
File "/file.py", line 507, in <module>
'file')
File "file.py", line 347, in function
df['Telefone']=df.Telefone1.astype(str)+","+df.Telefone2.astype(str)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/frame.py", line 2357, in __setitem__
self._set_item(key, value)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/frame.py", line 2424, in _set_item
NDFrame._set_item(self, key, value)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/generic.py", line 1464, in _set_item
self._data.set(key, value)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/internals.py", line 3418, in set
self.insert(len(self.items), item, value)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/internals.py", line 3519, in insert
placement=slice(loc, loc + 1))
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/internals.py", line 2518, in make_block
return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/internals.py", line 1663, in __init__
placement=placement, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/internals.py", line 90, in __init__
len(self.mgr_locs)))
ValueError: Wrong number of items passed 4, placement implies 1
What am I doing wrong here?
【Comments】:
There is a typo in your column names. Both columns are named 'Telefone1', but you are trying to add 'Telefone1' and 'Telefone2'. When I correct this, your code runs fine.
I think you should use the .map function instead of astype. Please try this code: dataframe["Telefone"] = df["Telefone1"].map(str) + df["Telefone1"] and note that your column name is telefone1
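@piRSquared Actually, I introduced that typo on ***. It was a mistake I made while typing the question here (now fixed here). In my code the names are correct, and it still produces that traceback.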
【Answer 1】:
>>> (df.iloc[:, 0].astype(str) + ',' + df.iloc[:, 1].astype(str)).to_frame('Telefone')
Telefone
CNPJ
44167450000149 1332385314,1332385314
56095862000108 2125439090,2125439090
59664391000191 1143990005,1143990005
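Or:
(df.loc[:, 'Telefone1'].astype(str) + ',' + df.loc[:, 'Telefone2'].astype(str)).to_frame('Telefone')
This works for your sample data. If you still get an error, create a new column that records the length of each field and sort by that value. There may be an error in the data.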
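The length check suggested above could be sketched as follows. This is a minimal illustration on the sample data from the question; the helper names `lengths`, `len1`, `len2`, and `suspects` are hypothetical and not part of the original answer.

```python
import pandas as pd

# Sample data from the question; CNPJ is the index.
df = pd.DataFrame(
    {
        "Telefone1": [1332385314, 2125439090, 1143990005],
        "Telefone2": [1332385314, 2125439090, 1143990005],
    },
    index=pd.Index(
        [44167450000149, 56095862000108, 59664391000191], name="CNPJ"
    ),
)

# Hypothetical helper columns: string length of each phone field.
lengths = pd.DataFrame(
    {
        "len1": df["Telefone1"].astype(str).str.len(),
        "len2": df["Telefone2"].astype(str).str.len(),
    }
)

# Sorting by length pushes unusually short or long values to the ends,
# which makes malformed entries easier to spot.
suspects = lengths.sort_values(["len1", "len2"])
print(suspects)
```

On real data, rows whose lengths differ from the rest are the first candidates to inspect.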
【Discussion】:
【Answer 2】:df = df.applymap(str)
Option 1
str.cat
df = pd.DataFrame({'Telefone': df.Telefone1.str.cat(df.Telefone2, sep=',')}, index=df.index)
df
Telefone
CNPJ
44167450000149 1332385314,1332385314
56095862000108 2125439090,2125439090
59664391000191 1143990005,1143990005
Option 2
df.apply
df = df.apply(','.join, 1).to_frame(name='Telefone')
df
Telefone
CNPJ
44167450000149 1332385314,1332385314
56095862000108 2125439090,2125439090
59664391000191 1143990005,1143990005
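For reference, both options can be run end to end on the sample data. The sketch below rebuilds the frame from the question and uses `astype(str)` in place of `applymap(str)`, since newer pandas releases deprecate `applymap`; the names `str_df`, `opt1`, and `opt2` are illustrative only.

```python
import pandas as pd

# Sample data from the question; CNPJ is the index.
df = pd.DataFrame(
    {
        "Telefone1": [1332385314, 2125439090, 1143990005],
        "Telefone2": [1332385314, 2125439090, 1143990005],
    },
    index=pd.Index(
        [44167450000149, 56095862000108, 59664391000191], name="CNPJ"
    ),
)

# Both options need string values, so convert every column first.
str_df = df.astype(str)

# Option 1: Series.str.cat joins two string Series with a separator.
opt1 = str_df["Telefone1"].str.cat(str_df["Telefone2"], sep=",").to_frame("Telefone")

# Option 2: join the values of each row (axis=1) with a comma.
opt2 = str_df.apply(",".join, axis=1).to_frame("Telefone")

print(opt1)
print(opt2)
```

Option 2 joins every column in the row, so it only matches Option 1 when the frame holds exactly the two phone columns.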
【Discussion】:
【Answer 3】:Use the cat() function of the string accessor:
df = df.astype(str)
df['Telefone'] = df['Telefone1'].str.cat(df['Telefone2'], sep=',')
【Discussion】:
Thanks, but this leads me to another error: AttributeError: 'DataFrame' object has no attribute 'str'
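That AttributeError is what pandas raises when `df['Telefone1']` returns a DataFrame instead of a Series, and this happens when a column label is duplicated. Whether the asker's real frame contains duplicated labels is an assumption; the thread does not confirm it. A minimal sketch of how one might check for this, using a made-up one-row frame:

```python
import pandas as pd

# Hypothetical frame with a duplicated column label. This is an assumption
# for illustration; the question does not show the real frame's columns.
df = pd.DataFrame(
    [[1332385314, 1332385314, 1332385314]],
    columns=["Telefone1", "Telefone1", "Telefone2"],
)

# Selecting a duplicated label returns a DataFrame, which has no .str accessor.
print(type(df["Telefone1"]))

# List any column labels that occur more than once.
dupes = df.columns[df.columns.duplicated()].tolist()
print(dupes)

# One possible fix: keep only the first occurrence of each label,
# after which the original concatenation works on plain Series.
deduped = df.loc[:, ~df.columns.duplicated()]
combined = deduped["Telefone1"].astype(str) + "," + deduped["Telefone2"].astype(str)
print(combined)
```

If `dupes` comes back empty on the real data, the cause lies elsewhere and this check can be ruled out.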