Why am I getting the error "ValueError: cannot reindex from a duplicate axis"?

Posted 2023-02-23

【Title】: Why am I getting this error "ValueError: cannot reindex from a duplicate axis"? 【Posted】: 2019-04-01 04:23:37 【Question】:

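Here I am only showing the part of the code that raises the error. It concatenates two different sets of DataFrames, which are collected in two separate lists.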

import os
import numpy as np
import pandas as pd

path1 = '/home/Desktop/computed_2d_blaze/'
path2 = '/home/Desktop/computed_1d/'
path3 = '/home/Desktop/sn_airmass_seeing/'

dir1 = [x for x in os.listdir(path1) if '.ares' in x]
dir2 = [x for x in os.listdir(path2) if '.ares' in x]
dir3 = [x for x in os.listdir(path3) if '.ares' in x]

lst = []
lst1 = []

for file1, file2,file3 in zip(dir1,dir2,dir3):
   df1 = pd.read_table(path1+file1, skiprows=0, usecols=(0,1,2,3,4,8),names=['wave','num','stlines','fwhm','EWs','MeasredWave'],delimiter=r'\s+')
   df2 = pd.read_table(path2+file2, skiprows=0, usecols=(0,1,2,3,4,8),names=['wave','num','stlines','fwhm','EWs','MeasredWave'],delimiter=r'\s+')

   df1 = df1.groupby('wave').mean().reset_index()
   df1 = df1.sort_values('wave').reset_index(drop=True)
   df2 = df2.sort_values('wave').reset_index(drop=True)

   dfs = pd.merge(df1,df2, on='wave', how='inner')
   dfs['delta_ew'] = (dfs.EWs_x - dfs.EWs_y)
   dfs=dfs.filter(items=['wave','delta_ew'])
   lst.append(dfs)

   df3 = pd.read_table(path3+file3, skiprows=0, usecols=(0,1,2),names=['seeing','airmass','snr'],delimiter=r'\s+')
   lst1.append(df3)

[df.set_index('wave', inplace=True) for df in lst]
df=pd.concat(lst,axis=1,join='inner')

x = pd.concat(lst1,axis=1,join='inner')

for z in df.index:
   t = x.loc[0, 'airmass']
   s = df.loc[z, 'delta_ew']
   dfs = pd.concat([s,t],axis=1,names=['delta_ew','airmass'])
   dfs = dfs[np.abs(dfs.delta_ew - dfs.delta_ews.mean()) <= (dfs.delta_ews.mad())]

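When I build the new DataFrame, delta_ew contains some outliers, so I am doing this to remove them. But when I try it, I get this error: ValueError: cannot reindex from a duplicate axis.

I don't understand how to fix this error. Can anyone tell me where I went wrong?

Here is the full traceback: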

 Traceback (most recent call last):
  File "/home/gyanender/Desktop/r_values/airmass_vs_ew/delta_ew/for_rvalues.py", line 72, in <module>
    dfs = pd.concat([s,t],axis=1,names=['delta_ew','airmass'])
  File "/home/gyanender/.local/lib/python2.7/site-packages/pandas/core/reshape/concat.py", line 213, in concat
    return op.get_result()
  File "/home/gyanender/.local/lib/python2.7/site-packages/pandas/core/reshape/concat.py", line 385, in get_result
    df = cons(data, index=index)
  File "/home/gyanender/.local/lib/python2.7/site-packages/pandas/core/frame.py", line 330, in __init__
    mgr = self._init_dict(data, index, columns, dtype=dtype)
  File "/home/gyanender/.local/lib/python2.7/site-packages/pandas/core/frame.py", line 461, in _init_dict
    return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "/home/gyanender/.local/lib/python2.7/site-packages/pandas/core/frame.py", line 6168, in _arrays_to_mgr
    arrays = _homogenize(arrays, index, dtype)
  File "/home/gyanender/.local/lib/python2.7/site-packages/pandas/core/frame.py", line 6465, in _homogenize
    v = v.reindex(index, copy=False)
  File "/home/gyanender/.local/lib/python2.7/site-packages/pandas/core/series.py", line 2681, in reindex
    return super(Series, self).reindex(index=index, **kwargs)
  File "/home/gyanender/.local/lib/python2.7/site-packages/pandas/core/generic.py", line 3023, in reindex
    fill_value, copy).__finalize__(self)
  File "/home/gyanender/.local/lib/python2.7/site-packages/pandas/core/generic.py", line 3041, in _reindex_axes
    copy=copy, allow_dups=False)
  File "/home/gyanender/.local/lib/python2.7/site-packages/pandas/core/generic.py", line 3145, in _reindex_with_indexers
    copy=copy)
  File "/home/gyanender/.local/lib/python2.7/site-packages/pandas/core/internals.py", line 4139, in reindex_indexer
    self.axes[axis]._can_reindex(indexer)
  File "/home/gyanender/.local/lib/python2.7/site-packages/pandas/core/indexes/base.py", line 2944, in _can_reindex
    raise ValueError("cannot reindex from a duplicate axis")
ValueError: cannot reindex from a duplicate axis

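【Comments】:

Possible duplicate of What does `ValueError: cannot reindex from a duplicate axis` mean?

I've just posted a screenshot of the error. @jpp can you check now?

@jpp Finally, this is definitely much better. I can't post images, so how do I provide the full traceback?

【Answer 1】: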

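I finally managed to solve this. Instead of concat, I used a dictionary, because the problem I was facing was concatenating two pandas Series into a new DataFrame. I first converted the values of the Series t and s into a dictionary, then converted that dictionary into a DataFrame, and it worked fine for me.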

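for z in df.index:
   # .values drops the duplicated index labels, leaving plain NumPy arrays
   t = x.loc[0, 'airmass']
   t = t.values
   s = df.loc[z, 'delta_ew']
   s = s.values
   # note: dict keys are unique, so repeated values in s collapse to a single entry
   dic = dict(zip(s,t))
   # list(...) keeps this working on Python 3, where items() returns a view
   q = pd.DataFrame(list(dic.items()), columns=['ew', 'airmass'])
   # Series.mad() was removed in pandas 2.0; there, use (q.ew - q.ew.mean()).abs().mean()
   q = q[np.abs(q.ew - q.ew.mean()) <= (q.ew.mad())]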
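A possible variant of this workaround, assuming the two Series should simply be paired by position: building the frame directly from the underlying arrays skips the dictionary step, which would merge rows whose `s` values happen to be equal. The function name `filter_by_mad` and the sample numbers are hypothetical and not from the original post. The mean absolute deviation is computed by hand because `Series.mad()` was removed in pandas 2.0.

```python
import numpy as np
import pandas as pd


def filter_by_mad(ew, airmass):
    """Pair two equal-length Series by position and keep the rows whose ew
    lies within one mean absolute deviation of the mean."""
    # np.asarray discards the (possibly duplicated) index labels
    q = pd.DataFrame({"ew": np.asarray(ew), "airmass": np.asarray(airmass)})
    deviation = (q["ew"] - q["ew"].mean()).abs()
    mad = deviation.mean()  # mean absolute deviation, as Series.mad() computed it
    return q[deviation <= mad]


# Hypothetical data with repeated index labels, similar to s and t in the question
s = pd.Series([1.0, 1.1, 0.9, 5.0], index=["delta_ew"] * 4)
t = pd.Series([1.2, 1.3, 1.1, 1.5], index=["airmass"] * 4)

result = filter_by_mad(s, t)
print(result)
```

With these made-up numbers the row with `ew = 5.0` is dropped, since it lies further from the mean than the mean absolute deviation.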

【Comments】:

【Answer 2】:

This error usually occurs when you join on, or assign to, a column while the index contains duplicate values.
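To make this concrete for the question's code, here is a minimal sketch with made-up numbers. The frames `a`, `b`, `d1`, `d2` and the helper `concat_error` are hypothetical and not from the original post. Concatenating frames that share a column name along `axis=1` produces repeated column labels, so selecting one row yields a Series whose index is the same label repeated, and `pd.concat` cannot align two such Series against each other. The exact exception type and message vary across pandas versions, so the sketch catches a generic exception.

```python
import pandas as pd

# Small frames that share a column name, like the per-file frames in the question
a = pd.DataFrame({"airmass": [1.1, 1.2]})
b = pd.DataFrame({"airmass": [1.3, 1.4]})
d1 = pd.DataFrame({"delta_ew": [0.5, 0.7]})
d2 = pd.DataFrame({"delta_ew": [0.4, 0.9]})

# axis=1 keeps every column, so each label now appears twice
x = pd.concat([a, b], axis=1, join="inner")
df = pd.concat([d1, d2], axis=1, join="inner")

# Selecting one row returns a Series indexed by the repeated column label
t = x.loc[0, "airmass"]
s = df.loc[0, "delta_ew"]


def concat_error(left, right):
    """Return the exception raised by a column-wise concat, or None on success."""
    try:
        pd.concat([left, right], axis=1)
    except Exception as exc:  # type and message differ between pandas versions
        return exc
    return None


err = concat_error(s, t)
print(s.index.is_unique, t.index.is_unique)
print(type(err).__name__, err)
```

Checking `index.is_unique` on each object before concatenating is a quick way to find which input carries the repeated labels.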

The error is raised by the line dfs = pd.concat([s,t],axis=1,names=['delta_ew','airmass']). I believe I have found a solution to your problem: just add ignore_index=True to the concat call.

Like this:

dfs = pd.concat([s,t],axis=1,names=['delta_ew','airmass'], ignore_index=True )

This will recreate the index.

Note: "index" here refers to both the row labels and the column names.
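One caveat about `ignore_index`, sketched with hypothetical data: as the pandas documentation describes it, the option discards labels along the concatenation axis only, which for `axis=1` means the column labels. Rows are still aligned on each Series' own index, so repeated labels there can still trigger the error. A possible workaround, assuming the two Series should be paired by position, is to reset both indexes before concatenating. The values of `s` and `t` and the helper `raises` are illustrative only. Note also that column names are passed through `keys`, whereas `names` labels the levels of a hierarchical index.

```python
import pandas as pd

# Hypothetical Series with repeated index labels, as produced by selecting a row
# from a frame that has duplicate column names
s = pd.Series([0.5, 0.4, 0.9], index=["delta_ew"] * 3)
t = pd.Series([1.1, 1.3, 1.2], index=["airmass"] * 3)


def raises(func):
    """Return True if calling func raises any exception."""
    try:
        func()
    except Exception:
        return True
    return False


# ignore_index=True only renumbers the columns; row alignment still fails
still_fails = raises(lambda: pd.concat([s, t], axis=1, ignore_index=True))

# Resetting the index labels both Series 0..n-1, so rows are paired by position
dfs = pd.concat(
    [s.reset_index(drop=True), t.reset_index(drop=True)],
    axis=1,
    keys=["delta_ew", "airmass"],  # keys become the column names
)
print(still_fails)
print(dfs)
```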

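【Comments】:

Instead of loc, try ix, like this: t = x.ix[0, 'airmass'] s = df.ix[z, 'delta_ew']

Hey, the problem is in this line: dfs = pd.concat([s,t],axis=1,names=['delta_ew','airmass']). If I don't use axis=1, it concatenates correctly. But my problem is that I want to concatenate along axis=1.

I know the reason: when you use axis=1, it changes the index in addition to the column names.

Sir, the support doesn't matter! I'm stuck and I have to solve this problem. I think there is another way to solve it, but I don't know whether there is any function, or something like q = pd.DataFrame([s,t],ignore_index=True), that could solve this. Unfortunately I couldn't find anything.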
