InvalidIndexError in Python pandas 1.1.0 and above when slicing a MultiIndex frame with a DatetimeIndex
[Title]: InvalidIndexError in Python pandas 1.1.0 and above when slicing a MultiIndex frame with a DatetimeIndex [Posted]: 2021-02-02 03:13:15 [Question]: My data contains timeline values for multiple zones. I want to slice it by date.
Here is my MultiIndex data, which I'm calling bob:
import numpy as np
import pandas as pd

arrays = [[1, 1, 2, 2],
          ['2020-01-06', '2020-01-13', '2020-01-06', '2020-01-13']]
df = pd.DataFrame(np.transpose(arrays))
df[1] = pd.to_datetime(df[1])  # convert the date column to datetime64
index = pd.MultiIndex.from_frame(df, names=['zone', 'date'])
bob = pd.Series(np.random.randn(4), index=index)
print(bob)
zone date
1 2020-01-06 -0.513744
2020-01-13 1.367461
2 2020-01-06 0.209916
2020-01-13 0.397261
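A side note on the construction above: np.transpose on a list that mixes integers and strings produces a string array, so the zone level ends up holding the strings '1' and '2' rather than integers. A minimal sketch of an alternative, assuming integer zones are what is wanted, builds the index with pd.MultiIndex.from_arrays so each level keeps its own dtype. The seeded generator is only there to make the sketch repeatable and is not part of the original question.

```python
import numpy as np
import pandas as pd

# from_arrays keeps each level's own dtype, so zone stays an integer level
zones = [1, 1, 2, 2]
dates = pd.to_datetime(['2020-01-06', '2020-01-13', '2020-01-06', '2020-01-13'])
index = pd.MultiIndex.from_arrays([zones, dates], names=['zone', 'date'])

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable
bob = pd.Series(rng.standard_normal(4), index=index)

# Inspect the dtype of each level
print(bob.index.get_level_values('zone').dtype)
print(bob.index.get_level_values('date').dtype)
```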
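Now I want to take a slice of a single DatetimeIndex and use it to select the matching rows of bob. The following code works in pandas 1.0.1 (and possibly older versions), but breaks in 1.1: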
print(pd.__version__)
singleIndex = pd.to_datetime(pd.Index(['2020-01-06', '2020-01-13']))
dateSlice = singleIndex[1:]
print(dateSlice)
idx = pd.IndexSlice
print(bob.loc[idx[:,dateSlice]])
1.0.1
DatetimeIndex(['2020-01-13'], dtype='datetime64[ns]', freq=None)
zone date
1 2020-01-13 1.367461
2 2020-01-13 0.397261
dtype: float64
In 1.1.0:
1.1.0
DatetimeIndex(['2020-01-13'], dtype='datetime64[ns]', freq=None)
Traceback (most recent call last):
File "try.py", line 18, in <module>
print(bob.loc[idx[:,dateSlice]])
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 873, in __getitem__
return self._getitem_tuple(key)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 1044, in _getitem_tuple
return self._getitem_lowerdim(tup)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 766, in _getitem_lowerdim
return self._getitem_nested_tuple(tup)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 826, in _getitem_nested_tuple
result = self._handle_lowerdim_multi_index_axis0(tup)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 1066, in _handle_lowerdim_multi_index_axis0
return self._get_label(tup, axis=axis)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 1059, in _get_label
return self.obj.xs(label, axis=axis)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/generic.py", line 3480, in xs
loc, new_index = self.index.get_loc_level(key, drop_level=drop_level)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexes/multi.py", line 2858, in get_loc_level
k = self._get_level_indexer(k, level=i)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexes/multi.py", line 2965, in _get_level_indexer
code = self._get_loc_single_level_index(level_index, key)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexes/multi.py", line 2634, in _get_loc_single_level_index
return level_index.get_loc(key)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexes/datetimes.py", line 586, in get_loc
raise InvalidIndexError(key)
pandas.errors.InvalidIndexError: DatetimeIndex(['2020-01-13'], dtype='datetime64[ns]', freq=None)
I admit that MultiIndex confuses me a lot. How do I do this slicing correctly in newer pandas?
[Answer 1]: This no longer seems to be supported. As an alternative, you can use Index.get_level_values and Index.isin with boolean indexing:
print(bob[bob.index.get_level_values(1).isin(dateSlice)])
zone date
1 2020-01-13 -1.396496
2 2020-01-13 -0.504466
dtype: float64
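The boolean-indexing approach can be sketched end to end as follows. This is a self-contained illustration rather than the asker's exact data: the fixed values 0.1 to 0.4, the from_product construction, and the names single_index, date_slice, mask and selected are assumptions made so the result is easy to check.

```python
import pandas as pd

# Small stand-in for bob with fixed values (illustrative, not the original data)
index = pd.MultiIndex.from_product(
    [[1, 2], pd.to_datetime(['2020-01-06', '2020-01-13'])],
    names=['zone', 'date'],
)
bob = pd.Series([0.1, 0.2, 0.3, 0.4], index=index)

single_index = pd.to_datetime(pd.Index(['2020-01-06', '2020-01-13']))
date_slice = single_index[1:]  # DatetimeIndex holding only 2020-01-13

# Boolean mask: True where the row's 'date' level is one of the wanted dates
mask = bob.index.get_level_values('date').isin(date_slice)
selected = bob[mask]
print(selected)
```

Because the mask is computed from the level values themselves, this selection does not rely on how .loc interprets a DatetimeIndex inside a tuple key, and both index levels are kept in the result.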
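For a single string value it works a bit differently: .loc drops the date level from the result, while xs with drop_level=False keeps it: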
print(bob.loc[idx[:,'2020-01-13']])
zone
1 -0.200758
2 0.410052
dtype: float64
print(bob.xs('2020-01-13', level=1, drop_level=False))
zone date
1 2020-01-13 1.129484
2 2020-01-13 0.185156
dtype: float64
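The difference between the two outputs above is whether the selected level survives in the result. A minimal sketch using only Series.xs shows both behaviours side by side. The fixed values and the names wanted, dropped and kept are illustrative assumptions, and pd.Timestamp is used here in place of the string key from the answer.

```python
import pandas as pd

# Small stand-in for bob with fixed values (illustrative, not the original data)
index = pd.MultiIndex.from_product(
    [[1, 2], pd.to_datetime(['2020-01-06', '2020-01-13'])],
    names=['zone', 'date'],
)
bob = pd.Series([0.1, 0.2, 0.3, 0.4], index=index)

wanted = pd.Timestamp('2020-01-13')

# Default drop_level=True: the 'date' level is removed, leaving only 'zone'
dropped = bob.xs(wanted, level='date')

# drop_level=False: the result keeps the full (zone, date) MultiIndex
kept = bob.xs(wanted, level='date', drop_level=False)

print(dropped)
print(kept)
```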