I want to keep the index in "pd.Series(a,index=).unique" code

Posted 2023-03-12

【Asked】2021-02-10 10:25:52 【Question】:

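I have a problem with pd.Series(a).unique().

I created a Series and called .unique() on it.

However, the result no longer carries the pd.Series index.

How can I get the unique values while keeping the original index?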

【Comments】:

Use .drop_duplicates

【Answer 1】:

Instead of .unique(), you can use .drop_duplicates():

import pandas as pd

x = pd.Series([1, 2, 3, 1, 1, 2, 4, 5, 6], index=list("abcdefghi"))

print(x)
a    1
b    2
c    3
d    1
e    1
f    2
g    4
h    5
i    6
dtype: int64
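
For contrast, here is a small sketch of what .unique() gives back on the same data. The variable names u and d are only illustrative and do not come from the original question. For an int64 Series like this one, .unique() returns a plain NumPy array, which has no index at all, while .drop_duplicates() returns a Series.

```python
import numpy as np
import pandas as pd

x = pd.Series([1, 2, 3, 1, 1, 2, 4, 5, 6], index=list("abcdefghi"))

# .unique() returns a NumPy array for this int64 Series: values only, no labels
u = x.unique()
print(type(u))
print(u)

# .drop_duplicates() returns a Series, so the original labels are still attached
d = x.drop_duplicates()
print(d.index.tolist())
```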

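.drop_duplicates() removes the repeated values from the Series and keeps the index labels of the entries that remain. The keep parameter controls whether the "first" or the "last" occurrence of each duplicated value is the one that is kept: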

# Keep the first entry of each duplicated value
x.drop_duplicates(keep="first")

a    1
b    2
c    3
g    4
h    5
i    6
dtype: int64

# Keep the last entry of each duplicated item
x.drop_duplicates(keep="last")

c    3
e    1
f    2
g    4
h    5
i    6
dtype: int64
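
Two related variations may also be useful. This is a minimal sketch on the same example Series, and the names only_singletons, first_seen and first_labels are made up for illustration. It shows keep=False, which discards every value that occurs more than once, and a boolean mask built with .duplicated(), which selects the same rows as drop_duplicates(keep="first").

```python
import pandas as pd

x = pd.Series([1, 2, 3, 1, 1, 2, 4, 5, 6], index=list("abcdefghi"))

# keep=False drops every value that appears more than once (here 1 and 2)
only_singletons = x.drop_duplicates(keep=False)
print(only_singletons)

# Same selection as drop_duplicates(keep="first"), written as a boolean mask
first_seen = x[~x.duplicated(keep="first")]
print(first_seen)

# If only the surviving labels are needed, read them from the index
first_labels = first_seen.index.tolist()
print(first_labels)
```

The mask form is handy when the same filter has to be applied to another object that shares the index.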
