How to access two elements at once in a for loop without duplicates in Python?
【Posted】: 2021-06-14 08:36:34
【Question】: I have a table that looks like this:
Celebrity | Username |
---|---|
A | user1 |
B | user1 |
C | user2 |
A | user3 |
A | user2 |
D | user2 |
D | user3 |
I wrote a function to find the number of users two celebrities have in common:
def num_of_fans_overlap(cel1, cel2, data, Celebrity, Usernames):
    l = [cel1, cel2]
    Res = len(data.loc[data['Username'].map(data.groupby('Username').agg(set)['Celebrity'].eq(set(l)))])/2
    return print(int(Res))
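For example, if I run num_of_fans_overlap("A", "B", data, "Celebrity", "Username"), I get 1, which means one user follows both celebrities.
Now I want to run a for loop over every pair of celebrities, and the output should look like this: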
("A", "B", 1)
("A", "C", 1)
("A", "D", 2)
("B", "C", 0)
("B", "D", 0)
("C", "D", 1)
I have been stuck here. Hope someone can help.
【Comments】:
【Answer 1】: The naive way to do this:
celebs = ["A", "B", "C", "D"]
for i in range(len(celebs)):
    for j in range(i+1, len(celebs)):
        celeb_pairs = (celebs[i], celebs[j])
        run_your_function_here(*celeb_pairs, other, parameters)
You can also do this more elegantly with the itertools.combinations function:
import itertools
celebs = ["A", "B", "C", "D"]
for celeb1, celeb2 in itertools.combinations(celebs, 2):
    run_your_function_here(celeb1, celeb2, other, parameters)
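To make the equivalence of the two approaches concrete, here is a minimal sketch that collects the pairs from the nested index loops and from itertools.combinations and compares them. The variable names nested_pairs and combo_pairs are only illustrative.

```python
from itertools import combinations

celebs = ["A", "B", "C", "D"]

# Nested index loops: j always starts after i, so no pair is repeated
# and no celebrity is paired with itself.
nested_pairs = []
for i in range(len(celebs)):
    for j in range(i + 1, len(celebs)):
        nested_pairs.append((celebs[i], celebs[j]))

# combinations(celebs, 2) yields the same pairs in the same order.
combo_pairs = list(combinations(celebs, 2))

print(combo_pairs)
print(nested_pairs == combo_pairs)
```

With four celebrities this gives six pairs, which matches the six rows of the desired output in the question.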
【Comments】:
【Answer 2】: Check crosstab, then dot:
import numpy as np
import pandas as pd
# df is the table from the question
s = pd.crosstab(df.Celebrity, df.Username)
s = s.dot(s.T)
out = s.mask(np.triu(np.ones(s.shape)).astype(bool)).stack()
Out[301]:
Celebrity  Celebrity
B          A            1.0
C          A            1.0
           B            0.0
D          A            2.0
           B            0.0
           C            1.0
dtype: float64
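For readers who want to see why this works, here is a plain-Python sketch of the same idea, with no pandas involved. It assumes the sample table from the question. The names rows, m, overlap and lower are illustrative and not part of the answer above. The crosstab step builds a celebrity-by-user count matrix, the product with its transpose sums, for each pair of celebrities, over the users both of them have, and the mask keeps only the entries below the diagonal so each pair appears once.

```python
# Sample table from the question as (celebrity, username) rows.
rows = [
    ("A", "user1"), ("B", "user1"), ("C", "user2"), ("A", "user3"),
    ("A", "user2"), ("D", "user2"), ("D", "user3"),
]

celebs = sorted({c for c, _ in rows})
users = sorted({u for _, u in rows})

# Count matrix: m[c][u] is how often user u appears with celebrity c
# (the role crosstab plays; here every count is 0 or 1).
m = {c: {u: 0 for u in users} for c in celebs}
for c, u in rows:
    m[c][u] += 1

# Matrix times its transpose: entry (a, b) adds 1 for every user
# that follows both a and b.
overlap = {
    a: {b: sum(m[a][u] * m[b][u] for u in users) for b in celebs}
    for a in celebs
}

# Keep only the entries strictly below the diagonal, one per pair.
lower = {
    (a, b): overlap[a][b]
    for i, a in enumerate(celebs)
    for b in celebs[:i]
}

for pair, count in lower.items():
    print(pair, count)
```

The diagonal of overlap holds each celebrity's own follower count, which is why it is masked out along with the duplicated upper half.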
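【Comments】:
Wow, although this isn't quite what I was looking for, it looks very useful. I may apply it in my future work. Thanks a lot for sharing!
【Answer 3】: First, the function num_of_fans_overlap should return the value itself, not the result of print():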
def num_of_fans_overlap(cel1, cel2, data):
    l = [cel1, cel2]
    # group by the 'Username' column of the question's table
    Res = len(data.loc[data['Username'].map(data.groupby('Username').agg(set)['Celebrity'].eq(set(l)))])/2
    return int(Res)
Second, let the variable celebrities be the list of unique values in the Celebrity column, and loop over every pair:
from itertools import combinations
celebrities = list(data.Celebrity.unique())
for (cel1, cel2) in combinations(celebrities, 2):
    fans_overlap = num_of_fans_overlap(cel1, cel2, data)
    print((cel1, cel2, fans_overlap))
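As a self-contained illustration of the same loop, the sketch below runs on the sample table from the question without pandas. It makes one assumption that differs from num_of_fans_overlap: overlap is taken to mean the number of users who follow both celebrities, whoever else they follow, which is the reading that reproduces the desired output in the question. The helper fans_overlap and the fans dictionary are hypothetical names introduced for this example.

```python
from itertools import combinations

# Sample table from the question as (celebrity, username) rows.
rows = [
    ("A", "user1"), ("B", "user1"), ("C", "user2"), ("A", "user3"),
    ("A", "user2"), ("D", "user2"), ("D", "user3"),
]

# Map each celebrity to the set of users that follow them.
fans = {}
for celeb, user in rows:
    fans.setdefault(celeb, set()).add(user)


def fans_overlap(cel1, cel2, fans):
    """Return how many users follow both cel1 and cel2 (assumed definition)."""
    return len(fans[cel1] & fans[cel2])


# Each unordered pair is visited exactly once.
results = [
    (cel1, cel2, fans_overlap(cel1, cel2, fans))
    for cel1, cel2 in combinations(sorted(fans), 2)
]

for row in results:
    print(row)
```

Returning the count and printing it in the loop, rather than printing inside the function, keeps the function reusable when the results need to be collected into a list or a DataFrame.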
【Comments】: