How to find the total play time of each week for a given date in Python?
【Posted】2021-07-17 07:22:18
【Question】I have a dataframe as shown below:
import pandas as pd

k = {'user_id': [1,1,1,1,1,2,2,2,3,3,3,3,3,4,4,4,5,5],
     'created': ['2/09/2021','2/10/2021','2/16/2021','2/17/2021','3/09/2021','3/10/2021','3/18/2021','3/19/2021',
                 '2/19/2021','2/20/2021','2/26/2021','2/27/2021','3/09/2021','2/10/2021','2/18/2021','3/19/2021',
                 '3/24/2021','3/30/2021'],
     'stop_time': [11,12,13,14,15,25,26,27,6,7,8,9,10,11,12,13,25,26],
     'play_time': [10,11,12,13,14,24,25,26,5,6,7,8,9,10,11,13,24,25]}
df = pd.DataFrame(data=k)
df['created'] = pd.to_datetime(df['created'], format='%m/%d/%Y')
df['total_play_time'] = df['stop_time'] - df['play_time']
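Now we need to use the first date of each user_id as the start date of that user's week 1. For example, '2/09/2021' is the week 1 start date for user_id 1 and '3/10/2021' is the week 1 start date for user_id 2.
We need to sum the total play time of each user_id per week, and the weekly sums should continue up to the current date (for example, if the report is run today, it must give the sum for every week up to today), giving a result like the following: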
ID  week1  week2  week3  week4  week5  week6  week7  week8  week9  week10  week11  week12
1   3      2      0      0      0      0      0      0      0      0       0       0
2   1      2      0      0      0      0      0
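The week-numbering rule described above can be sketched as follows. This is only a minimal illustration, not the asker's code: the `sample` frame is a small hypothetical subset in the same shape as the question's dataframe, and the `week` column name is an assumption. Days 0 to 6 after a user's first date count as week 1, days 7 to 13 as week 2, and so on.

```python
import pandas as pd

# Small hypothetical sample in the same shape as the question's dataframe.
sample = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2],
    "created": pd.to_datetime(
        ["2/09/2021", "2/10/2021", "2/16/2021", "3/10/2021", "3/18/2021"],
        format="%m/%d/%Y",
    ),
    "total_play_time": [1, 1, 1, 1, 1],
})

# Each user's first date is the start of that user's week 1.
first_date = sample.groupby("user_id")["created"].transform("min")

# Whole days since the first date, integer-divided by 7, give a 1-based week number.
sample["week"] = (sample["created"] - first_date).dt.days // 7 + 1

# Sum the play time per user and week.
weekly = sample.groupby(["user_id", "week"])["total_play_time"].sum()
print(weekly)
```

With this sample, user 1 has a total of 2 in week 1 and 1 in week 2, and user 2 has 1 in each of weeks 1 and 2. Weeks without any rows do not appear in this long format.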
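【Comments】:
Please take the intro tour and read the help center and how to ask a good question to learn how this site works and to improve your current and future questions, which can help you get better answers. "Show me how to solve this coding problem?" is off-topic for Stack Overflow. You are expected to make an honest attempt at a solution and then ask a specific question about your implementation. Stack Overflow is not intended to replace existing tutorials and documentation.
【Answer 1】:
# Get a list of unique ids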
user_ids = df["user_id"].unique()
# Get the start date (earliest "created" date) of each user
start_dates = {usr: df.loc[df["user_id"] == usr, "created"].min() for usr in user_ids}
# Subtract each user's start date to have a common baseline for all users
df["time_since_start"] = df["created"] - df["user_id"].map(start_dates)
# We get Timedelta values, but whole days are easier to bin
df["t"] = df["time_since_start"].dt.days
# Get the longest time since start of any user, in whole weeks, to size our bins
max_weeks = int(df["t"].max()) // 7 + 1
# Make the bin edges (in days) and add corresponding readable labels
bins = [7 * wk for wk in range(max_weeks + 1)]
labels = ["week " + str(wk + 1) for wk in range(max_weeks)]
# Bin the data with left-closed intervals, so that day 0 (the start date)
# falls into week 1, then aggregate the result
df["bin"] = pd.cut(df["t"], bins, labels=labels, right=False)
df.groupby(["user_id", "bin"], observed=False)["total_play_time"].sum()
user_id  bin
1        week 1    2
         week 2    2
         week 3    0
         week 4    0
         week 5    1
         week 6    0
2        week 1    1
         week 2    2
         week 3    0
         week 4    0
         week 5    0
         week 6    0
3        week 1    2
         week 2    2
         week 3    1
         week 4    0
         week 5    0
         week 6    0
4        week 1    1
         week 2    1
         week 3    0
         week 4    0
         week 5    0
         week 6    0
5        week 1    2
         week 2    0
         week 3    0
         week 4    0
         week 5    0
         week 6    0
Name: total_play_time, dtype: int64
If you really need it, you can reshape the result into wide format.
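One possible way to get the wide layout, with the weeks continuing up to a report date, is sketched below. It is not part of the original answer. The `report_date` value is hypothetical (a real report might use `pd.Timestamp.today().normalize()` instead), and the names `week`, `weeks_elapsed`, `wide` and the `weekN` column labels are assumptions made for illustration. Weeks that have not started yet for a user are left empty (NaN) rather than 0.

```python
import pandas as pd

k = {
    "user_id": [1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5],
    "created": ["2/09/2021", "2/10/2021", "2/16/2021", "2/17/2021", "3/09/2021",
                "3/10/2021", "3/18/2021", "3/19/2021", "2/19/2021", "2/20/2021",
                "2/26/2021", "2/27/2021", "3/09/2021", "2/10/2021", "2/18/2021",
                "3/19/2021", "3/24/2021", "3/30/2021"],
    "stop_time": [11, 12, 13, 14, 15, 25, 26, 27, 6, 7, 8, 9, 10, 11, 12, 13, 25, 26],
    "play_time": [10, 11, 12, 13, 14, 24, 25, 26, 5, 6, 7, 8, 9, 10, 11, 13, 24, 25],
}
df = pd.DataFrame(k)
df["created"] = pd.to_datetime(df["created"], format="%m/%d/%Y")
df["total_play_time"] = df["stop_time"] - df["play_time"]

# Hypothetical report date, chosen only for this example.
report_date = pd.Timestamp("2021-04-27")

# First date per user, and a 1-based week number for every row.
first_date = df.groupby("user_id")["created"].min()
df["week"] = (df["created"] - df["user_id"].map(first_date)).dt.days // 7 + 1

# Number of weeks each user has reached by the report date.
weeks_elapsed = (report_date - first_date).dt.days // 7 + 1

# One row per user, one column per week, padded with 0 up to the longest history.
all_weeks = list(range(1, int(weeks_elapsed.max()) + 1))
wide = df.pivot_table(index="user_id", columns="week", values="total_play_time",
                      aggfunc="sum", fill_value=0).reindex(columns=all_weeks, fill_value=0)

# Blank out the weeks that have not started yet for a given user.
not_started = pd.DataFrame({wk: wk > weeks_elapsed for wk in wide.columns})
wide = wide.mask(not_started)
wide.columns = [f"week{wk}" for wk in wide.columns]
print(wide)
```

With the hypothetical report date above, the table has 12 week columns. User 1 fills all of them, while user 2, who started later, has values only up to `week7`.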
【Comments】:
I get the following error when running your code. AttributeError Traceback (most recent call last)