Fill down a column date value until another date value is reached, then continue filling with the newly reached value

【Title】: Fill down a column date value until another date value is reached, then continue filling with the newly reached value 【Posted】: 2018-06-10 10:28:35 【Question】:

I have the following DataFrame:

         Date                 Team 1                Team 2  Score1  Score2
0    1-Oct-17                      1                   NaN       2     NaN
1       21:20          Chicago Cubs        Cincinnati Reds       1     3.0
2       21:15    Kansas City Royals   Arizona Diamondbacks       2    14.0
3       21:15    St.Louis Cardinals      Milwaukee Brewers       1     6.0
4   30-Sep-17                      1                   NaN       2     NaN
5       22:15     St.Louis Cardinals     Milwaukee Brewers       7     6.0
6       22:05           Chicago Cubs       Cincinnati Reds       9     0.0
7       22:05  San Francisco Giants       San Diego Padres       2     3.0
8       19:05         Boston Red Sox        Houston Astros       6     3.0
9   29-Sep-17                      1                   NaN       2     NaN
10      20:20           Chicago Cubs       Cincinnati Reds       5     4.0
11      19:05       New York Yankees     Toronto Blue Jays       4     0.0
12       2:15    Kansas City Royals         Detroit Tigers       1     4.0
13       2:10      Chicago White Sox    Los Angeles Angels       5     4.0

I need to fill the date values down so that they replace the time values, giving this result:

         Date                 Team 1                Team 2  Score1  Score2
0    1-Oct-17                      1                   NaN       2     NaN
1    1-Oct-17          Chicago Cubs        Cincinnati Reds       1     3.0
2    1-Oct-17    Kansas City Royals   Arizona Diamondbacks       2    14.0
3    1-Oct-17    St.Louis Cardinals      Milwaukee Brewers       1     6.0
4   30-Sep-17                      1                   NaN       2     NaN
5   30-Sep-17     St.Louis Cardinals     Milwaukee Brewers       7     6.0
6   30-Sep-17           Chicago Cubs       Cincinnati Reds       9     0.0
7   30-Sep-17  San Francisco Giants       San Diego Padres       2     3.0
8   30-Sep-17         Boston Red Sox        Houston Astros       6     3.0
9   29-Sep-17                      1                   NaN       2     NaN
10  29-Sep-17           Chicago Cubs       Cincinnati Reds       5     4.0
11  29-Sep-17       New York Yankees     Toronto Blue Jays       4     0.0
12  29-Sep-17    Kansas City Royals         Detroit Tigers       1     4.0
13  29-Sep-17      Chicago White Sox    Los Angeles Angels       5     4.0

【Comments】:

【Answer 1】:

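You can check the length of the values in the Date column. Where the length is not greater than 7 (the time strings), where replaces the value with NaN; ffill then forward-fills the missing values with the last date seen (the same as fillna with method ffill):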

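# Keep only the date strings (more than 7 characters); times become NaN and are forward-filled
df['Date'] = df['Date'].where(df['Date'].str.len() > 7).ffill()
# Similar idea: mask the short time strings (4 or 5 characters) instead
#df['Date'] = df['Date'].mask(df['Date'].str.len().isin([4,5])).ffill()
print(df)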
         Date                Team 1                Team 2  Score1  Score2
0    1-Oct-17                     1                   NaN       2     NaN
1    1-Oct-17          Chicago Cubs       Cincinnati Reds       1     3.0
2    1-Oct-17    Kansas City Royals  Arizona Diamondbacks       2    14.0
3    1-Oct-17    St.Louis Cardinals     Milwaukee Brewers       1     6.0
4   30-Sep-17                     1                   NaN       2     NaN
5   30-Sep-17    St.Louis Cardinals     Milwaukee Brewers       7     6.0
6   30-Sep-17          Chicago Cubs       Cincinnati Reds       9     0.0
7   30-Sep-17  San Francisco Giants      San Diego Padres       2     3.0
8   30-Sep-17        Boston Red Sox        Houston Astros       6     3.0
9   29-Sep-17                     1                   NaN       2     NaN
10  29-Sep-17          Chicago Cubs       Cincinnati Reds       5     4.0
11  29-Sep-17      New York Yankees     Toronto Blue Jays       4     0.0
12  29-Sep-17    Kansas City Royals        Detroit Tigers       1     4.0
13  29-Sep-17     Chicago White Sox    Los Angeles Angels       5     4.0
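
For a self-contained check, the length-based approach can be reproduced on a few rows of the sample data. This is only a minimal sketch: the shortened frame and the helper name `is_date` are illustrative and not part of the original answer.

```python
import pandas as pd

# A shortened copy of the question's data (only the columns needed here)
df = pd.DataFrame({
    "Date": ["1-Oct-17", "21:20", "21:15", "30-Sep-17", "22:15", "2:15"],
    "Team 1": ["1", "Chicago Cubs", "Kansas City Royals", "1",
               "St.Louis Cardinals", "Kansas City Royals"],
})

# Date strings such as "1-Oct-17" have 8 or 9 characters; times have 4 or 5
is_date = df["Date"].str.len() > 7

# where() keeps values where the condition is True and sets the rest to NaN;
# ffill() then carries the last seen date down over those gaps
df["Date"] = df["Date"].where(is_date).ffill()

print(df["Date"].tolist())
```

The length test relies on the two string shapes in this data set; a column with differently formatted dates would need another condition.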

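Another idea is to convert the values to datetimes and keep only those whose time is 0:00, since a date-only string parses to midnight while a time-only string keeps its time: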

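from datetime import time

import pandas as pd

# Date strings parse to midnight; time-only strings keep their time of day.
# Note: pandas 2.0+ may need pd.to_datetime(df['Date'], format='mixed') for such mixed strings.
df['Date'] = pd.to_datetime(df['Date'])
# Keep only the midnight values (the real dates) and forward-fill the rest
df['Date'] = df['Date'].where(df['Date'].dt.time == time(0, 0)).ffill()
print(df)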
         Date                Team 1                Team 2  Score1  Score2
0  2017-10-01                     1                   NaN       2     NaN
1  2017-10-01          Chicago Cubs       Cincinnati Reds       1     3.0
2  2017-10-01    Kansas City Royals  Arizona Diamondbacks       2    14.0
3  2017-10-01    St.Louis Cardinals     Milwaukee Brewers       1     6.0
4  2017-09-30                     1                   NaN       2     NaN
5  2017-09-30    St.Louis Cardinals     Milwaukee Brewers       7     6.0
6  2017-09-30          Chicago Cubs       Cincinnati Reds       9     0.0
7  2017-09-30  San Francisco Giants      San Diego Padres       2     3.0
8  2017-09-30        Boston Red Sox        Houston Astros       6     3.0
9  2017-09-29                     1                   NaN       2     NaN
10 2017-09-29          Chicago Cubs       Cincinnati Reds       5     4.0
11 2017-09-29      New York Yankees     Toronto Blue Jays       4     0.0
12 2017-09-29    Kansas City Royals        Detroit Tigers       1     4.0
13 2017-09-29     Chicago White Sox    Los Angeles Angels       5     4.0
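
A related variant, which is not part of the original answer, is to parse with an explicit format and let every non-matching value become NaT. The sketch below assumes that all dates look like `1-Oct-17` (format `%d-%b-%y`) with English month abbreviations; the small sample frame and the name `parsed` are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"Date": ["1-Oct-17", "21:20", "21:15", "30-Sep-17", "22:15", "0:00"]}
)

# Values that do not match the date format (the times) are coerced to NaT
parsed = pd.to_datetime(df["Date"], format="%d-%b-%y", errors="coerce")

# Forward-fill the NaT gaps with the most recent real date
df["Date"] = parsed.ffill()

print(df["Date"].dt.strftime("%Y-%m-%d").tolist())
```

Because the date rows are identified by their format rather than by a midnight timestamp, a game that starts at 0:00 is not mistaken for a date row in this variant.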

【Comments】:
