Pandas: can only convert an array of size 1 to a Python scalar

I have this dataframe, df_pm:

                             Player  GameWeek  Minutes  \
PlayerMatchesDetailID                                                 
1             Alisson         1       90   
2     Virgil van Dijk         1       90   
3        Joseph Gomez         1       90 

                             ForTeam               AgainstTeam  \
1                             Liverpool              Norwich City   
2                             Liverpool              Norwich City   
3                             Liverpool              Norwich City  

                             Goals  ShotsOnTarget  ShotsInBox  CloseShots  \
1                             0              0           0           0   
2                             1              1           1           1   
3                             0              0           0           0 
                     TotalShots  Headers  GoalAssists  ShotOnTargetCreated  \
1                             0        0            0                    0   
2                             1        1            0                    0   
3                             0        0            0                    0   
                       ShotInBoxCreated  CloseShotCreated  TotalShotCreated  \
1                             0                 0                 0   
2                             0                 0                 0   
3                             0                 0                 1  
                         HeadersCreated  
1                             0  
2                             0  
3                             0 

And a second dataframe, df_melt:

    MatchID GameWeek        Date                      Team  Home  \
0     46605        1  2019-08-09                 Liverpool  Home   
1     46605        1  2019-08-09              Norwich City  Away   
2     46606        1  2019-08-10           AFC Bournemouth  Home  

                  AgainstTeam  
0                Norwich City  
1                   Liverpool  
2            Sheffield United  
3             AFC Bournemouth  
...
575          Sheffield United  
576          Newcastle United  
577               Southampton

And this snippet of code, which uses both of them:

match_ids = []
home_away = []
dates = []

#For each row in the player matches dataframe...
for row in df_pm.itertuples():
    #Look up the match id from the team matches dataframe
    team = row.ForTeam
    againstteam = row.AgainstTeam
    gameweek = row.GameWeek
    print (team,againstteam,gameweek)

    match_id = df_melt.loc[(df_melt['GameWeek']==gameweek)
                          &(df_melt['Team']==team)
                          &(df_melt['AgainstTeam']==againstteam),
                          'MatchID'].item()

    date = df_melt.loc[(df_melt['GameWeek']==gameweek)
                          &(df_melt['Team']==team)
                          &(df_melt['AgainstTeam']==againstteam),
                          'Date'].item()

    home = df_melt.loc[(df_melt['GameWeek']==gameweek)
                          &(df_melt['Team']==team)
                          &(df_melt['AgainstTeam']==againstteam),
                          'Home'].item()

    match_ids.append(match_id)
    home_away.append(home)
    dates.append(date)

On the first iteration, it prints:

Liverpool
Norwich City
1

But then I get this error:

Traceback (most recent call last):
  File "tableau_data_generation.py", line 166, in <module>
    'MatchID'].item()
  File "/Users/me/anaconda2/envs/data_science/lib/python3.7/site-packages/pandas/core/base.py", line 652, in item
    return self.values.item()
ValueError: can only convert an array of size 1 to a Python scalar

Printing the whole df_melt dataframe, I see that these four date values are flawed:

540   46875       28         TBC               Aston Villa  Home   
541   46875       28         TBC          Sheffield United  Away   
...
548   46879       28         TBC           Manchester City  Home   
549   46879       28         TBC                   Arsenal  Away  

How do I fix this?

Answer

When you call item() on a Series, you should actually have received:

FutureWarning: `item` has been deprecated and will be removed in a future version

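Since item() was deprecated in version 0.25.0, it looks like you are using an outdated version of pandas, so you should probably start by upgrading. Note that the deprecation was reverted in pandas 1.0, where Series.item() is available again without a warning.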

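Even in newer versions of pandas you can still use item(), but on the underlying NumPy array (where, at least for now, it is not deprecated). So change your code to:

df_melt.loc[...].values.item()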
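To make the failure mode concrete, here is a minimal sketch on a tiny made-up frame. The name `df_demo` and its rows are illustrative assumptions, not the asker's data. It shows that `.values.item()` only succeeds when the filter selects exactly one row, and raises a ValueError when it selects none or several.

```python
import pandas as pd

# Hypothetical miniature of df_melt, for illustration only.
df_demo = pd.DataFrame({
    "MatchID": [46605, 46605, 46606],
    "GameWeek": [1, 1, 1],
    "Team": ["Liverpool", "Norwich City", "AFC Bournemouth"],
})

# Exactly one matching row: item() returns a plain Python scalar.
one = df_demo.loc[df_demo["Team"] == "Liverpool", "MatchID"].values.item()

# No matching row: the selection has size 0, so item() raises ValueError.
try:
    df_demo.loc[df_demo["Team"] == "Arsenal", "MatchID"].values.item()
    empty_error = None
except ValueError as exc:
    empty_error = str(exc)

# Several matching rows: the selection has size 3, so item() raises as well.
try:
    df_demo.loc[df_demo["GameWeek"] == 1, "MatchID"].values.item()
    many_error = None
except ValueError as exc:
    many_error = str(exc)

print(one, empty_error, many_error)
```

So the error in the question means that, for some row of df_pm, the three-way filter on df_melt did not select exactly one row.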
Another option is to use iloc[0], so you can also change your code to:

df_melt.loc[...].iloc[0]
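A short sketch of how iloc[0] behaves differently from item(), again on a made-up frame (`df_demo` is a hypothetical name, not part of the question). With several matches it quietly returns the first one, and with no match it raises IndexError rather than ValueError.

```python
import pandas as pd

# Hypothetical miniature of df_melt, for illustration only.
df_demo = pd.DataFrame({
    "MatchID": [46605, 46605, 46606],
    "GameWeek": [1, 1, 1],
    "Team": ["Liverpool", "Norwich City", "AFC Bournemouth"],
})

# Several rows match: iloc[0] silently returns the first of them.
first = df_demo.loc[df_demo["GameWeek"] == 1, "MatchID"].iloc[0]

# No rows match: positional access on an empty Series raises IndexError.
try:
    df_demo.loc[df_demo["Team"] == "Arsenal", "MatchID"].iloc[0]
    raised_index_error = False
except IndexError:
    raised_index_error = True

print(first, raised_index_error)
```

Whether taking the first match silently is acceptable depends on the data. If duplicates would indicate a problem, item() is the stricter check.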
Edit

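If df_melt contains no row meeting the given criteria, the solutions above still raise an exception (IndexError for iloc[0], ValueError for item()). To make your code resistant to this case (and return some default value instead), you can add a function that gets the given attribute (attr, actually a column) from the first row meeting the given criteria (gameweek, team, and againstteam):

def getAttr(gameweek, team, againstteam, attr, default=None):
    xx = df_melt.loc[(df_melt['GameWeek'] == gameweek)
                     & (df_melt['Team'] == team)
                     & (df_melt['AgainstTeam'] == againstteam)]
    # xx.empty is a plain Python bool, so test it with "not" rather than "~"
    return xx.iloc[0].loc[attr] if not xx.empty else default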
Then, instead of running all three ... = df_melt.loc[...].item() statements, run:

match_id = getAttr(gameweek, team, againstteam, 'MatchID', default=-1)
date = getAttr(gameweek, team, againstteam, 'Date')
home = getAttr(gameweek, team, againstteam, 'Home', default='????')
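The following self-contained sketch exercises such a helper on a two-row stand-in for df_melt. The names `get_attr` and `df_demo`, and the choice to pass the frame as an argument instead of reading a global, are assumptions made for illustration. It tests emptiness with `not rows.empty`, because `.empty` is a plain Python bool and applying `~` to a bool always yields a truthy integer.

```python
import pandas as pd

def get_attr(df, gameweek, team, againstteam, attr, default=None):
    """Return `attr` from the first row matching all three keys, else `default`."""
    rows = df.loc[(df["GameWeek"] == gameweek)
                  & (df["Team"] == team)
                  & (df["AgainstTeam"] == againstteam)]
    # `not rows.empty` is a real boolean test; `~rows.empty` would always be truthy.
    return rows.iloc[0][attr] if not rows.empty else default

# Hypothetical two-row stand-in for df_melt, for illustration only.
df_demo = pd.DataFrame({
    "MatchID": [46605, 46605],
    "GameWeek": [1, 1],
    "Date": ["2019-08-09", "2019-08-09"],
    "Team": ["Liverpool", "Norwich City"],
    "Home": ["Home", "Away"],
    "AgainstTeam": ["Norwich City", "Liverpool"],
})

# A matching fixture returns the stored values.
found_id = get_attr(df_demo, 1, "Liverpool", "Norwich City", "MatchID", default=-1)
found_home = get_attr(df_demo, 1, "Norwich City", "Liverpool", "Home", default="????")

# A fixture that is not in the frame returns the default instead of raising.
missing_id = get_attr(df_demo, 2, "Liverpool", "Arsenal", "MatchID", default=-1)
missing_date = get_attr(df_demo, 2, "Liverpool", "Arsenal", "Date")

print(found_id, found_home, missing_id, missing_date)
```

Collecting the rows where the default comes back is a simple way to find out which fixtures in df_pm have no counterpart in df_melt.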
