Unable to convert a list into a dataframe. Keep getting the error "ValueError: Must pass 2-d input. shape=(1, 4, 5)"
Posted: 2021-04-25 10:27:49
Question: I have 2 dfs:
dfMiss:
and
dfSuper:
I need to create a final output that summarizes the data from the 2 tables, as shown in the code below:
dfCity = dfSuper \
    .groupby(by='City').count() \
    .drop(columns='Superhero ID') \
    .rename(columns={'Superhero': 'Total count'})
print("This is the df city : ")
print(dfCity)
## Convert column MissionEndDate to DateTime format
for df in dfMiss:
    # Dates are interpreted as MM/dd/yyyy by default (dayfirst=False), so pass dayfirst=True
    df['Mission End date'] = pd.to_datetime(df['Mission End date'], dayfirst=True)
    # Get Year and Quarter, given Q1 2020 starts in April
    date = df['Mission End date'] - pd.DateOffset(months=3)
    df['Mission End quarter'] = date.dt.year.astype(str) + ' Q' + date.dt.quarter.astype(str)
## Get no. Superheros working per City per Quarter
dfCount = []
for dfM in dfMiss:
    # Merge DataFrames
    df = dfSuper.merge(dfM, left_on='Superhero ID', right_on='SID')
    df = df.pivot_table(index=['City', 'Superhero'], columns='Mission End quarter', aggfunc='nunique')
    # Get the first group (all the groups have the same values)
    df = df[df.columns[0][0]]
    # Group the values by City (effectively "collapsing" the 'Superhero' column)
    df = df.groupby(by=['City']).count()
    dfCount += [df]
## Get no. Superheros available per City per Quarter
dfFree = []
for dfC in dfCount:
    # Merge DataFrames
    df = dfCity.merge(right=dfC, on='City', how='outer').fillna(0)  # convert NaN values to 0
    # Subtract no. working superheros from total no. superheros per city
    for col in df.columns[1:]:
        df[col] = df['Total count'] - df[col]
    dfFree += [df.astype(int)]
print(dfFree)
dfResult = pd.DataFrame(dfFree)
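The problem is that when I try to convert dfFree into a DataFrame, I get the error:
"ValueError: Must pass 2-d input. shape=(1, 4, 5)"
The line that raises the error is
dfResult = pd.DataFrame(dfFree)
Does anyone know what this means and how I can convert the list into a df?
Thanks :)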
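Comments:
What code actually raises the error?
The last line: dfResult = pd.DataFrame(dfFree)
Based on the shape, pass in dfFree[0]
Wow, thanks Mad Physicist! Why does that work?!
The error says your shape has three dimensions, and the first one has size one. If you take the first element along the first dimension, what remains is a 2D array.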
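The shape argument in the comments can be checked with a small sketch. The frame below is a hypothetical stand-in for the single element of dfFree (4 cities by 5 columns, with made-up values and column names), and `np.shape` over the stacked arrays is only an approximation of what pandas sees when it is handed a list of DataFrames. `pd.concat` is shown as one alternative that also covers lists with more than one frame; it is not something the thread itself proposes.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for one element of dfFree: 4 cities x 5 columns.
frame = pd.DataFrame(
    np.arange(20).reshape(4, 5),
    index=pd.Index(["Asgard", "Gotham", "Metropolis", "New York"], name="City"),
    columns=["Total count", "2020 Q2", "2020 Q3", "2020 Q4", "2021 Q1"],
)
dfFree = [frame]  # a list holding a single DataFrame, as in the question

# Stacking the list adds a leading axis of length len(dfFree): 3 dimensions.
stacked_shape = np.shape([df.to_numpy() for df in dfFree])
print(stacked_shape)

# Fix suggested in the comments: take the one frame out of the list.
dfResult = dfFree[0]
print(dfResult.shape)

# Alternative (an assumption, not from the thread): concatenate the frames
# row-wise, which also works when the list holds several DataFrames.
combined = pd.concat(dfFree)
print(combined.shape)
```

Because the list has exactly one element, `dfFree[0]` and `pd.concat(dfFree)` give the same 4 x 5 table here.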
Answer 1:
Split your code up following SOLID, in particular separation of concerns. As written, it is not easy to read.
import pandas as pd

# Sample mission data
sid=[665544,665544,2121,665544,212121,123456,666666]
mission_end_date=["10/10/2020", "03/03/2021", "02/02/2021", "05/12/2020", "15/07/2021", "03/06/2021", "12/10/2020"]
# Sample superhero data
superherod_sid=[212121,364331,678523,432432,665544,123456,555555,666666,432432]
hero=["Spiderman","Ironman","Batman","Dr. Strange","Thor","Superman","Nightwing","Loki","Wolverine"]
city=["New York","New York","Gotham","New York","Asgard","Metropolis","Gotham","Asgard","New York"]
df_mission=pd.DataFrame({'sid':sid,'mission_end_date':mission_end_date})
df_super=pd.DataFrame({'sid':superherod_sid,'hero':hero, 'city':city})
# Keep every hero, including those without a mission
df=df_super.merge(df_mission,on="sid", how="left")
# The dates are day-first (e.g. "15/07/2021"), so say so explicitly
df['mission_end_date']=pd.to_datetime(df['mission_end_date'], dayfirst=True)
df['mission_end_date_quarter']=df['mission_end_date'].dt.quarter
df['mission_end_date_year']=df['mission_end_date'].dt.year
print(df.head(20))
pivot = df.pivot_table(index=['city', 'hero'], columns='mission_end_date_quarter', aggfunc='nunique').fillna(0)
print(pivot.head())
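The answer's code labels missions by calendar quarter, while the question's code counts Q1 from April. As a rough sketch of how that step could be isolated into its own function, in the spirit of the separation of concerns the answer recommends, the helper below reuses the question's three-month shift. The function name `fiscal_quarter_label` and the three sample dates are illustrative assumptions, not part of the original thread.

```python
import pandas as pd


def fiscal_quarter_label(dates: pd.Series) -> pd.Series:
    """Label each date as '<year> Q<n>', with Q1 starting in April.

    Hypothetical helper: shifts the dates back three months (as the
    question's code does) and then reads off the calendar year and quarter.
    """
    shifted = dates - pd.DateOffset(months=3)
    return shifted.dt.year.astype(str) + " Q" + shifted.dt.quarter.astype(str)


# Day-first sample dates taken from the answer's mission_end_date list.
dates = pd.Series(
    pd.to_datetime(["10/10/2020", "03/03/2021", "15/07/2021"], dayfirst=True)
)
labels = fiscal_quarter_label(dates).tolist()
print(labels)  # ['2020 Q3', '2020 Q4', '2021 Q2']
```

With this convention, a mission ending in March 2021 still belongs to the 2020 fiscal year, which is why the second label reads 2020 Q4.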