How to filter the items in a DataFrame based on a list in pandas?
[Title]: How to filter the items in a DataFrame based on a list in pandas? [Posted]: 2021-01-14 18:41:34 [Question]: I am new to coding and am trying to work with the following data:
df=
Position
A/C MECHANIC
A/C TECHNICIAN
A/C TECHNICIAN HELPER
ACCOUNTANT
ACCOUNTANT MANAGER
ACCOUNTING CLERK
ACCOUNTS AUDITOR
ACCOUNTS MANAGER
ACCOUNTS SUPERVISOR
ACTING HOSPITAL ADMINISTRATOR
ADMINISTRATION SECRETARY
ADMINISTRATIVE SUPERVISOR
ADMINISTRATIVE CLERK
ADMINISTRATIVE COORDINATOR
ADMINISTRATIVE DIRECTOR
ADMINISTRATIVE MANAGER
ADMINISTRATOR OF MED.INSURANCE
ADMINSTRATION OFFICE MANAGER
ADMISSION COUNTER CLERK
ADMISSION OFFICER
I have the following list:
name=['TECHNICIAN', 'MANAGER', 'CLERK', 'AUDITOR', 'SUPERVISOR', 'SECRETARY', 'COORDINATOR', 'DIRECTOR', 'OFFICER', 'SPECIALIST', 'PROGRAMMER', 'TYPIST', 'LIASON', 'DESIGNER', 'ENGINEER', 'ACCOUNTANT', 'ADMINISTRATOR', 'BAKER', 'COOK']
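I am trying to create a new DataFrame that takes each value from the list above, finds the positions that contain that word, and puts them in a column of the new DataFrame named after the word.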
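For a single keyword, the filtering step described here can be sketched as follows. This is only an illustration: it uses a shortened copy of the sample data, assumes the column is named 'Position', and passes regex=False so the keyword is treated as a plain substring.

```python
import pandas as pd

# Shortened copy of the sample data from the question (assumed column name: 'Position').
df = pd.DataFrame({"Position": [
    "A/C MECHANIC",
    "A/C TECHNICIAN",
    "A/C TECHNICIAN HELPER",
    "ACCOUNTANT",
    "ACCOUNTANT MANAGER",
    "ACCOUNTING CLERK",
]})

# Boolean mask: True where the position contains the keyword as a plain substring.
mask = df["Position"].str.contains("TECHNICIAN", regex=False)

# Keep only the matching values of the 'Position' column.
technicians = df.loc[mask, "Position"].tolist()
print(technicians)  # ['A/C TECHNICIAN', 'A/C TECHNICIAN HELPER']
```

Repeating this for every keyword gives one list of matches per keyword, and those lists generally have different lengths.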
This is the code I am using:
newdf = pd.DataFrame()
for i in name:
    print(i)
    newdf[i] = df[df['Position'].str.contains(i)]
I am trying to add each set of filtered values to a new column in "newdf".
When I run the code above, I get this error:
ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series
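One likely reading of this message is that the right-hand side of the assignment is a whole DataFrame rather than a single column. The sketch below, on a made-up three-row frame, only shows the difference between the two selections; whether the original assignment raises exactly this error may depend on the pandas version.

```python
import pandas as pd

# Tiny illustrative frame (assumed column name: 'Position').
df = pd.DataFrame({"Position": ["A/C TECHNICIAN", "ACCOUNTANT", "ACCOUNTANT MANAGER"]})
mask = df["Position"].str.contains("MANAGER", regex=False)

rows = df[mask]                    # all columns of the matching rows: a DataFrame
column = df.loc[mask, "Position"]  # one column of the matching rows: a Series

print(type(rows).__name__)    # DataFrame
print(type(column).__name__)  # Series
print(column.index.tolist())  # [2], the original row label is kept
```

Because the Series keeps its original row labels, columns built this way for different keywords would not line up row by row unless the index is reset first.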
I am trying to get the following output:
TECHNICIAN,               MANAGER,
A/C TECHNICIAN            ACCOUNTANT MANAGER
ALUMINUM TECHNICIAN       ACCOUNTS MANAGER
ANAESTHESIA TECHNICIAN    ADMINISTRATIVE MANAGER
APPLIANCE TECHNICIAN
BIOMEDICAL SENIOR
BIOMEDICAL TECHNICIAN
BOILER TECHNICIAN
COMPUTER TECHNICIAN
COMPUTER TECHNICIAN
COMPUTER TECHNICIAN
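[Comments]:
Please add the expected output.
@HenryYik I have added the expected output. Thanks for letting me know.
[Answer 1]: Create a dictionary with one filtered Series per keyword and pass it to concat: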
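import pandas as pd
# One Series of matching positions per keyword, renumbered from 0 so the columns line up
dfs = {i: df.loc[df['Position'].str.contains(i), 'Position'].reset_index(drop=True)
       for i in name}
# Join the Series side by side; the dictionary keys become the column names
newdf = pd.concat(dfs, axis=1)
print(newdf)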
              TECHNICIAN                       MANAGER  \
0         A/C TECHNICIAN            ACCOUNTANT MANAGER
1  A/C TECHNICIAN HELPER              ACCOUNTS MANAGER
2                    NaN        ADMINISTRATIVE MANAGER
3                    NaN  ADMINSTRATION OFFICE MANAGER

                     CLERK           AUDITOR  \
0         ACCOUNTING CLERK  ACCOUNTS AUDITOR
1     ADMINISTRATIVE CLERK               NaN
2  ADMISSION COUNTER CLERK               NaN
3                      NaN               NaN

                  SUPERVISOR                 SECRETARY  \
0        ACCOUNTS SUPERVISOR  ADMINISTRATION SECRETARY
1  ADMINISTRATIVE SUPERVISOR                       NaN
2                        NaN                       NaN
3                        NaN                       NaN

                  COORDINATOR                 DIRECTOR  \
0  ADMINISTRATIVE COORDINATOR  ADMINISTRATIVE DIRECTOR
1                         NaN                      NaN
2                         NaN                      NaN
3                         NaN                      NaN

             OFFICER  SPECIALIST  PROGRAMMER  TYPIST  LIASON  DESIGNER  \
0  ADMISSION OFFICER         NaN         NaN     NaN     NaN       NaN
1                NaN         NaN         NaN     NaN     NaN       NaN
2                NaN         NaN         NaN     NaN     NaN       NaN
3                NaN         NaN         NaN     NaN     NaN       NaN

   ENGINEER          ACCOUNTANT                   ADMINISTRATOR  BAKER  \
0       NaN          ACCOUNTANT   ACTING HOSPITAL ADMINISTRATOR    NaN
1       NaN  ACCOUNTANT MANAGER  ADMINISTRATOR OF MED.INSURANCE    NaN
2       NaN                 NaN                             NaN    NaN
3       NaN                 NaN                             NaN    NaN

   COOK
0   NaN
1   NaN
2   NaN
3   NaN
newdf.to_csv('file.csv', index=False)
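To see what reset_index(drop=True) contributes in this approach, here is a small sketch on a made-up four-row frame. The helper name `matches` and the shortened keyword list are assumptions for illustration, and regex=False is added so keywords are matched as plain substrings.

```python
import pandas as pd

# Small illustrative frame (assumed column name: 'Position').
df = pd.DataFrame({"Position": [
    "A/C TECHNICIAN",
    "ACCOUNTANT MANAGER",
    "ACCOUNTING CLERK",
    "ACCOUNTS MANAGER",
]})
name = ["TECHNICIAN", "MANAGER", "CLERK"]

def matches(keyword, reset):
    """Return the positions containing `keyword` (hypothetical helper)."""
    col = df.loc[df["Position"].str.contains(keyword, regex=False), "Position"]
    return col.reset_index(drop=True) if reset else col

# Without the reset, concat aligns on the original row labels 0..3,
# so each value stays on its original row and the gaps are filled with NaN.
sparse = pd.concat({k: matches(k, reset=False) for k in name}, axis=1)

# With the reset, every column starts at row 0 and only the tail is padded.
compact = pd.concat({k: matches(k, reset=True) for k in name}, axis=1)

print(sparse.shape)   # (4, 3)
print(compact.shape)  # (2, 3)
```

The compact frame has as many rows as the keyword with the most matches, which is the layout shown in the printed output above.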
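[Comments]:
@jazrael When I use it, it doesn't filter anything and just gives me "df" again.
I just updated my question with the output I am trying to get.
I tried both pieces of code; unfortunately I got "df" as my output.
This works, mate. But could you help me get a new DataFrame instead of creating a new list?
Awesome.. thanks aLOOOTT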