Pandas: Fill a dictionary with DataFrames depending on a switch
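Posted: 2022-01-02 06:35:44
Question:
Background: I have several DataFrames that can each be turned on or off with a switch. I want to fill a dictionary with every DataFrame whose switch is on, and then loop over those DataFrames.
Problem: I don't know how to build the dictionary dynamically so that it only contains a DataFrame when its switch is on.
My attempt: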
import pandas as pd
sw_a = True
sw_b = False
sw_c = True
a = pd.DataFrame({'IDs':[1234,5346,1234,8793,8793],
                  'Cost':[1.1,1.2,1.3,1.4,1.5],
                  'Names':['APPLE','Orange','STRAWBERRY','Grape','Blue']}) if sw_a == True else []
b = pd.DataFrame({'IDs':[1,2],
                  'Cost':[1.1,1.2],
                  'Names':['APPLE1','Blue1']}) if sw_b == True else []
c = pd.DataFrame({'IDs':[12],
                  'Cost':[1.5],
                  'Names':['APPLE2']}) if sw_c == True else []
total = {"first":a,"second":b,"third":c}
for df in total:
    temp_cost = sum(total[df]['Cost'])
    print(f'The number of fruits for {df} is {len(total[df])} and the cost is {temp_cost}')
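The approach above doesn't work because every key is always included in the dictionary: when a switch is off, the value is an empty list instead of the entry being left out altogether, so the loop fails as soon as it tries to read the 'Cost' column from it.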
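To make the failure concrete, here is a minimal sketch. It assumes plain dicts as stand-ins for the DataFrames so that it runs without pandas; the names `still_present` and `failed_keys` are illustrative and not part of the original code.

```python
sw_a = True
sw_b = False

# Plain dicts stand in for DataFrames here (an assumption, to keep the sketch dependency-free).
a = {'Cost': [1.1, 1.2]} if sw_a else []
b = {'Cost': [1.1, 1.2]} if sw_b else []

total = {"first": a, "second": b}

# The switched-off entry is still a key in the dictionary.
still_present = "second" in total

failed_keys = []
for name in total:
    try:
        sum(total[name]['Cost'])
    except TypeError:
        # An empty list cannot be indexed with a string such as 'Cost'.
        failed_keys.append(name)

print(still_present, failed_keys)
```

The switched-off key survives in `total`, and the lookup on its empty-list value is what raises the error, which is why the entry has to be excluded rather than replaced with a placeholder.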
Answer 1:
My setup is similar to yours, but I don't bother with the switches when assigning each DataFrame:
import pandas as pd
sw_a = True
sw_b = False
sw_c = True
a = pd.DataFrame({'IDs':[1234,5346,1234,8793,8793],
                  'Cost':[1.1,1.2,1.3,1.4,1.5],
                  'Names':['APPLE','Orange','STRAWBERRY','Grape','Blue']})
b = pd.DataFrame({'IDs':[1,2],
                  'Cost':[1.1,1.2],
                  'Names':['APPLE1','Blue1']})
c = pd.DataFrame({'IDs':[12],
                  'Cost':[1.5],
                  'Names':['APPLE2']})
total = {"first":a,"second":b,"third":c} # don't worry about the switches yet.
Only now do we filter:
list_switches = [sw_a, sw_b, sw_c] # the switches! finally!
total_filtered = {tup[1]:total[tup[1]] for tup in zip(list_switches, total) if tup[0]}
Then carry on as you did, looping over the filtered dictionary:
for df in total_filtered:
    temp_cost = sum(total_filtered[df]['Cost'])
    print(f'The number of fruits for {df} is {len(total_filtered[df])} and the cost is {temp_cost}')
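Output:
The number of fruits for first is 5 and the cost is 6.5
The number of fruits for third is 1 and the cost is 1.5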
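The filtering step in this answer works because iterating over a dict yields its keys in insertion order (guaranteed since Python 3.7), so the switches line up positionally with the keys. As a hedged alternative sketch, not taken from the answer itself, the standard library's `itertools.compress` can do the same selection. Strings stand in for the DataFrames here so the snippet runs without pandas.

```python
from itertools import compress

list_switches = [True, False, True]
# Strings stand in for the DataFrames a, b, c (an assumption for a dependency-free sketch).
total = {"first": "df_a", "second": "df_b", "third": "df_c"}

# zip pairs each switch with the dict key at the same position.
paired = list(zip(list_switches, total))

# compress keeps only the (key, value) items whose matching switch is truthy.
total_filtered = dict(compress(total.items(), list_switches))

print(paired)
print(total_filtered)
```

Either form depends on the switch list being kept in the same order as the dictionary keys, so the two should be defined next to each other.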
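Edit:
You can get a little fancier with the zip function. For example, if you are building the lists of DataFrames, DataFrame names, and switches dynamically, and can guarantee that they always have the same length, you could do something like this: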
# pretend these three lists are coming from somewhere else and can have variable length, rather than being hard-coded.
list_dfs = [a,b,c]
list_switches = [sw_a, sw_b, sw_c]
list_names = ["first", "second", "third"]
# use a zip object over the three lists.
zipped = zip(list_dfs, list_switches, list_names)
total = {tup[2] : tup[0] for tup in zipped if tup[1]}
for df in total:
    temp_cost = sum(total[df]['Cost'])
    print(f'The number of fruits for {df} is {len(total[df])} and the cost is {temp_cost}')
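For readers unfamiliar with the comprehension above, the following sketch spells out what it does. It is an illustration under assumptions: strings stand in for the DataFrames so it runs without pandas, the tuple is unpacked into the hypothetical names `df`, `switch`, and `name`, and an explicit length check is added because `zip` silently stops at the shortest input.

```python
# Strings stand in for DataFrames (an assumption, so the sketch runs without pandas).
list_dfs = ["df_a", "df_b", "df_c"]
list_switches = [True, False, True]
list_names = ["first", "second", "third"]

# zip truncates to the shortest list without warning, so check the lengths up front.
if not (len(list_dfs) == len(list_switches) == len(list_names)):
    raise ValueError("list_dfs, list_switches and list_names must have the same length")

# The same dict comprehension, with each tuple unpacked into named variables.
total = {
    name: df
    for df, switch, name in zip(list_dfs, list_switches, list_names)
    if switch
}

# Equivalent explicit loop, for comparison.
total_loop = {}
for df, switch, name in zip(list_dfs, list_switches, list_names):
    if switch:
        total_loop[name] = df

print(total)
```

Unpacking into names avoids the positional `tup[0]`, `tup[1]`, `tup[2]` lookups and makes it easier to see which list supplies the key, the value, and the filter.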
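Comments:
This works great, but I'm not sure how this line works... total = {tup[2] : tup[0] for tup in zipped if tup[1]}
@JonathanHay - It's a dict comprehension over a zip object. Are you familiar with those concepts? Thanks for the upvote and the accept, by the way.
Answer 2:
Consider something like this.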
import pandas as pd
sw_a = True
sw_b = False
sw_c = True
a = pd.DataFrame({'IDs':[1234,5346,1234,8793,8793],
                  'Cost':[1.1,1.2,1.3,1.4,1.5],
                  'Names':['APPLE','Orange','STRAWBERRY','Grape','Blue']})
b = pd.DataFrame({'IDs':[1,2],
                  'Cost':[1.1,1.2],
                  'Names':['APPLE1','Blue1']})
c = pd.DataFrame({'IDs':[12],
                  'Cost':[1.5],
                  'Names':['APPLE2']})
# start with an empty dictionary and add a DataFrame only when its switch is on.
total = {}
if sw_a == True:
    total['sw_a'] = a
if sw_b == True:
    total['sw_b'] = b
if sw_c == True:
    total['sw_c'] = c
print(total)
for df in total:
    temp_cost = sum(total[df]['Cost'])
    print(f'The number of fruits for {df} is {len(total[df])} and the cost is {temp_cost}')
Output of the loop:
The number of fruits for sw_a is 5 and the cost is 6.5
The number of fruits for sw_c is 1 and the cost is 1.5