如何使用 reduce() 计算合并值?

Posted

技术标签:

【中文标题】如何使用 reduce() 计算合并值?【英文标题】:How to count merged values using reduce()? 【发布时间】:2020-09-14 08:21:17 【问题描述】:

我有一个包含四个数据框的列表,并希望根据公共列 ('id') 将它们全部合并到一个表中,然后计算这些合并的行。我想我可以通过使用 reduce()lambda 表达式来做到这一点:

number_of_rows = reduce(lambda a, b: a['id'].isin(b['id']).value_counts(), list_of_dataframes)

但它当然行不通。我的问题是我不知道如何编写一个代码,它将第一对 value_counts() 函数中的公共行数添加到下一对。提前致谢!

数据框列表:

list_of_df = [df_hops, df_method_mash_temp, df_ingredients_malt, whole_table]

每个数据框:

df_hops:

|    |   hops.name     | hops.add  | hops.attribute  | hops.amount.value  | hops.amount.unit  | id  |                name                 |
+----+-----------------+-----------+-----------------+--------------------+-------------------+-----+-------------------------------------+
| 0  | Fuggles         | start     | bitter          |             25.00  | grams             |  1  | Buzz                                |
| 1  | First Gold      | start     | bitter          |             25.00  | grams             |  1  | Buzz                                |
| 2  | Fuggles         | middle    | flavour         |             37.50  | grams             |  1  | Buzz                                |
| 3  | First Gold      | middle    | flavour         |             37.50  | grams             |  1  | Buzz                                |
| 4  | Cascade         | end       | flavour         |             37.50  | grams             |  1  | Buzz                                |
| 5  | Amarillo        | start     | bitter          |             13.80  | grams             |  2  | Trashy Blonde                       |
| 6  | Simcoe          | start     | bitter          |             13.80  | grams             |  2  | Trashy Blonde                       |
| 7  | Amarillo        | end       | flavour         |             26.30  | grams             |  2  | Trashy Blonde                       |
| 8  | Motueka         | end       | flavour         |             18.80  | grams             |  2  | Trashy Blonde                       |
| 9  | Bramling Cross  | middle    | bitter          |             10.00  | grams             |  3  | Berliner Weisse With Yuzu - B-Sides |
+----+-----------------+-----------+-----------------+--------------------+-------------------+-----+-------------------------------------+

df_method_mash_temp:

|    | method.mash_temp.duration  | method.mash_temp.temp.value  | method.mash_temp.temp.unit  | id  |                name                 |
+----+----------------------------+------------------------------+-----------------------------+-----+-------------------------------------+
| 0  | 75.0                       |                        64.0  | celsius                     |  1  | Buzz                                |
| 1  | NaN                        |                        69.0  | celsius                     |  2  | Trashy Blonde                       |
| 2  | 10.0                       |                        60.0  | celsius                     |  3  | Berliner Weisse With Yuzu - B-Sides |
| 3  | 30.0                       |                        65.0  | celsius                     |  3  | Berliner Weisse With Yuzu - B-Sides |
+----+----------------------------+------------------------------+-----------------------------+-----+-------------------------------------+

df_ingredients_malt:

|    | ingredients.malt.name   | ingredients.malt.amount.value  | ingredients.malt.amount.unit  | id  |                name                 |
+----+-------------------------+--------------------------------+-------------------------------+-----+-------------------------------------+
| 0  | Maris Otter Extra Pale  |                       3.30000  | kilograms                     |  1  | Buzz                                |
| 1  | Caramalt                |                       0.20000  | kilograms                     |  1  | Buzz                                |
| 2  | Munich                  |                       0.40000  | kilograms                     |  1  | Buzz                                |
| 3  | Maris Otter Extra Pale  |                       3.25000  | kilograms                     |  2  | Trashy Blonde                       |
| 4  | Caramalt                |                       0.20000  | kilograms                     |  2  | Trashy Blonde                       |
| 5  | Munich                  |                       0.40000  | kilograms                     |  2  | Trashy Blonde                       |
| 6  | Propino Pale Malt       |                       1.63000  | kilograms                     |  3  | Berliner Weisse With Yuzu - B-Sides |
+----+-------------------------+--------------------------------+-------------------------------+-----+-------------------------------------+

whole_table(这是上面其他三个来自的主表):

|    | id  |                name                  |             tagline               | first_brewed  |                                                                                                                                                                                                                          description                                                                                                                                                                                                                           |               image_url                | abv   | ibu   | target_fg  | target_og  | ebc   |  srm   | ph   | attenuation_level  |                                                    food_pairing                                                      |                                                                                      brewers_tips                                                                                       |     contributed_by      | volume.value  | volume.unit  | boil_volume.value  | boil_volume.unit  |                                                                                                                method.mash_temp                                                                                                                  | method.fermentation.temp.value  | method.fermentation.temp.unit  |                                              method.twist                                                |                                                                                                                                                                     ingredients.malt                                                                                                                                                                       |                                                                                                                                                                                                                                                               ingredients.hops                                                                                                                                                                                                                                                                 |      ingredients.yeast      |
+----+-----+--------------------------------------+-----------------------------------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------+-------+-------+------------+------------+-------+--------+------+--------------------+----------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+---------------+--------------+--------------------+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+--------------------------------+----------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------+
| 0  |  1  | Buzz                                 | A Real Bitter Experience.         | 09/2007       | A light, crisp and bitter IPA brewed with English and American hops. A small batch brewed only once.                                                                                                                                                                                                                                                                                                                                                           | https://images.punkapi.com/v2/keg.png  | 4.50  | 60.0  |    1010.0  |    1044.0  | 20.0  | 10.00  | 4.4  |             75.00  | [Spicy chicken tikka masala, Grilled chicken quesadilla, Caramel toffee cake]                                        | The earthy and floral aromas from the hops can be overpowering. Drop a little Cascade in at the end of the boil to lift the profile with a bit of citrus.                               | Sam Mason <samjbmason>  |           20  | litres       |                25  | litres            | ['temp': 'value': 64, 'unit': 'celsius', 'duration': 75]                                                                                                                                                                                     |                           19.0  | celsius                        | None                                                                                                     | ['name': 'Maris Otter Extra Pale', 'amount': 'value': 3.3, 'unit': 'kilograms', 'name': 'Caramalt', 'amount': 'value': 0.2, 'unit': 'kilograms', 'name': 'Munich', 'amount': 'value': 0.4, 'unit': 'kilograms']                                                                                                                                | ['name': 'Fuggles', 'amount': 'value': 25, 'unit': 'grams', 'add': 'start', 'attribute': 'bitter', 'name': 'First Gold', 'amount': 'value': 25, 'unit': 'grams', 'add': 'start', 'attribute': 'bitter', 'name': 'Fuggles', 'amount': 'value': 37.5, 'unit': 'grams', 'add': 'middle', 'attribute': 'flavour', 'name': 'First Gold', 'amount': 'value': 37.5, 'unit': 'grams', 'add': 'middle', 'attribute': 'flavour', 'name': 'Cascade', 'amount': 'value': 37.5, 'unit': 'grams', 'add': 'end', 'attribute': 'flavour']  | Wyeast 1056 - American Ale™ |
| 1  |  2  | Trashy Blonde                        | You Know You Shouldn't            | 04/2008       | A titillating, neurotic, peroxide punk of a Pale Ale. Combining attitude, style, substance, and a little bit of low self esteem for good measure; what would your mother say? The seductive lure of the sassy passion fruit hop proves too much to resist. All that is even before we get onto the fact that there are no additives, preservatives, pasteurization or strings attached. All wrapped up with the customary BrewDog bite and imaginative twist.  | https://images.punkapi.com/v2/2.png    | 4.10  | 41.5  |    1010.0  |    1041.7  | 15.0  | 15.00  | 4.4  |             76.00  | [Fresh crab with lemon, Garlic butter dipping sauce, Goats cheese salad, Creamy lemon bar doused in powdered sugar]  | Be careful not to collect too much wort from the mash. Once the sugars are all washed out there are some very unpleasant grainy tasting compounds that can be extracted into the wort.  | Sam Mason <samjbmason>  |           20  | litres       |                25  | litres            | ['temp': 'value': 69, 'unit': 'celsius', 'duration': None]                                                                                                                                                                                   |                           18.0  | celsius                        | None                                                                                                     | ['name': 'Maris Otter Extra Pale', 'amount': 'value': 3.25, 'unit': 'kilograms', 'name': 'Caramalt', 'amount': 'value': 0.2, 'unit': 'kilograms', 'name': 'Munich', 'amount': 'value': 0.4, 'unit': 'kilograms']                                                                                                                               | ['name': 'Amarillo', 'amount': 'value': 13.8, 'unit': 'grams', 'add': 'start', 'attribute': 'bitter', 'name': 'Simcoe', 'amount': 'value': 13.8, 'unit': 'grams', 'add': 'start', 'attribute': 'bitter', 'name': 'Amarillo', 'amount': 'value': 26.3, 'unit': 'grams', 'add': 'end', 'attribute': 'flavour', 'name': 'Motueka', 'amount': 'value': 18.8, 'unit': 'grams', 'add': 'end', 'attribute': 'flavour']                                                                                                                | Wyeast 1056 - American Ale™ |
| 2  |  3  | Berliner Weisse With Yuzu - B-Sides  | Japanese Citrus Berliner Weisse.  | 11/2015       | Japanese citrus fruit intensifies the sour nature of this German classic.                                                                                                                                                                                                                                                                                                                                                                                      | https://images.punkapi.com/v2/keg.png  | 4.20  |  8.0  |    1007.0  |    1040.0  |  8.0  |  4.00  | 3.2  |             83.00  | [Smoked chicken wings, Miso ramen, Yuzu cheesecake]                                                                  | Clean everything twice. All you want is the clean sourness of lactobacillus.                                                                                                            | Sam Mason <samjbmason>  |           20  | litres       |                25  | litres            | ['temp': 'value': 60, 'unit': 'celsius', 'duration': 10, 'temp': 'value': 65, 'unit': 'celsius', 'duration': 30, 'temp': 'value': 72, 'unit': 'celsius', 'duration': 10, 'temp': 'value': 78, 'unit': 'celsius', 'duration': 5]  |                           21.0  | celsius                        | Soured naturally using the kettle souring technique, Yuzu fruit: 50g at middle, Yuzu juice: 200ml at FV  | ['name': 'Propino Pale Malt', 'amount': 'value': 1.63, 'unit': 'kilograms', 'name': 'Wheat Malt', 'amount': 'value': 1.63, 'unit': 'kilograms', 'name': 'Propino Pale Malt for kettle souring', 'amount': 'value': 0.03, 'unit': 'kilograms', 'name': 'Acidulated Malt for kettle souring', 'amount': 'value': 0.03, 'unit': 'kilograms']  | ['name': 'Bramling Cross', 'amount': 'value': 10, 'unit': 'grams', 'add': 'middle', 'attribute': 'bitter']                                                                                                                                                                                                                                                                                                                                                                                                                                 | Wyeast 1056 - American Ale™ |
+----+-----+--------------------------------------+-----------------------------------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------+-------+-------+------------+------------+-------+--------+------+--------------------+----------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+---------------+--------------+--------------------+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+--------------------------------+----------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------+

@Anshul:

list_of_df = [df_hops,df_method_mash_temp,df_ingredients_malt,whole_table]

merged_count = 0

def merge_dfs(df_a, df_b):
    df = df_a.merge(df_b, on=['id', 'name'], how='outer')

    global merged_count
    merged_count += df_a.shape[0] + df_b.shape[0] - df.shape[0]
    return df

# use reduce function to get the final merged dataframe
df_final = reduce(merge_dfs, list_of_df)

print(merged_count)

我的其余代码是来自 API 的请求和为了创建您在上面看到的表而进行的简单规范化。我认为不需要在此问题的上下文中粘贴此处,但如果您希望我将其张贴在这里,只需写下,我会这样做。

【问题讨论】:

请分享您的数据的小样本,以及您的预期输出 使您的累加器成为一个元组:(List[pd.DataFrame], int),初始值为(init_df_list, 0),并使您的单曲面运算成为一个合并和递增的 lambda。 你能用更简单的方式解释一下吗?如果您根据我的表格告诉我更多信息,那就太好了。 【参考方案1】:

我很好奇,想试试这个。我创建了自己的一组示例数据框,使用“reduce”实现了合并数据框,使用“全局”变量来获取合并的行总数。这就是我所做的,它对我有用:

import pandas as pd
from functools import reduce

data1 = 
"Id": [1,2,3,4,5], 
"Data": ['a', 'b', 'c', 'd', 'e']


data2 = 
"Id": [2,3,4,5,6], 
"Data": ['b', 'c', 'd', 'e', 'f']


data3 = 
"Id": [3,4,5,6,7], 
"Data": ['c', 'd', 'e', 'f', 'g']


data4 = 
"Id": [4,5,6,7,8], 
"Data": ['d', 'e', 'f', 'g', 'h']


df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
df3 = pd.DataFrame(data3)
df4 = pd.DataFrame(data4)

数据框:

df1:

  Data  Id
0    a   1
1    b   2
2    c   3
3    d   4
4    e   5

df2:

  Data  Id
0    b   2
1    c   3
2    d   4
3    e   5
4    f   6

df3:

  Data  Id
0    c   3
1    d   4
2    e   5
3    f   6
4    g   7

df4:

  Data  Id
0    d   4
1    e   5
2    f   6
3    g   7
4    h   8

.

# define a counter variable to be used as a global
merged_count = 0

# define the function to be called via reduce
def merge_dfs(df_a, df_b):
    df = df_a.merge(df_b, on=['Id', 'Data'], how='outer')

    global merged_count
    merged_count += df_a.shape[0] + df_b.shape[0] - df.shape[0]
    return df

# use reduce function to get the final merged dataframe
df_final = reduce(merge_dfs, [df1, df2, df3, df4])

print(df_final)
print(merged_count)

df_final:

  Data  Id
0    a   1
1    b   2
2    c   3
3    d   4
4    e   5
5    f   6
6    g   7
7    h   8

合并的行数:

12

如果这对您有帮助,您可以尝试在您的场景中实现它。 如果有更好的方法来实现这一点,我真的很想知道并在需要时改进这个解决方案。

【讨论】:

谢谢你的努力!我粘贴了我的表格,以便更容易理解我的意思。 我建议的方法应该可行。您将不得不根据需要更改函数 merge_dfs(concat 或 join 等)。 IIUC 的主要问题是如何获得合并行的计数,我想我的解决方案已经表明了这一点。我在这里错过了什么吗? 好的,所以我尝试了你的方法和函数 print(merged_count) 给了我 -3962 所以出了点问题 当然。啊,我忘了告诉你,我添加到我的帖子中的表格只是一个示例。每个数据框都有很多行,但我认为现在没关系。输出可能是 3962 但当然不是 -3962。 我想我找到了问题所在。问题是您尝试合并数据帧的键不是至少一个数据帧中的主键。也就是说,合并正在进入笛卡尔积并最终得到具有更多行的结果数据框。

以上是关于如何使用 reduce() 计算合并值?的主要内容,如果未能解决你的问题,请参考以下文章

js:Array.reduce累加计算合并数组

如何使用给定的reduce函数基于pyspark中的字段合并多个JSON数据行

如果可能的话,Java如何使用reduce函数将两个数组减少/合并为一个?

为啥不在 Stream API 中使用 reduce 来合并多个映射?

如何使用pythons内置map和reduce函数计算字符串中的字母频率

如何使用 Map Reduce 改进聚合处理时间?