Merge the data created in a loop (python)

【Title】: Merge the data created in a loop (python) 【Posted】: 2021-06-07 22:11:14 【Question】:

I have a simple dataset:

import pandas as pd
data = [['A', 10,16], ['B', 15,11], ['C', 14,8]] 
df = pd.DataFrame(data, columns = ['Name', 'Apple','Pear']) 

Output
  Name  Apple  Pear
0    A     10    16
1    B     15    11
2    C     14     8

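I want to rank the places by the quantity of each fruit, apples and pears. Rules:

    1. Determine the difference between places, for apples and for pears separately.
    2. Rank the differences by place. The closer the quantities of two places, the lower the rank.
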
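The first rule amounts to a pairwise absolute difference between places. A minimal sketch of that step with NumPy broadcasting, using only the Apple values from the dataset above (the variable name `apple` is just for illustration):

```python
import numpy as np

apple = np.array([10, 15, 14])  # Apple counts for places A, B, C

# apple[:, None] has shape (3, 1); subtracting it from the shape-(3,) array
# broadcasts to a 3x3 matrix whose [i, j] entry is apple[j] - apple[i].
dif = np.abs(apple - apple[:, None])
print(dif)
# [[0 5 4]
#  [5 0 1]
#  [4 1 0]]
```

Reading the first row: place A differs from B by 5 and from C by 4, so for A the closer place is C (rank 1) and B comes second (rank 2). The zeros on the diagonal are each place compared with itself.
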
# apple
dif = abs(df['Apple'].values - df['Apple'].values[:, None])
df_apple  = pd.concat((df['Name'], pd.DataFrame(dif, columns = df['Name'])), axis=1)
df_apple1 = pd.melt(df_apple, id_vars = ['Name'], value_name='Difference_apple')
df_apple1 = df_apple1[df_apple1.Difference_apple != 0]
df_apple1['Ranking_apple'] = df_apple1.groupby('variable')['Difference_apple'].rank(method = 'dense', ascending = True)
df_apple1 = df_apple1[["variable","Name","Ranking_apple"]]
df_apple1

# Output - apple
    variable    Name    Ranking_apple
1   A   B   2.0
2   A   C   1.0
3   B   A   2.0
5   B   C   1.0
6   C   A   2.0
7   C   B   1.0
# pear
dif = abs(df['Pear'].values - df['Pear'].values[:, None])
df_pear  = pd.concat((df['Name'], pd.DataFrame(dif, columns = df['Name'])), axis=1)
df_pear1 = pd.melt(df_pear, id_vars = ['Name'], value_name='Difference_pear')
df_pear1 = df_pear1[df_pear1.Difference_pear != 0]
df_pear1['Ranking_pear'] = df_pear1.groupby('variable')['Difference_pear'].rank(method = 'dense', ascending = True)
df_pear1 = df_pear1[["variable","Name","Ranking_pear"]]
df_pear1

# output-pear
    variable    Name    Ranking_pear
1   A   B   1.0
2   A   C   2.0
3   B   A   2.0
5   B   C   1.0
6   C   A   2.0
7   C   B   1.0

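That is the algorithm for each fruit. Since the logic is the same, I could run it in a loop over the fruits. What I am not sure about is how to merge the per-fruit results, because I need the final output to look like this: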

new_df = pd.merge(df_apple1, df_pear1,  how='inner', left_on=['variable','Name'], right_on = ['variable','Name'])

new_df = new_df[["variable","Name","Ranking_apple","Ranking_pear"]]

new_df

# output
    variable    Name    Ranking_apple   Ranking_pear
0   A   B   2.0 1.0
1   A   C   1.0 2.0
2   B   A   2.0 2.0
3   B   C   1.0 1.0
4   C   A   2.0 2.0
5   C   B   1.0 1.0

I would appreciate any ideas. Thanks.

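【Comments】:

What is the question? It seems you already have the expected output. Are you just trying to generalize it?
Yes, I want to use one algorithm for multiple columns. Thanks.
Great, I hope the answer meets your needs.

【Answer 1】: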

If you want to generalize your approach to any number of fruits, you can do the following:

import pandas as pd

data = [['A', 10,16], ['B', 15,11], ['C', 14,8]] 
df = pd.DataFrame(data, columns = ['Name', 'Apple','Pear']) 

# all fruit: every column except 'Name'
fruitcols = df.columns.values.tolist()
fruitcols.remove('Name')
parts = []
for col in fruitcols:
    # pairwise absolute differences between places for this fruit
    dif = abs(df[col].values - df[col].values[:, None])
    # the '{}' placeholder is required, otherwise format() ignores col
    diff_col = 'Difference_{}'.format(col)
    rank_col = 'Ranking_{}'.format(col)
    df_frt  = pd.concat((df['Name'], pd.DataFrame(dif, columns = df['Name'])), axis=1)
    df_frt1 = pd.melt(df_frt, id_vars = ['Name'], value_name=diff_col)

    # drop zero differences (each place compared with itself)
    df_frt1 = df_frt1[df_frt1[diff_col] != 0].copy()
    df_frt1[rank_col] = df_frt1.groupby('variable')[diff_col].rank(method = 'dense', ascending = True)
    df_frt1 = df_frt1[["variable","Name",rank_col]]
    parts.append(df_frt1)

# put the per-fruit results side by side, then drop the repeated key columns
final = pd.concat(parts, axis=1)
final.loc[:,~final.columns.duplicated()]


    variable    Name    Ranking_Apple   Ranking_Pear
1   A           B       2.0             1.0
2   A           C       1.0             2.0
3   B           A       2.0             2.0
5   B           C       1.0             1.0
6   C           A       2.0             2.0
7   C           B       1.0             1.0
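The side-by-side `pd.concat` in the loop lines rows up by index, which works here because every fruit yields the same set of rows. As a hedged alternative, the sketch below joins the per-fruit results on the `variable` and `Name` keys instead, the same way the `pd.merge` call in the question does. The helper names `rank_one_column` and `rank_all` are made up for this example and are not part of pandas.

```python
from functools import reduce

import numpy as np
import pandas as pd


def rank_one_column(df, col, id_col="Name"):
    """Rank, for each place, the other places by closeness in `col` (1 = closest)."""
    values = df[col].to_numpy()
    # Pairwise absolute differences between all places.
    dif = np.abs(values - values[:, None])
    wide = pd.DataFrame(dif, columns=df[id_col].to_numpy())
    wide.insert(0, id_col, df[id_col].to_numpy())
    long = wide.melt(id_vars=[id_col], value_name="Difference")
    # Drop zero differences, as the question's code does (removes self-pairs).
    long = long[long["Difference"] != 0].copy()
    rank_col = f"Ranking_{col}"
    long[rank_col] = long.groupby("variable")["Difference"].rank(
        method="dense", ascending=True
    )
    return long[["variable", id_col, rank_col]]


def rank_all(df, id_col="Name"):
    """Compute the ranking for every non-id column and merge on the pair keys."""
    fruit_cols = [c for c in df.columns if c != id_col]
    parts = [rank_one_column(df, c, id_col) for c in fruit_cols]
    keys = ["variable", id_col]
    return reduce(lambda left, right: left.merge(right, on=keys, how="inner"), parts)


data = [["A", 10, 16], ["B", 15, 11], ["C", 14, 8]]
df = pd.DataFrame(data, columns=["Name", "Apple", "Pear"])
result = rank_all(df)
print(result)
```

With `how="inner"`, a pair of places is kept only if it survives the zero-difference filter for every fruit. If two different places can have the same quantity of some fruit, `how="outer"` would keep the pair and leave that fruit's ranking as NaN, which may or may not be what you want.
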
