Create a dataframe from row names, column names, and maximum column values

Question

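I am working through a video bundle I purchased from PACKT to learn pandas. The author uses the Jinja2 style() to highlight the maximum value in each column. I quickly discovered that I cannot use this technique in PyCharm, so I decided to extract the values instead.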

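What I want to do is build a new three-column dataframe from a dataframe with N columns by extracting the row index, the column name, and the maximum value of each column. The new dataframe should show the row (or all matching rows, if there is a tie), the column, and the maximum value in that column.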

I created a toy dataframe just to work out the code.

My code is below, along with its output. At the very bottom is what I actually want the new dataframe to look like.

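I know I am using print statements. So far, this is the only code I have written that correctly picks up multiple rows when there is a tie.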

It grabs the entire row, which I do not want. I am also not sure how to construct the proposed new dataframe from the extracted data.

import pandas as pd


raw_data = {
            'dogs': [42, 39, 86, 15, 23, 57, 68, 81, 86],
            'cats': [52, 41, 79, 80, 34, 47, 19, 22, 59],
            'sheep': [62, 37, 84, 51, 67, 32, 23, 89, 73],
            'lizards': [72, 43, 36, 26, 53, 88, 88, 34, 69],
            'birds': [82, 35, 77, 63, 18, 12, 45, 56, 58],
            }

df = pd.DataFrame(raw_data,
                  index=pd.Index(['row_1', 'row_2', 'row_3', 'row_4', 'row_5', 'row_6', 'row_7', 'row_8', 'row_9'], name='Rows'),
                  columns=pd.Index(['dogs', 'cats', 'sheep', 'lizards', 'birds'], name='animals'))

print(df)
print()

# Get all column names (a pandas Index)
cols = df.columns
print(cols)
print('*****')

for col in cols:
    # Print every row whose value in this column equals the column maximum
    print(df[df[col] == df[col].max()])


'''
animals  dogs  cats  sheep  lizards  birds
Rows                                      
row_3      86    79     84       36     77
row_9      86    59     73       69     58
animals  dogs  cats  sheep  lizards  birds
Rows                                      
row_4      15    80     51       26     63
animals  dogs  cats  sheep  lizards  birds
Rows                                      
row_8      81    22     89       34     56
animals  dogs  cats  sheep  lizards  birds
Rows                                      
row_6      57    47     32       88     12
row_7      68    19     23       88     45
animals  dogs  cats  sheep  lizards  birds
Rows                                      
row_1      42    52     62       72     82
'''

row_3     dogs        86
row_9     dogs        86
row_4     cats        80
row_8     sheep       89
row_6     lizards     88
row_7     lizards     88
row_1     birds       82

Answer 1
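
One possible approach, as a minimal sketch that builds on the loop from the question: instead of printing the whole row, select only the current column for the rows that reach its maximum, turn each selection into a small three-column frame, and concatenate the pieces. The output column labels 'animal' and 'max_value' are names assumed here for illustration; the question does not specify them.

```python
import pandas as pd

# Toy data copied from the question.
raw_data = {
    'dogs': [42, 39, 86, 15, 23, 57, 68, 81, 86],
    'cats': [52, 41, 79, 80, 34, 47, 19, 22, 59],
    'sheep': [62, 37, 84, 51, 67, 32, 23, 89, 73],
    'lizards': [72, 43, 36, 26, 53, 88, 88, 34, 69],
    'birds': [82, 35, 77, 63, 18, 12, 45, 56, 58],
}
rows = [f'row_{i}' for i in range(1, 10)]
df = pd.DataFrame(raw_data, index=pd.Index(rows, name='Rows'))

pieces = []
for col in df.columns:
    # Keep only this column, restricted to the rows that hit its maximum.
    # Ties are preserved because the boolean mask can match several rows.
    top = df.loc[df[col] == df[col].max(), col]
    pieces.append(pd.DataFrame({
        'Rows': list(top.index),      # row labels of the maxima
        'animal': col,                # assumed label for the column-name field
        'max_value': top.to_numpy(),  # assumed label for the maximum itself
    }))

# Stack the per-column pieces into one three-column dataframe.
result = pd.concat(pieces, ignore_index=True)
print(result.to_string(index=False))
```

With the toy data this should yield the seven rows listed at the bottom of the question, in the same order, since the columns are visited left to right and ties keep their original row order.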
