Get columns after using a correlation matrix in Python

Posted 2023-03-12


[Title] Get columns after using a correlation matrix in Python [Posted] 2021-09-15 02:08:25 [Question]:

Here is my code that uses a correlation matrix in Python:

import numpy as np
from sklearn.datasets import load_breast_cancer
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = {
    's_name_supplier' : ['Tiki Trading', 'Tiki Trading', 'Phụ Kiện Store 76'], 
    's_is_best_price_guaranteed' : [1, 1, 0],
    's_quantity_sold' : [22213, 5801, 8],
    's_rating_average' : [4.7, 5, 4.8],
    's_price_of_product' : [308000, 230000, 190000],
    's_review_count_product' : [3041, 1540, 63],
    's_repaid' : ['111', '111', '111'],
    's_avg_rating_point' : [4.61, 4.49, 4.74],
    's_review_count_supplier' : [2846008, 2846008, 75028],
    's_total_follower' : [225730, 225730, 345],
    's_year_participate' : [2017, 2017, 2020],
    's_product_cancell_rate' : ['0', '0', '0'],
    's_product_return_rate' : ['0', '0', '0'],
    's_total_product_seller_detail' : [134302, 134302, 70000],
    's_certificate' : ['None', 'None', 'None'],
    's_complaint' : [2.0, 0.0, 2.8],
    's_ppm_rate' : [0.0, 40.0, 0.0]
}

features = ['s_name_supplier', 's_is_best_price_guaranteed', 
            's_quantity_sold', 's_rating_average', 's_price_of_product', 
            's_review_count_product', 's_repaid', 's_avg_rating_point', 
            's_review_count_supplier', 's_total_follower', 's_year_participate', 
            's_product_cancell_rate', 's_product_return_rate', 's_certificate', 
            's_complaint', 's_ppm_rate'
           ]

df = pd.DataFrame(data, columns = features)
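# Correlate numeric columns only; pandas 2.0+ raises an error if string columns are passed to corr()
correlation_mat = df.select_dtypes(include='number').corr()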

sns.heatmap(correlation_mat, annot = True)

plt.title("Correlation matrix of Breast Cancer data", y=-1)

plt.xlabel("cell nucleus features")

plt.ylabel("cell nucleus features")

plt.show()

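The code then outputs a heatmap of the correlation matrix.

Expected: I want to get, in Python, an array of the column names shown in the chart.

Example: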

['s_is_best_price_guaranteed', 's_quantity_sold', 's_rating_average', 
's_price_of_product', 's_review_count_product', 's_avg_rating_point', 
's_review_count_supplier', 's_total_follower', 's_year_participate', 
's_complaint', 's_ppm_rate']

I tried:

array_column = sorted_pairs[abs(sorted_pairs) > 0.5]
print(array_column)

but it did not work.

Please help me solve this problem. Thank you very much!


[Answer 1]:

correlation_mat[correlation_mat>0.5].index.values
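
Note that boolean indexing on a DataFrame keeps its shape and only replaces non-matching cells with NaN, so the line above returns every row label of the correlation matrix regardless of the threshold. That happens to match the expected list in the question, since those labels are exactly the numeric columns drawn on the heatmap. The following is a minimal sketch, not the answerer's method, of two ways to get the names explicitly: all columns on the heatmap, and only the columns with at least one strong off-diagonal correlation. The sample frame is a reduced stand-in for the question's data, and the names `heatmap_columns`, `strong_columns`, and the `0.5` threshold are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Reduced stand-in for the question's data (a subset of its columns).
df = pd.DataFrame({
    's_name_supplier': ['Tiki Trading', 'Tiki Trading', 'Phụ Kiện Store 76'],
    's_quantity_sold': [22213, 5801, 8],
    's_rating_average': [4.7, 5, 4.8],
    's_repaid': ['111', '111', '111'],
    's_ppm_rate': [0.0, 40.0, 0.0],
})

# Keep numeric columns only, then compute the correlation matrix.
correlation_mat = df.select_dtypes(include='number').corr()

# 1) Every column that appears on the heatmap axes.
heatmap_columns = correlation_mat.columns.tolist()
print(heatmap_columns)
# ['s_quantity_sold', 's_rating_average', 's_ppm_rate']

# 2) Only columns strongly correlated with at least one *other* column.
threshold = 0.5  # assumed cutoff, taken from the question's attempt
abs_corr = correlation_mat.abs()
# Blank out the diagonal (self-correlation) so it cannot satisfy the test.
off_diagonal = abs_corr.mask(np.eye(len(abs_corr), dtype=bool))
strong_columns = off_diagonal.columns[(off_diagonal > threshold).any(axis=0)].tolist()
print(strong_columns)
```

Masking the diagonal matters because each non-constant column correlates with itself at 1.0, which would otherwise pass any threshold below 1.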
