Exporting feature importance to csv from random forest

Posted 2023-03-12

[Title]: Exporting feature importance to csv from random forest [Posted]: 2016-11-06 04:52:59 [Question]:

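Hello, I want to create a .csv with 2 columns: the feature importances of a random forest model and the name of each feature. I also want to be sure that each value is matched to the correct variable name.

Here is an example, but I cannot get it to export to .csv correctly: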

test_features = test[["area","product", etc.]].values

# Create the target 
target = test["churn"].values

pred_forest = my_forest.predict(test_features)

# Print the score of the fitted random forest
print(my_forest.score(test_features, target))


importance = my_forest.feature_importances_


pd.DataFrame({"IMP": importance, "features": test_features}).to_csv('forest_0407.csv', index=False)

[Comments]:

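How does this fail? It looks a bit suspicious to me, because you are trying to match the feature importances against the features DataFrame itself, which is not correct: the feature importances correspond to the columns.

I am confused: when I print "importance" I only see an array, and I cannot tell which feature each value belongs to, since I want to check the name and the value together. The error message is: Exception: Data must be 1-dimensional

For the features, try test.columns.tolist().

@shivsn the lazy typist's version is list(df), which returns the columns as a list.

@EdChum Nice, I didn't know that, thanks.

[Answer 1]: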

Use this:

import pandas
# feature_names: the list of features you are using, in the same order used to train the model
x = list(zip(my_forest.feature_importances_, feature_names))
x = pandas.DataFrame(x, columns=["Importance", "Feature_Name"])
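For completeness, here is a minimal end-to-end sketch of the same idea, including the CSV export the question asks for. The toy data, the extra "tenure" column, and the names feature_names and importance_df are assumptions made up for illustration; only "area", "product", "churn", my_forest, and the output filename come from the question. The key point is that the second column holds the 1-D list of column names, not the 2-D array of feature values, which is what appears to trigger "Data must be 1-dimensional".

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data standing in for the asker's `test` DataFrame.
rng = np.random.default_rng(0)
test = pd.DataFrame({
    "area": rng.normal(size=200),
    "product": rng.normal(size=200),
    "tenure": rng.normal(size=200),  # made-up extra feature
})
test["churn"] = (test["area"] + 0.5 * test["product"] > 0).astype(int)

# Keep the names in the same order as the columns passed to fit().
feature_names = ["area", "product", "tenure"]
my_forest = RandomForestClassifier(n_estimators=50, random_state=0)
my_forest.fit(test[feature_names].values, test["churn"].values)

# Pair each 1-D importance value with its column name, not with the feature data.
importance_df = pd.DataFrame({
    "features": feature_names,
    "IMP": my_forest.feature_importances_,
}).sort_values("IMP", ascending=False)

importance_df.to_csv("forest_0407.csv", index=False)
```

Because feature_importances_ follows the column order of the training data, reusing the same feature_names list for both fitting and the DataFrame keeps names and values aligned. If the model was trained on every column of a DataFrame df, list(df) or df.columns.tolist() from the comments above would give that list.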
