Python: computing feature_importances with a random forest model


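Introduction: this article was compiled by the editors of 小常识网 (cha138.com). It covers computing feature_importances with a random forest model in Python, and we hope it serves as a useful reference.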

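import numpy as np
from sklearn import metrics

# Use RMSE as the scoring criterion (any other metric can be plugged in the same way).
def score(x1, x2):
    # The square root turns scikit-learn's mean squared error into RMSE.
    return np.sqrt(metrics.mean_squared_error(x1, x2))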
# Permutation-based feature importance built on the score above.
def feat_imp(m, x, y, small_good=True):
    """
    m: fitted random forest model
    x: DataFrame of independent variables
    y: output variable
    small_good: True if a smaller prediction score is better
    """
    score_list = {}
    score_list['original'] = score(m.predict(x.values), y)
    imp = {}
    for i in range(len(x.columns)):
        rand_idx = np.random.permutation(len(x))  # shuffle the row order
        new_coli = x.values[rand_idx, i]  # column i in shuffled order
        new_x = x.copy()
        new_x[x.columns[i]] = new_coli
        score_list[x.columns[i]] = score(m.predict(new_x.values), y)
        # comparison with benchmark: original score minus shuffled score
        imp[x.columns[i]] = score_list['original'] - score_list[x.columns[i]]
    # Either branch lists the most important features first.
    if small_good:
        return sorted(imp.items(), key=lambda kv: kv[1])
    else:
        return sorted(imp.items(), key=lambda kv: kv[1], reverse=True)
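
As a cross-check on the hand-written function above, the same permutation idea is available in scikit-learn as sklearn.inspection.permutation_importance. The sketch below is only an illustration, not the original author's method: the synthetic data, the feature names "a", "b", "c" and the model settings are all assumptions made for the example. Note the sign convention differs: scikit-learn reports the drop in score (baseline minus shuffled), so with a "neg_root_mean_squared_error" scorer an important feature gets a large positive value, whereas feat_imp above returns a negative value for it.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

# Hypothetical synthetic data: y depends strongly on "a", weakly on "b", not on "c".
feature_names = ["a", "b", "c"]
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 5.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Model settings are arbitrary choices for the illustration.
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Each column is shuffled n_repeats times; the importance is the mean drop in score.
result = permutation_importance(
    model, X, y,
    scoring="neg_root_mean_squared_error",
    n_repeats=5,
    random_state=0,
)

# Sort so that the feature whose shuffling hurts the score most comes first.
ranking = sorted(
    zip(feature_names, result.importances_mean),
    key=lambda kv: kv[1],
    reverse=True,
)
for name, drop in ranking:
    print(f"{name}: {drop:.3f}")
```

Averaging over several shuffles (n_repeats) makes the ranking less sensitive to one unlucky permutation, which a single-shuffle loop like feat_imp cannot guarantee.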
