Increase performance of Random Forest Regressor in sklearn

Posted 2023-03-12

[Published]: 2020-03-12 16:41:19 [Question]:

I have an optimization problem in which I need to call the predict function of a Random Forest Regressor thousands of times.

from sklearn.ensemble import RandomForestRegressor

# X, Y: training features and targets (defined elsewhere)
rfr = RandomForestRegressor(n_estimators=10)
rfr = rfr.fit(X, Y)
for iteration in range(100000):
    # code that adapts the input data according to fitness of the last output
    output_data = rfr.predict(input_data)
    # code that evaluates the fitness of output data

Is there a way to speed up the predict function in this situation? Possibly by using Cython?

[Answer 1]:

You can use SKompiler (https://github.com/konstantint/SKompiler) to convert the model to C or C++ code and run the predictions there.

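from skompiler import skompile

# Translate the fitted model's predict function into an expression
expr = skompile(rfr.predict)

# Render the expression as C++ source and write it to a file
with open("output.cpp", "w") as text_file:
    print(expr.to('sympy/cxx'), file=text_file)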
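
To give a rough idea of why this kind of translation is possible: a fitted regression forest boils down to threshold comparisons whose leaf values are averaged across trees, so it can be written out as nested conditional expressions. The sketch below does not use SKompiler and is not how SKompiler is implemented. The tuple-based tree format and the helpers `tree_to_c`, `forest_to_c`, and `predict_py` are hypothetical, made up only for illustration. It follows the sklearn convention that a sample goes to the left child when `x[feature] <= threshold`.

```python
# Hypothetical toy representation (an assumption, not sklearn's own format):
# a tree is either a leaf value (float) or a tuple
# (feature_index, threshold, left_subtree, right_subtree).


def tree_to_c(node):
    """Render one tree as a nested C ternary expression."""
    if not isinstance(node, tuple):
        return repr(float(node))
    feature, threshold, left, right = node
    return (
        f"(x[{feature}] <= {float(threshold)!r} ? "
        f"{tree_to_c(left)} : {tree_to_c(right)})"
    )


def forest_to_c(trees, name="predict"):
    """Render a C function that averages the per-tree expressions."""
    body = " + ".join(tree_to_c(t) for t in trees)
    n_trees = float(len(trees))
    return f"double {name}(const double *x) {{ return ({body}) / {n_trees!r}; }}"


def predict_py(trees, x):
    """Reference evaluation of the same toy forest in plain Python."""
    def walk(node):
        # Descend until a leaf (a non-tuple value) is reached.
        while isinstance(node, tuple):
            feature, threshold, left, right = node
            node = left if x[feature] <= threshold else right
        return float(node)

    return sum(walk(t) for t in trees) / len(trees)


# Two small hand-written trees standing in for a fitted forest.
toy_forest = [
    (0, 0.5, 1.0, (1, 2.0, 3.0, 5.0)),
    (1, 1.5, 2.0, 4.0),
]

c_source = forest_to_c(toy_forest)
print(c_source)
```

The generated function contains no Python objects or per-call input validation, which is where compiled code can save time when predict is called once per iteration in a tight loop. Whether it actually helps for a given forest size should be measured rather than assumed.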
