How to save a random forest in scikit-learn?

Posted 2023-03-12


【Title】: How to save a randomforest in scikit-learn? 【Posted】: 2015-02-20 03:45:16 【Question】:

There are already many questions about persistence, and I have tried both pickle and joblib.dump many times. But when I use either one to save my random forest, I get this:

ValueError: ("Buffer dtype mismatch, expected 'SIZE_t' but got 'long'", <type 'sklearn.tree._tree.ClassificationCriterion'>, (1, array([10])))

Can anyone tell me why?

Some code for review:

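# Python 2 code (cPickle); data, target and n_samples are defined elsewhere
import cPickle
from sklearn.ensemble import RandomForestClassifier

forest = RandomForestClassifier()
forest.fit(data[:n_samples], target[:n_samples])
# save the fitted forest
with open('rf.pkl', 'wb') as f:
    cPickle.dump(forest, f)
# load it back
with open('rf.pkl', 'rb') as f:
    forest = cPickle.load(f)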

or

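# Note: sklearn.externals.joblib was removed in scikit-learn 0.23;
# newer versions use `import joblib` instead.
from sklearn.externals import joblib
joblib.dump(forest, 'rf.pkl')

from sklearn.externals import joblib
forest = joblib.load('rf.pkl')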

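【Comments】:

Please post some sample code.

Do both solutions give the same error? Are you using the same 32/64-bit Python to save and load? ***.com/questions/21033038/…

Oh, I forgot that I was not using the same bitness. Thanks!

【Answer 1】: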

This is caused by saving and loading with different 32/64-bit builds of Python, as suggested in "Scikits-Learn RandomForrest trained on 64bit python wont open on 32bit python".
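One way to catch this kind of mismatch early is to store the interpreter's bitness next to the model and check it before unpickling. The following is only a sketch in Python 3, using the standard library alone; the helper names `python_bitness`, `save_with_platform_info` and `load_checked` are made up for illustration and are not part of scikit-learn or joblib. A plain dict stands in for the fitted forest so the snippet runs without scikit-learn installed.

```python
import os
import pickle
import struct
import sys
import tempfile


def python_bitness():
    """Return the pointer size of the running interpreter in bits (32 or 64)."""
    return struct.calcsize("P") * 8


def save_with_platform_info(model, path):
    """Hypothetical helper: pickle the model together with build metadata."""
    payload = {
        "bitness": python_bitness(),
        "python": sys.version_info[:3],
        # The model is stored as nested bytes, so the outer dict holds only
        # plain types and the model itself is not unpickled until checked.
        "model": pickle.dumps(model),
    }
    with open(path, "wb") as f:
        pickle.dump(payload, f)


def load_checked(path):
    """Hypothetical helper: refuse to unpickle a model saved on another bitness."""
    with open(path, "rb") as f:
        payload = pickle.load(f)
    if payload["bitness"] != python_bitness():
        raise RuntimeError(
            "model was saved with %d-bit Python, but this is %d-bit Python"
            % (payload["bitness"], python_bitness())
        )
    return pickle.loads(payload["model"])


# Usage: any picklable object works; pass the fitted forest in real code.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "rf.pkl")
    save_with_platform_info({"n_estimators": 10}, model_path)
    restored = load_checked(model_path)
```

The check does not make the file portable; it only replaces the obscure dtype error with a readable one. The usual fix remains retraining or re-saving the model on the same kind of build that will load it.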

【Comments】:

【Answer 2】:

Try importing the joblib package directly:

import joblib

# ...

# save
joblib.dump(rf, "some_path")

# load 
rf2 = joblib.load("some_path")

I have put together a complete working example, with code and comments, here.
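For reference, a minimal round trip along these lines might look like the sketch below. It is not the example linked by the answer's author; it assumes Python 3 with scikit-learn and the standalone joblib package installed, and it uses synthetic data from `make_classification` in place of the asker's `data` and `target`. The file name `rf.joblib` is arbitrary.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data stands in for the asker's `data` and `target` (an assumption).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

rf = RandomForestClassifier(n_estimators=10, random_state=0)
rf.fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "rf.joblib")
    joblib.dump(rf, path)    # save the fitted forest
    rf2 = joblib.load(path)  # load it back in the same environment

# The reloaded forest should predict exactly like the original one.
same_predictions = bool((rf.predict(X) == rf2.predict(X)).all())
print(same_predictions)
```

Saving and loading here happen in the same interpreter, so the 32/64-bit problem from the question cannot occur; across machines, the same Python build and library versions should be used on both sides.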

【Comments】:
