Saving Random Forest

【Title】: Saving Random Forest 【Posted】: 2015-02-15 23:06:37 【Question】:

I want to save and then load a fitted random forest classifier, but I get an error.

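from sklearn.ensemble import RandomForestClassifier
from sklearn.externals import joblib  # import path seen in the traceback; current installs use `import joblib`

forest = RandomForestClassifier(n_estimators=100, max_features=mf_val)
forest = forest.fit(L1[0:100], L2[0:100])
joblib.dump(forest, 'screening_forest/screening_forest.pkl')
forest2 = joblib.load('screening_forest/screening_forest.pkl')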

The error is:

  File "C:\Users\mkolarek\Documents\other\TrackerResultAnalysis\ScreeningClassifier\ScreeningClassifier.py", line 67, in <module>
    forest2 = joblib.load('screening_forest/screening_forest.pkl')
  File "C:\Python27\lib\site-packages\sklearn\externals\joblib\numpy_pickle.py", line 425, in load
    obj = unpickler.load()
  File "C:\Python27\lib\pickle.py", line 858, in load
    dispatch[key](self)
  File "C:\Python27\lib\site-packages\sklearn\externals\joblib\numpy_pickle.py", line 285, in load_build
    Unpickler.load_build(self)
  File "C:\Python27\lib\pickle.py", line 1217, in load_build
    setstate(state)
  File "_tree.pyx", line 2280, in sklearn.tree._tree.Tree.__setstate__ (sklearn\tree\_tree.c:18350)
ValueError: Did not recognise loaded array layout
Press any key to continue . . .

Do I have to initialize forest2 first?

【Comments】:

Save the random forest! :)

【Answer 1】:

I solved this with cPickle:

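import cPickle  # Python 2 module; on Python 3 use `import pickle` instead

# Save the fitted forest
with open('screening_forest/screening_forest.pickle', 'wb') as f:
    cPickle.dump(forest, f)

# Load it back
with open('screening_forest/screening_forest.pickle', 'rb') as f:
    forest2 = cPickle.load(f)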

A working joblib solution would still be useful, though.
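
For reference, here is a minimal sketch of the joblib round trip when saving and loading happen in the same environment. It is not the asker's setup: it assumes the standalone `joblib` package (rather than `sklearn.externals.joblib` from the traceback), uses synthetic data from `make_classification` in place of the unseen `L1`/`L2` arrays, and writes to a temporary directory instead of `screening_forest/`.

```python
import os
import tempfile

import joblib  # standalone package; older scikit-learn exposed it as sklearn.externals.joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (assumption): the question's L1/L2 arrays are not shown.
X, y = make_classification(n_samples=100, n_features=8, random_state=0)

forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "screening_forest.pkl")
    joblib.dump(forest, model_path)
    # load() returns a fully built estimator, so forest2 needs no prior initialization.
    forest2 = joblib.load(model_path)

# Compare the original and the reloaded model on the training data.
same_predictions = bool((forest.predict(X) == forest2.predict(X)).all())
print(same_predictions)
```

If this works locally but the same file fails elsewhere, the difference is likely in the environment that reads the file rather than in the dump/load calls themselves.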

【Comments】:

Tried it; in my case it does not work between machines :( I get exactly the same error with both pickle and cPickle.

【Answer 2】:

You can try this approach:

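import pickle

from sklearn.ensemble import RandomForestClassifier

# `data` and `labels` are your training features and targets
model = RandomForestClassifier()
model.fit(data, labels)

model_file = 'model.pkl'

# Save the fitted model
with open(model_file, 'wb') as f:
    pickle.dump(model, f)

# Reload the model from the saved file
with open(model_file, 'rb') as f:
    loaded_model = pickle.load(f)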
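
Regarding the comment above about pickles failing between machines: the thread never confirms the cause, but a pickle written under one library version or architecture (for example 64-bit) is not guaranteed to load under another (for example 32-bit). A rough sketch of one way to catch that early is to store a small metadata file next to the pickle and compare it at load time. The names `environment_info`, `save_model`, `load_model`, and the `versions` argument are hypothetical helpers invented for this illustration, not part of pickle, joblib, or scikit-learn.

```python
import json
import pickle
import platform
import struct


def environment_info(versions=None):
    """Describe the environment that writes or reads a pickle.

    `versions` is a hypothetical caller-supplied dict,
    e.g. {"sklearn": sklearn.__version__}.
    """
    info = {
        "python": ".".join(platform.python_version_tuple()[:2]),
        "pointer_bits": struct.calcsize("P") * 8,  # 32 or 64
    }
    info.update(versions or {})
    return info


def save_model(model, path, versions=None):
    """Pickle `model` and write the environment description beside it."""
    with open(path, "wb") as f:
        pickle.dump(model, f)
    with open(path + ".meta.json", "w") as f:
        json.dump(environment_info(versions), f)


def load_model(path, versions=None):
    """Unpickle only if the saved environment matches the current one."""
    with open(path + ".meta.json") as f:
        saved = json.load(f)
    current = environment_info(versions)
    mismatches = {
        key: (saved.get(key), value)
        for key, value in current.items()
        if saved.get(key) != value
    }
    if mismatches:
        raise RuntimeError(
            "Environment differs from the one that saved the model "
            "(saved, current): %r" % (mismatches,)
        )
    with open(path, "rb") as f:
        return pickle.load(f)
```

Usage would look like `save_model(model, 'model.pkl', versions={"sklearn": sklearn.__version__})` on the training machine and the matching `load_model(...)` call on the other one. A mismatch then produces a readable message instead of a low-level unpickling error; the practical fix in that case is usually to retrain or re-export the model in the target environment.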

