unpickling model file python scikit-learn (Pipeline(memory=None, steps=None, verbose=None))
Posted: 2020-05-20 08:09:11
Question: I am trying to convert a pickle file from Python 2 to Python 3 using the following code:
import os
import dill
import pickle
import argparse

def convert(old_pkl):
    """
    Convert a Python 2 pickle to Python 3
    """
    # Make a name for the new pickle
    new_pkl = os.path.splitext(os.path.basename(old_pkl))[0]+"_p3.pkl"
    # Convert Python 2 "ObjectType" to Python 3 object
    dill._dill._reverse_typemap["ObjectType"] = object
    # Open the pickle using latin1 encoding
    with open(old_pkl, "rb") as f:
        loaded = pickle.load(f, encoding="bytes")
    # Re-save as Python 3 pickle
    with open(new_pkl, "wb") as outfile:
        pickle.dump(loaded, outfile)
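The pickling itself works fine. The problem is that when I print the model loaded from the Python 3 pickle, instead of output like the following: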
model = Pipeline([('count', CountVectorizer())])
print(model)
Pipeline(memory=None,
steps=[('count_vectorizer', CountVectorizer(analyzer='word', binary=False, decode_error='strict',
dtype=<class 'numpy.int64'>, encoding='utf-8', input='content',
lowercase=True, max_df=1.0, max_features=None, min_df=1,
ngram_range=(1, 1), preprocessor=None, stop_words=None)])
I get this:
Pipeline(memory=None, steps=None, verbose=None)
Answer 1: Found the solution. When unpickling the file, I was passing bytes as the encoding instead of latin1.
# Open the pickle using latin1 encoding
with open(old_pkl, "rb") as f:
    loaded = pickle.load(f, encoding="latin1")
With that change everything works. For a better explanation, see this
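To illustrate why the encoding matters, here is a minimal sketch that does not use the original model file. It assumes a hand-written protocol-2 byte stream as a stand-in for a Python 2 pickle, and uses types.SimpleNamespace as a hypothetical substitute for the Pipeline object. With encoding="bytes", the attribute names stored in the object's state stay as bytes keys, so an attribute such as steps cannot be found by name; this is presumably why the Pipeline printed steps=None. With encoding="latin1", the names are decoded to str and the attributes are restored.

```python
import pickle

# Hand-written protocol-2 stream (an illustrative stand-in, NOT a real model file).
# It builds a types.SimpleNamespace whose state dict is {"steps": "abc"}, with both
# strings stored via SHORT_BINSTRING (b"U"), the opcode Python 2 uses for str.
PY2_STYLE_PICKLE = (
    b"\x80\x02"                   # PROTO 2
    b"c"                          # GLOBAL opcode, followed by module and class name
    b"types\nSimpleNamespace\n"   # class to instantiate (stand-in for Pipeline)
    b")\x81"                      # EMPTY_TUPLE, NEWOBJ: cls.__new__(cls)
    b"}U\x05stepsU\x03abcs"       # state dict {"steps": "abc"} as Python 2 str
    b"b."                         # BUILD (apply state to instance), STOP
)

as_bytes = pickle.loads(PY2_STYLE_PICKLE, encoding="bytes")
as_latin1 = pickle.loads(PY2_STYLE_PICKLE, encoding="latin1")

# encoding="bytes": the attribute name stays b"steps", so obj.steps does not exist.
print(vars(as_bytes))
print(hasattr(as_bytes, "steps"))

# encoding="latin1": the name is decoded to str and the attribute is restored.
print(vars(as_latin1))
print(hasattr(as_latin1, "steps"))
```

A quick check after converting a real file is to compare vars(loaded) against the attribute names you expect; bytes keys there suggest the wrong encoding was used.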