pickle.PicklingError: Cannot pickle files that are not opened for reading
Posted: 2017-05-15 10:34:38

Question: I get this error when running a PySpark job on Dataproc. What could be causing it?

Here is the stack trace of the error:
File "/usr/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/cloudpickle.py",
line 553, in save_reduce
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
File "/usr/lib/python2.7/pickle.py", line 681, in _batch_setitems
save(v)
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/cloudpickle.py",
line 582, in save_file
pickle.PicklingError: Cannot pickle files that are not opened for reading
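The message at the bottom of the trace comes from PySpark's bundled cloudpickle, which refuses to serialize file objects. The same limitation can be reproduced with the standard library alone; note that Python 3's stdlib pickle raises a TypeError rather than the PicklingError shown above (this is a minimal sketch, not Dataproc-specific):

```python
import pickle
import tempfile

# File handles wrap OS-level state (descriptors, buffers, positions),
# so pickle cannot meaningfully serialize them for another process.
with tempfile.TemporaryFile() as fh:
    try:
        pickle.dumps(fh)
        failed = False
    except TypeError as exc:
        # Python 3 stdlib raises TypeError; cloudpickle raises
        # pickle.PicklingError with the message seen in the trace.
        failed = True
        print("cannot pickle:", exc)

print("pickling failed:", failed)
```

In a Spark job this typically happens indirectly: a function passed to `map` closes over something unpicklable, and Spark fails while shipping that closure to the workers.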
Comments:

I ran into the same problem in pyspark. @ramanand, did you find a solution? — Yes, I was reading a dictionary inside the map function but had not broadcast it, so the worker nodes could not find the dictionary and raised the pickle exception.

Answer 1:

The problem was that I used a dictionary inside the map function. The reason for the failure: the worker nodes could not access the dictionary I passed into the map function.
Solution:

I broadcast the dictionary and then used it inside the map function:
sc = SparkContext()
lookup_bc = sc.broadcast(lookup_dict)
Then, inside the function, I fetched the value with:
data = lookup_bc.value.get(key)
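Putting the answer's snippets together, the fix can be sketched as a small helper (a sketch only; `enrich_with_lookup`, `lookup_dict`, and the key-valued RDD are hypothetical names, and the broadcast call assumes PySpark's `SparkContext.broadcast` API):

```python
# Assumes a PySpark environment: `sc` is a SparkContext, `rdd` holds keys.
def enrich_with_lookup(sc, rdd, lookup_dict):
    """Ship lookup_dict to the workers as a broadcast variable.

    Referencing a driver-side object directly inside the lambda forces
    Spark to pickle the whole closure; if that closure drags in anything
    unpicklable (an open file, the SparkContext, ...), the job fails
    with a PicklingError like the one in the question.
    """
    lookup_bc = sc.broadcast(lookup_dict)  # one read-only copy per executor
    # Workers read lookup_bc.value, never the driver-side dict itself.
    return rdd.map(lambda key: (key, lookup_bc.value.get(key)))
```

The design point is that broadcast variables are serialized once by the driver and cached on each executor, instead of being re-pickled into every task's closure.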