TypeError: 'MapResult' object is not iterable using pathos.multiprocessing

Posted 2023-02-16


【Title】: TypeError: 'MapResult' object is not iterable using pathos.multiprocessing 【Posted】: 2019-06-04 14:15:42 【Question】:

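I am running a spell-correction function over a dataset I have. I use from pathos.multiprocessing import ProcessingPool as Pool to do the job. Once processing is done, I want to actually access the results. Here is my code: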

import codecs
import nltk

from textblob import TextBlob
from nltk.tokenize import sent_tokenize
from pathos.multiprocessing import ProcessingPool as Pool

class SpellCorrect():

    def load_data(self, path_1):
        with codecs.open(path_1, "r", "utf-8") as file:
            data = file.read()
        return sent_tokenize(data)

    def correct_spelling(self, data):
        data = TextBlob(data)
        return str(data.correct())

    def run_clean(self, path_1):
        pool = Pool()
        data = self.load_data(path_1)
        return pool.amap(self.correct_spelling, data)

if __name__ == "__main__":
    path_1 = "../Data/training_data/training_corpus.txt"
    SpellCorrect = SpellCorrect()
    result = SpellCorrect.run_clean(path_1)
    print(result)
    result = " ".join(temp for temp in result)
    with codecs.open("../Data/training_data/training_data_spell_corrected.txt", "a", "utf-8") as file:
        file.write(result)

If you look at the main block, when I do print(result) I get an object of type <multiprocess.pool.MapResult object at 0x1a25519f28>.

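I tried to access the results with result = " ".join(temp for temp in result), but then I get the following error: TypeError: 'MapResult' object is not iterable. I also tried casting it to a list with list(result), but I get the same error. What can I do to fix this?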

【Comments】:

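It looks like you need to do result = SpellCorrect.run_clean(path_1).get() (note the .get()). I'm guessing the a stands for "async", so you may need to make sure the result is ready first. See the docs.

Thanks for the quick reply @Carcigenicate. I meant to use map rather than amap (my bad). Anyway, I used .get() as suggested and now get the following error: _pickle.PicklingError: Can't pickle <class '__main__.SpellCorrect'>: it's not the same object as __main__.SpellCorrect

Try changing SpellCorrect = SpellCorrect() to spellcorrect = SpellCorrect(). In other words, disambiguate the class (SpellCorrect) from the instance (spellcorrect). You would then need to change result = SpellCorrect.run_clean(path_1) to result = spellcorrect.run_clean(path_1), since it is the instance, not the class, that calls the method.

I'm the pathos author. As @Carcigenicate said, use map (or imap), not amap. Only use amap if you want a non-blocking, non-iterable result. Also, that is good advice from @unutbu, but I have one correction: pathos can store a reference to the class, because it serializes with dill rather than pickle, so it can store the actual class object.

@MikeMcKerns: Thanks for the correction.

【Answer 1】: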

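A multiprocess.pool.MapResult object is not iterable because it inherits from AsyncResult, which only has the following methods:

wait([timeout]) Wait until the result is available or until timeout seconds pass. This method always returns None.

ready() Return whether the call has completed.

successful() Return whether the call completed without raising an exception. Will raise AssertionError if the result is not ready (ValueError since Python 3.7).

get([timeout]) Return the result when it arrives. If timeout is not None and the result does not arrive within timeout seconds, then TimeoutError is raised. If the remote call raised an exception, then that exception will be reraised by get().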
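
The four methods above can be exercised in a short sketch. It assumes the standard library's multiprocessing.pool.ThreadPool as a stand-in for the pathos pool, since its map_async returns the analogous MapResult object; the square helper is hypothetical and only there for illustration.

```python
from multiprocessing.pool import ThreadPool  # stdlib stand-in (assumption), not pathos


def square(x):
    # Hypothetical worker function used only for this illustration.
    return x * x


with ThreadPool(4) as pool:
    # map_async returns a MapResult handle immediately, not the values.
    async_result = pool.map_async(square, range(5))

    # The handle itself cannot be iterated, which is the error in the question.
    try:
        iter(async_result)
        iterable = True
    except TypeError:
        iterable = False

    async_result.wait(timeout=5)          # block until done or 5 s pass; returns None
    finished = async_result.ready()       # True once every item has been processed
    ok = async_result.successful()        # True if no worker raised an exception
    values = async_result.get(timeout=5)  # the actual list of results

print(iterable, finished, ok, values)
```

The list returned by get() is what " ".join(...) or list(...) should be applied to, rather than the handle.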

You can see an example of how to use the get() function here: https://docs.python.org/2/library/multiprocessing.html#using-a-pool-of-workers

from multiprocessing import Pool, TimeoutError
import time
import os

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)              # start 4 worker processes

    # print "[0, 1, 4,..., 81]"
    print(pool.map(f, range(10)))

    # print same numbers in arbitrary order
    for i in pool.imap_unordered(f, range(10)):
        print(i)

    # evaluate "f(20)" asynchronously
    res = pool.apply_async(f, (20,))      # runs in *only* one process
    print(res.get(timeout=1))             # prints "400"

    # evaluate "os.getpid()" asynchronously
    res = pool.apply_async(os.getpid, ()) # runs in *only* one process
    print(res.get(timeout=1))             # prints the PID of that process

    # launching multiple evaluations asynchronously *may* use more processes
    multiple_results = [pool.apply_async(os.getpid, ()) for i in range(4)]
    print([res.get(timeout=1) for res in multiple_results])

    # make a single worker sleep for 10 secs
    res = pool.apply_async(time.sleep, (10,))
    try:
        print(res.get(timeout=1))
    except TimeoutError:
        print("We lacked patience and got a multiprocessing.TimeoutError")
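
Applied to the question's scenario, the two ways of getting a joinable list might look like the sketch below. It is not the asker's code: it assumes the standard library's ThreadPool in place of the pathos pool, and correct_sentence is a hypothetical placeholder for the TextBlob-based correction.

```python
from multiprocessing.pool import ThreadPool  # stdlib stand-in (assumption), not pathos


def correct_sentence(sentence):
    # Hypothetical placeholder for TextBlob(sentence).correct().
    return sentence.strip().capitalize()


sentences = ["first sentence.", " second sentence. "]

with ThreadPool(2) as pool:
    # Option 1: the blocking map returns a plain list that can be joined directly.
    joined_blocking = " ".join(pool.map(correct_sentence, sentences))

    # Option 2: the asynchronous map returns a handle; get() yields the list.
    handle = pool.map_async(correct_sentence, sentences)
    joined_async = " ".join(handle.get())

print(joined_blocking)
```

With pathos, the corresponding calls named in the thread are map for the blocking form and amap(...).get() for the asynchronous one.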

