Using the Manager (for Pool) in the function for multiprocessing (Windows 10)

【Title】: Using the Manager (for Pool) in the function for multiprocessing (Windows 10) 【Posted】: 2021-01-17 15:01:17 【Question】:

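I am learning Pool, Manager, and so on from multiprocessing. I want to use the Namespace from Manager inside my function. I took some code from the Internet that highlights a problem with the multiprocessing Manager on Windows. Here it is: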

"""How to share data in multiprocessing with Manager.Namespace()"""
from multiprocessing import Pool, Manager

import numpy as np


# Create manager object in module-level namespace
mgr = Manager()
# Then create a container of things that you want to share to
# processes as Manager.Namespace() object.
config = mgr.Namespace()
# The Namespace object can hold various data types
config.a = 1
config.b = '2'
config.c = [1, 2, 3, 4]


def func(i):
    """This is a function that we want our processes to call."""
    # You can modify the Namespace object from anywhere.
    config.z = i
    print('config is', config)
    # And they will still be shared (i.e. same id).
    print('id(config) = {:d}'.format(id(config)))


# Entry point for the pool work
def main():
    """Run func in a multiprocessing.Pool."""
    # You can add to the Namespace object too.
    config.d = 10
    config.a = 5.25e6
    pool = Pool(1)
    pool.map(func, (range(20, 25)))
    pool.close()
    pool.join()


if __name__ == "__main__":
    # Let's print the config
    print(config)
    # Now executing main()
    main()
    # Again, you can add or modify the Namespace object from anywhere.
    config.e = np.round(np.random.rand(2,2), 2)
    config.f = range(-3, 3)
    print(config)

The error is as follows:

An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.

This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:

    if __name__ == '__main__':
        freeze_support()
        ...

The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.

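I think the problem is that the manager is a global variable, which does not work on Windows. As you can see, I am guarding main, but that is not enough. What needs to be done is to pass the manager to the function somehow (probably through the arguments given to map), but I don't know how to do that.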
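For context: Windows starts child processes with the "spawn" method, which re-imports the main module in every child. A module-level `Manager()` call therefore runs again inside the child and tries to start yet another process during import, which is what the error message describes. The following is only a minimal sketch of the guarded structure; `read_a` and `demo` are hypothetical names chosen for illustration, and `get_context("spawn")` is used here as an assumption so the same behaviour can be tried on platforms other than Windows.

```python
import multiprocessing as mp


def read_a(config):
    """Runs in the worker; config is a Namespace proxy (hypothetical helper)."""
    return config.a


def demo():
    """Create the manager and the pool only when called, never at import time."""
    # Assumption: force "spawn" so non-Windows platforms behave like Windows.
    ctx = mp.get_context("spawn")
    with ctx.Manager() as mgr:
        config = mgr.Namespace()
        config.a = 1
        with ctx.Pool(1) as pool:
            # The proxy is passed explicitly as the worker's single argument.
            return pool.map(read_a, [config])


if __name__ == "__main__":
    # Nothing above starts a process on import, so the re-import done by
    # "spawn" in each child is harmless.
    print(demo())
```

The point of the sketch is that everything that starts a process lives inside a function called from the guard, so importing the module has no side effects.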

【Answer 1】:

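Yes, it looks like creating the Manager as a global causes problems on Windows. Move it under the `if __name__ == "__main__":` guard and pass the Namespace to the worker as an argument. Pool.map() passes only one argument to the worker, so pack the arguments (including the Namespace) into a list and give Pool.map() a list of those argument lists.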

I may be wrong, but I don't think you should expect or require the object ID to stay the same.

from multiprocessing import Pool, Manager

import numpy as np


def func(a):
    """This is a function that we want our processes to call."""
    (config, i) = a
    # You can modify the Namespace object from anywhere.
    config.z = i
    print('config is', config)
    # The data is shared through the proxy; id() may differ between processes.
    print('id(config) = {:d}'.format(id(config)))


# Entry point for the pool work
def main(config):
    """Run func in a multiprocessing.Pool, passing config to each call."""
    # You can add to the Namespace object too.
    config.d = 10
    config.a = 5.25e6
    pool = Pool(1)
    pool.map(func, [[config, i] for i in range(20, 25)])
    pool.close()
    pool.join()


if __name__ == "__main__":
    # Create the manager inside the main guard, not at module level
    mgr = Manager()
    # Then create a container of things that you want to share to
    # processes as Manager.Namespace() object.
    config = mgr.Namespace()
    # The Namespace object can hold various data types
    config.a = 1
    config.b = '2'
    config.c = [1, 2, 3, 4]

    # Let's print the config
    print(config)
    # Now executing main()
    main(config)
    # Again, you can add or modify the Namespace object from anywhere.
    config.e = np.round(np.random.rand(2,2), 2)
    config.f = range(-3, 3)
    print(config)
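
As a possible variation on packing the arguments into a list, `Pool.starmap()` unpacks each argument tuple for the worker, so `func` can take `config` and `i` as separate parameters. This is only a sketch under the same assumptions as the code above (a Namespace created under the main guard); having `func` return a value is added here for illustration and is not part of the original code.

```python
from multiprocessing import Manager, Pool


def func(config, i):
    """Worker: record i on the shared namespace and return a derived value."""
    config.z = i
    return config.a + i


def main():
    """Create the manager and pool inside a function, then fan out the work."""
    with Manager() as mgr:
        config = mgr.Namespace()
        config.a = 1
        with Pool(1) as pool:
            # starmap unpacks each (config, i) tuple into func(config, i).
            results = pool.starmap(func, [(config, i) for i in range(20, 25)])
        # Read the shared attribute back before the manager shuts down.
        return results, config.z


if __name__ == "__main__":
    print(main())
```

`functools.partial(func, config)` combined with a plain `pool.map()` would be another way to bind the namespace; which reads better is largely a matter of taste.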

【Discussion】:

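Hi Darren. Thanks for the answer. I tried to get help from my friends, but they couldn't be bothered. I'll let them know I had to get help from strangers. Stay safe.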
