Does Pytorch-Lightning have a multiprocessing (or Joblib) module?

Posted 2023-03-27

[Asked]: 2020-11-16 09:19:09 [Question]:

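I have been searching on Google but cannot find out whether Pytorch-Lightning has a multiprocessing module, the way Pytorch has torch.multiprocessing.

Does anyone know whether Pytorch-Lightning has this (or a Joblib-like) module? I am looking for a Pytorch-Lightning module that lets me run parallel processing across multiple GPUs.

Many thanks.

Edit: To be more specific, I am looking for a multiprocessing module in Pytorch-Lightning that lets me parallelize non-neural-network computation across multiple GPUs, for example: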

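import numpy as np
import torch
import torch.multiprocessing as mp

X = np.array([[1, 3, 2, 3], [2, 3, 5, 6], [1, 2, 3, 4]])
X = torch.DoubleTensor(X)

def X_power_func(j):
    # Pick a GPU by task index (assumes at least one visible GPU)
    device = torch.device(f"cuda:{j % torch.cuda.device_count()}")
    # Return the result on the CPU so it can be sent back to the parent
    return (X.to(device) ** j).cpu()

if __name__ == '__main__':
    # CUDA needs the "spawn" start method in worker processes
    with mp.get_context("spawn").Pool(processes=2) as p:   # 2 worker processes
        results = p.map(X_power_func, range(4))
    print(results)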

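[Comments on the question]:

I was wondering the same thing. Basically, I need to run a non-training process.

[Answer 1]:

Yes. Basically, all you have to do is pass the appropriate argument gpus=N to the Trainer and specify the backend: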

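from pytorch_lightning import Trainer

# Arguments as written in the 2020 answer; newer Lightning releases use
# accelerator="gpu", devices=8, strategy="ddp" instead.

# train on 8 GPUs (same machine, i.e. one node)
trainer = Trainer(gpus=8, distributed_backend='ddp')

# train on 32 GPUs (4 nodes, 8 GPUs each)
trainer = Trainer(gpus=8, distributed_backend='ddp', num_nodes=4)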

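You can read more in the multi-GPU training documentation.

Edit:

What you are actually looking for is the distributed module rather than multiprocessing. torch.nn.parallel.DistributedDataParallel is generally the recommended way to run in parallel on multiple GPUs.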
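As a rough illustration of how the distributed route could cover work that is not a neural network, here is a minimal sketch using plain torch.distributed with one process per GPU. It is not a Pytorch-Lightning API. The helper names (find_free_port, powers_on_rank, launch), the localhost rendezvous, and the fallback to the gloo backend on CPU when too few GPUs are visible are all assumptions made for this example.

```python
import os
import socket

import torch
import torch.distributed as dist
import torch.multiprocessing as mp

# Same matrix as in the question.
X = torch.tensor([[1, 3, 2, 3], [2, 3, 5, 6], [1, 2, 3, 4]], dtype=torch.float64)


def find_free_port():
    # Hypothetical helper: ask the OS for an unused TCP port for the rendezvous.
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def powers_on_rank(rank, world_size, port):
    """Runs in each process: compute X ** rank, then gather all ranks' results."""
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = str(port)
    # Assumption: one GPU per process if enough are visible, otherwise CPU.
    use_cuda = torch.cuda.device_count() >= world_size
    backend = "nccl" if use_cuda else "gloo"
    dist.init_process_group(backend, rank=rank, world_size=world_size)
    try:
        device = torch.device(f"cuda:{rank}" if use_cuda else "cpu")
        if use_cuda:
            torch.cuda.set_device(device)
        local = X.to(device) ** rank  # the non-neural-network computation
        gathered = [torch.empty_like(local) for _ in range(world_size)]
        dist.all_gather(gathered, local)  # every rank receives every result
        return [t.cpu() for t in gathered]
    finally:
        dist.destroy_process_group()


def _worker(rank, world_size, port):
    # mp.spawn passes the process index as the first argument.
    results = powers_on_rank(rank, world_size, port)
    if rank == 0:
        print(results)


def launch(world_size=2):
    # Start one process per rank and wait for all of them to finish.
    mp.spawn(_worker, args=(world_size, find_free_port()), nprocs=world_size)
```

To run it, call launch(2) from a script under an if __name__ == "__main__": guard, because mp.spawn re-imports the script in every worker process. Each rank computes one power here; with more tasks than GPUs, each rank could loop over its own slice of the task list before the gather.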

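[Discussion]:

Thanks @Szymon Maszke. The reason I am looking for a multiprocessing module in Pytorch-Lightning is so that I can parallelize non-neural-network work across multiple GPUs. If my understanding is correct, `Trainer()` lets you parallelize a neural network model across GPUs, but not non-neural-network work. I have updated my post with an example. Sorry, I should have been clearer. Do you know whether Pytorch-Lightning has a module that would let me parallelize non-neural-network work across multiple GPUs?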
