TorchScript requires source access in order to carry out compilation for collections.deque

Posted 2021-06-12 04:28:49

Question:

I'm trying to convert the PyTorch FOMM model to TorchScript. As soon as I started annotating some classes with @torch.jit.script, I hit this error:

OSError: Can't get source for <class 'collections.deque'>. TorchScript requires source access in order to carry out compilation, make sure original .py files are available.

As far as I understand, the class is implemented in CPython and therefore its source cannot be read by the TorchScript compiler. I could not find any pure-Python implementation of it. How can I work around this?

Here is the class I'm trying to annotate:

import queue
import collections
import threading
import torch

@torch.jit.script
class SyncMaster(object):
    """An abstract `SyncMaster` object.

    - During replication, as data parallel triggers a callback on each module, every slave device should
    call `register(id)` and obtain a `SlavePipe` to communicate with the master.
    - During the forward pass, the master device invokes `run_master`; all messages from the slave devices
    are collected and passed to a registered callback.
    - After receiving the messages, the master device gathers the information and determines the message
    to be passed back to each slave device.
    """

    def __init__(self, master_callback):
        """

        Args:
            master_callback: a callback to be invoked after having collected messages from slave devices.
        """
        self._master_callback = master_callback
        self._queue = queue.Queue()
        self._registry = collections.OrderedDict()
        self._activated = False

    def __getstate__(self):
        return {'master_callback': self._master_callback}

    def __setstate__(self, state):
        self.__init__(state['master_callback'])

    def register_slave(self, identifier):
        """
        Register a slave device.

        Args:
            identifier: an identifier, usually is the device id.

        Returns: a `SlavePipe` object which can be used to communicate with the master device.

        """
        if self._activated:
            assert self._queue.empty(), 'Queue is not clean before next initialization.'
            self._activated = False
            self._registry.clear()
        future = FutureResult()
        self._registry[identifier] = _MasterRegistry(future)
        return SlavePipe(identifier, self._queue, future)

    def run_master(self, master_msg):
        """
        Main entry for the master device in each forward pass.
        The messages are first collected from each device (including the master device), then
        a callback is invoked to compute the message to be sent back to each device
        (including the master device).

        Args:
            master_msg: the message that the master wants to send to itself. This will be placed as the first
            message when calling `master_callback`. For detailed usage, see `_SynchronizedBatchNorm` for an example.

        Returns: the message to be sent back to the master device.

        """
        self._activated = True

        intermediates = [(0, master_msg)]
        for i in range(self.nr_slaves):
            intermediates.append(self._queue.get())

        results = self._master_callback(intermediates)
        assert results[0][0] == 0, 'The first result should belong to the master.'

        for i, res in results:
            if i == 0:
                continue
            self._registry[i].result.put(res)

        for i in range(self.nr_slaves):
            assert self._queue.get() is True

        return results[0][1]

    @property
    def nr_slaves(self):
        return len(self._registry)
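One way around the missing source, if scripting is required, is to replace the CPython-implemented container with one TorchScript can compile, such as a typed Python list. Below is a minimal sketch of a FIFO backed by `List[Tensor]`; the class name `TensorFifo` is made up for illustration and this is not a drop-in replacement for the full `queue.Queue` API (no blocking, no thread safety):

```python
from typing import List

import torch

@torch.jit.script
class TensorFifo(object):
    """A simple FIFO backed by List[Tensor], which TorchScript can compile."""

    def __init__(self):
        self._items: List[torch.Tensor] = []

    def put(self, x: torch.Tensor) -> None:
        self._items.append(x)

    def get(self) -> torch.Tensor:
        # pop(0) removes the oldest element, matching FIFO semantics
        return self._items.pop(0)

    def empty(self) -> bool:
        return len(self._items) == 0
```

Since the scripted class only holds tensors, it can be constructed and used from both eager Python and other scripted code.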


Answer 1:

Switch the TorchScript generation method from torch.jit.script to torch.jit.trace and it works, with no need to annotate anything. Alternatively, torch.onnx.export sometimes works as well.
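This works because tracing runs the module eagerly and only records the tensor operations, so Python-level containers like deque never need to be compiled. A minimal sketch with a made-up module (not the FOMM model):

```python
import collections

import torch

class ShiftAdd(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The deque runs in ordinary Python while tracing; only the
        # tensor op (the add) is recorded into the traced graph.
        buf = collections.deque()
        buf.append(x)
        return buf.pop() + 1.0

model = ShiftAdd().eval()
traced = torch.jit.trace(model, torch.zeros(3))
print(traced(torch.ones(3)))  # tensor([2., 2., 2.])
```

Note the usual caveat: tracing freezes data-dependent control flow to the path taken by the example input, so it only suits models whose Python-side logic does not branch on tensor values.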

Comments:

Where do you do this?
