Best way to handle imports used by only some Python classes

Posted 2023-02-23


[Title]: Best way to handle imports used by only some Python classes [Posted]: 2021-09-10 03:47:11 [Question]:

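I am building a Python package. Within one module I have several Python classes, but only one of them uses a particular package (tensorflow). That package is installed through the extras_require option in setup.py, because it is a heavy dependency that is used in only a small part of the project.

Say I have the classes MyClassRegular and MyClassTF in the same module, and only the second one needs tensorflow. Initially I imported the package at the top level of the file using: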

import logging

try:
    import tensorflow as tf
except ModuleNotFoundError:
    logging.error("Tensorflow not found, pip install tensorflow to use MyClassTF")

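This causes two problems:

1. If, as a user, I import MyClassRegular, it emits a warning about a package I don't need or care about, because I am using functionality that has nothing to do with tensorflow.
2. If for some reason I do have tensorflow installed, it may start emitting warning messages, such as an incorrect CUDA version or no GPU found, which again have nothing to do with MyClassRegular.

So what came to mind was importing the package inside MyClassTF. I know this may go against PEP 8 in some way, but I have not found a better way to handle it. Trying this option, the problem I run into is that if I import the module in __init__, the other methods of the class do not recognize it: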

import logging

class MyClassTF:
    def __init__(self):
        try:
            import tensorflow as tf  # binds tf as a local name of __init__ only
        except ModuleNotFoundError:
            logging.error("Tensorflow not found, pip install tensorflow to use MyClassTF")

    def train(self):
        print(tf.__version__)  # <--- NameError: tf is not recognized here

    def predict(self):
        print(tf.__version__)  # <--- Again, not recognized
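
The name is not recognized because an import statement binds its name in the scope where it runs; inside __init__ that is a local variable, which is gone once __init__ returns. A minimal sketch of that behavior, using the standard-library module colorsys as a stand-in for tensorflow (the class names are hypothetical and not taken from the question):

```python
class LocalImportDemo:
    def __init__(self):
        # The name `colorsys` is local to __init__ and is discarded on return.
        import colorsys

    def convert(self):
        # Raises NameError: `colorsys` is neither local here nor a module global.
        return colorsys.rgb_to_hsv(1.0, 0.0, 0.0)


class PerMethodImportDemo:
    def convert(self):
        # Importing inside the method that needs the module keeps the name in scope.
        import colorsys
        return colorsys.rgb_to_hsv(1.0, 0.0, 0.0)
```

Repeating the import in each method is cheap after the first call, since Python caches loaded modules in sys.modules, but it does spread the import statements across the class.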

I could assign tensorflow to an instance attribute like this, but it doesn't feel right:

class MyClassTF:
    def __init__(self):
        try:
            import tensorflow as tf
            self.tf = tf
        except ModuleNotFoundError: 
            logging.error("Tensorflow not found, pip install tensorflow")

So, what is the most Pythonic way to handle this?

Edit: both MyClassRegular and MyClassTF are imported in the top-level __init__.py file, which uses:

__all__ = ["MyClassRegular", "MyClassTF"]

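[Comments]:

Isn't separating the classes into different files an option?
I would like to keep them together, since they relate to the same set of functionality.

[Answer 1]:

To avoid the overhead of testing for tf every time MyTF is instantiated, I would do it this way: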

try:
    import tensorflow as tf
    class MyTF(object):
        ...

except ImportError:
    class MyTF(object):
        def __init__(self, *args, **kwargs):
            raise RuntimeError("tensorflow library not available, "
                               "please install it to enable MyTF functionalities")

Or, if MyTF is long, for readability put everything that depends on tensorflow in an _internal_tf.py module, and then:

try:
    from ._internal_tf import MyTF
except ImportError:
    class MyTF(object):
        def __init__(self, *args, **kwargs):
            raise RuntimeError("tensorflow library not available, "
                               "please install it to enable MyTF functionalities")
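
Both variants above repeat the same fallback class. As a rough sketch only, the pattern could be factored into a small helper. The name optional_class and its parameters are hypothetical and not part of the answer, and the module name in the last line is deliberately one that should not exist, so that the fallback path is exercised:

```python
import importlib


def optional_class(module_name, class_name, install_hint):
    """Return module_name.class_name, or a stub class that raises on instantiation."""
    try:
        module = importlib.import_module(module_name)
        return getattr(module, class_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so it is caught too.
        class _Missing:
            def __init__(self, *args, **kwargs):
                raise RuntimeError(
                    f"{class_name} is unavailable because {module_name!r} "
                    f"could not be imported; {install_hint}"
                )

        _Missing.__name__ = class_name
        return _Missing


# Assumption: this module does not exist, so MyTF becomes the stub.
MyTF = optional_class("no_such_module_for_this_demo", "MyTF", "pip install tensorflow")
```

With this sketch, importing the package never fails and nothing is logged; the error only appears when someone actually calls MyTF().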


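[Answer 2]:

Hmm, this isn't the most straightforward thing, and your solution doesn't look bad. However, I would try putting the class in a separate file that isn't imported anywhere else in your package. Use the same try block at module level; users will then see the error only when they try to import that module. Done this way, I think your Python linter should be happy as well.

Several packages use this approach to control behavior, for example fuzzywuzzy: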

import platform
import warnings

try:
    from .StringMatcher import StringMatcher as SequenceMatcher
except ImportError:
    if platform.python_implementation() != "PyPy":
        warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
    from difflib import SequenceMatcher

https://github.com/seatgeek/fuzzywuzzy/blob/9e3d2fe0d8c1b195696d5fbcda78c371dd4a6b8f/fuzzywuzzy/fuzz.py#L7

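[Comments]:

I thought that would work, but now that I think about it, it would still emit those warnings, because the classes are imported in the __init__.py file.

[Answer 3]:

Another way, if you want to delay the warning until the class is actually used (which I think is what you are after), is: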

import warnings

try:
    import tensorflow as tf
except ImportError:
    # Allow the ImportError to pass silently and just assign tf to None
    tf = None


class MyTF:
    def __init__(self):
        if tf is None:
            warnings.warn('pip install tensorflow to use this class')

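Or something similar. There is no need to do the import in the method bodies themselves or to assign the tensorflow module to an instance attribute; that works, but it is quite unusual. The pattern above is more common.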
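
A possible variation on this pattern, sketched under the assumption that failing loudly is preferable to a warning: the constructor raises ImportError instead of calling warnings.warn, so a missing dependency cannot surface later as an obscure AttributeError on None. The helper _try_import and the placeholder module name are hypothetical; the placeholder stands in for tensorflow so that the sketch runs the same way whether or not tensorflow is installed:

```python
import importlib


def _try_import(name):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Assumption: placeholder name standing in for "tensorflow"; it should not exist.
tf = _try_import("optional_backend_not_installed")


class MyClassRegular:
    def describe(self):
        # No dependency on tf, so this works and stays silent either way.
        return "no tensorflow needed"


class MyClassTF:
    def __init__(self):
        # The check is deferred until the class is actually used.
        if tf is None:
            raise ImportError(
                "tensorflow is required for MyClassTF; pip install tensorflow"
            )
```

Importing the module and using MyClassRegular produces no message at all; only MyClassTF() reports the missing dependency.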
