PyTorch Lightning wants to create a folder on import due to its use of wandb, which raises an error on AWS Lambda


【Title】: PyTorch Lightning wants to create a folder on import due to usage of wandb, which raises an error on AWS Lambda 【Posted】: 2021-06-10 20:21:47 【Question】:

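I want to build a Docker image with PyTorch Lightning that can be used with AWS Lambda. However, when the function is invoked, it raises an OSError saying that the file system is read-only while wandb.py is trying to write something.

I tried the following:

    - Overriding PyTorch Lightning's wandb.py file so that it does not initialize wandb --> raises an error
    - Running a Python script in the Dockerfile so that the files are created during docker build and exist when the Lambda function is invoked --> same OSError

Does anyone know a way to skip wandb.py?

Here is the error message: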

START RequestId: ddae284d-4f32-4dc6-8160-d1fa62ba9772 Version: $LATEST
OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
[ERROR] OSError: [Errno 30] Read-only file system: '/home/sbx_user1051'
Traceback (most recent call last):
  File "/var/lang/lib/python3.8/imp.py", line 234, in load_module
    return load_source(name, filename, file)
  File "/var/lang/lib/python3.8/imp.py", line 171, in load_source
    module = _load(spec)
  File "<frozen importlib._bootstrap>", line 702, in _load
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/var/task/inference.py", line 5, in <module>
    import pytorch_lightning as pl
  File "/var/lang/lib/python3.8/site-packages/pytorch_lightning/__init__.py", line 63, in <module>
    from pytorch_lightning.callbacks import Callback
  File "/var/lang/lib/python3.8/site-packages/pytorch_lightning/callbacks/__init__.py", line 25, in <module>
    from pytorch_lightning.callbacks.swa import StochasticWeightAveraging
  File "/var/lang/lib/python3.8/site-packages/pytorch_lightning/callbacks/swa.py", line 26, in <module>
    from pytorch_lightning.trainer.optimizers import _get_default_scheduler_config
  File "/var/lang/lib/python3.8/site-packages/pytorch_lightning/trainer/__init__.py", line 18, in <module>
    from pytorch_lightning.trainer.trainer import Trainer
  File "/var/lang/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 30, in <module>
    from pytorch_lightning.loggers import LightningLoggerBase
  File "/var/lang/lib/python3.8/site-packages/pytorch_lightning/loggers/__init__.py", line 31, in <module>
    from pytorch_lightning.loggers.wandb import _WANDB_AVAILABLE, WandbLogger  # noqa: F401
  File "/var/lang/lib/python3.8/site-packages/pytorch_lightning/loggers/wandb.py", line 34, in <module>
    import wandb
  File "/var/lang/lib/python3.8/site-packages/wandb/__init__.py", line 131, in <module>
    api = InternalApi()
  File "/var/lang/lib/python3.8/site-packages/wandb/apis/internal.py", line 17, in __init__
    self.api = InternalApi(*args, **kwargs)
  File "/var/lang/lib/python3.8/site-packages/wandb/sdk/internal/internal_api.py", line 73, in __init__
    self._settings = Settings(
  File "/var/lang/lib/python3.8/site-packages/wandb/old/settings.py", line 25, in __init__
    self._global_settings.read([Settings._global_path()])
  File "/var/lang/lib/python3.8/site-packages/wandb/old/settings.py", line 105, in _global_path
    util.mkdir_exists_ok(config_dir)
  File "/var/lang/lib/python3.8/site-packages/wandb/util.py", line 687, in mkdir_exists_ok
    os.makedirs(path)
  File "/var/lang/lib/python3.8/os.py", line 213, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/var/lang/lib/python3.8/os.py", line 213, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/var/lang/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
END RequestId: ddae284d-4f32-4dc6-8160-d1fa62ba9772
REPORT RequestId: ddae284d-4f32-4dc6-8160-d1fa62ba9772  Duration: 27000.33 ms   Billed Duration: 27001 ms   Memory Size: 10240 MB   Max Memory Used: 241 MB 
Unknown application error occurred

【Answer 1】:

You need to make sure that you have write permission somewhere.
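A quick way to check this is to probe a few candidate directories when the function starts. The sketch below is only an illustration: `first_writable_dir` is a hypothetical helper, not part of wandb or PyTorch Lightning, and the candidate list is an assumption. On AWS Lambda the home directory from the traceback is read-only, while `/tmp` is normally writable.

```python
import os
import tempfile


def first_writable_dir(candidates):
    """Return the first existing directory the process may write to, else None.

    Hypothetical helper for illustration; not a wandb or Lightning API.
    """
    for path in candidates:
        # os.access reports False for directories on a read-only file system.
        if os.path.isdir(path) and os.access(path, os.W_OK):
            return path
    return None


# Assumed candidates: the user's home directory, then the system temp directory.
candidates = [os.path.expanduser("~"), tempfile.gettempdir()]
print("writable directory:", first_writable_dir(candidates))
```

Running this inside the Lambda handler and checking the logs shows which location can be used for the wandb directories.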

You can then use the wandb environment variables to change the default locations where files are saved locally. In particular, look at WANDB_DIR, WANDB_CONFIG_DIR, and WANDB_CACHE_DIR.
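As a minimal sketch of that suggestion, the variables can be set at the very top of the handler module, before `pytorch_lightning` (and therefore `wandb`) is imported. The helper name `point_wandb_at_writable_dir` and the sub-directory layout are assumptions made for illustration. It is also an assumption that the installed wandb version honors `WANDB_CONFIG_DIR` in the `_global_path` call shown in the traceback.

```python
import os
import tempfile


def point_wandb_at_writable_dir(base_dir=None):
    """Create writable directories and export them through wandb's env variables.

    Hypothetical helper; the sub-directory names are arbitrary.
    """
    # On AWS Lambda the system temp directory is normally /tmp.
    base = base_dir or os.path.join(tempfile.gettempdir(), "wandb")
    paths = {
        "WANDB_DIR": os.path.join(base, "runs"),
        "WANDB_CONFIG_DIR": os.path.join(base, "config"),
        "WANDB_CACHE_DIR": os.path.join(base, "cache"),
    }
    for var, path in paths.items():
        os.makedirs(path, exist_ok=True)
        os.environ[var] = path
    return paths


wandb_paths = point_wandb_at_writable_dir()

# The import must come after the variables are set, because the traceback
# shows wandb creating its config directory at import time:
# import pytorch_lightning as pl
```

The same variables could instead be set with `ENV` lines in the Dockerfile or in the Lambda function configuration, which avoids depending on import order.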
