ModuleNotFoundError huggingface datasets in Jupyter notebook


Posted: 2021-09-12 17:36:04

Question:

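I want to use the huggingface datasets library in a Jupyter notebook.

This should be as simple as installing it (pip install datasets, in bash inside a venv) and importing it (import datasets, in Python or in a notebook).

Everything works when I test it in the standard Python interactive shell. In a Jupyter notebook, however, the import fails with: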

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-6-652e886d387f> in <module>
----> 1 import datasets

ModuleNotFoundError: No module named 'datasets'

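At first I thought the notebook kernel might be using a different virtual environment, but I verified from inside the notebook that the package is installed: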

!pip install datasets

Requirement already satisfied: datasets in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (1.8.0)
Requirement already satisfied: numpy>=1.17 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (1.21.0)
Requirement already satisfied: xxhash in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (2.0.2)
Requirement already satisfied: pyarrow<4.0.0,>=1.0.0 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (3.0.0)
Requirement already satisfied: pandas in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (1.2.5)
Requirement already satisfied: fsspec in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (2021.6.1)
Requirement already satisfied: packaging in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (20.9)
Requirement already satisfied: dill in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (0.3.4)
Requirement already satisfied: requests>=2.19.0 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (2.25.1)
Requirement already satisfied: tqdm<4.50.0,>=4.27 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (4.49.0)
Requirement already satisfied: multiprocess in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (0.70.12.2)
Requirement already satisfied: huggingface-hub<0.1.0 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (0.0.13)
Requirement already satisfied: pytz>=2017.3 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from pandas->datasets) (2021.1)
Requirement already satisfied: python-dateutil>=2.7.3 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from pandas->datasets) (2.8.1)
Requirement already satisfied: pyparsing>=2.0.2 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from packaging->datasets) (2.4.7)
Requirement already satisfied: certifi>=2017.4.17 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from requests>=2.19.0->datasets) (2021.5.30)
Requirement already satisfied: chardet<5,>=3.0.2 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from requests>=2.19.0->datasets) (4.0.0)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from requests>=2.19.0->datasets) (1.26.6)
Requirement already satisfied: idna<3,>=2.5 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from requests>=2.19.0->datasets) (2.10)
Requirement already satisfied: typing-extensions in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from huggingface-hub<0.1.0->datasets) (3.10.0.0)
Requirement already satisfied: filelock in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from huggingface-hub<0.1.0->datasets) (3.0.12)
Requirement already satisfied: six>=1.5 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from python-dateutil>=2.7.3->pandas->datasets) (1.16.0)

!pip freeze

certifi==2021.5.30
chardet==4.0.0
datasets==1.8.0
dill==0.3.4
filelock==3.0.12
fsspec==2021.6.1
huggingface-hub==0.0.13
idna==2.10
multiprocess==0.70.12.2
numpy==1.21.0
packaging==20.9
pandas==1.2.5
pyarrow==3.0.0
pyparsing==2.4.7
python-dateutil==2.8.1
pytz==2021.1
requests==2.25.1
six==1.16.0
tqdm==4.49.0
typing-extensions==3.10.0.0
urllib3==1.26.6
xxhash==2.0.2

Any ideas? Do I need to configure the notebook in some special way, or is something wrong with the datasets module? Thanks!
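A note on the check above: `!pip` runs in a subshell and reports on whichever pip is first on the PATH, which is not necessarily the interpreter the kernel itself is running. A minimal sketch for asking the kernel directly is shown below. The helper name `describe_kernel` is hypothetical, and nothing in the thread confirms that an interpreter mismatch is the actual cause here.

```python
import importlib.util
import sys


def describe_kernel(module_name="datasets"):
    """Report which interpreter is running and whether it can see a module."""
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,  # interpreter behind this kernel
        "prefix": sys.prefix,          # environment root (the venv, if any)
        "module_found": spec is not None,
        "module_origin": getattr(spec, "origin", None),
    }


# Run this in a notebook cell and compare the values with the venv path
# that pip printed (/home/yoga/venvs/text_embeddings/...).
for key, value in describe_kernel().items():
    print(f"{key}: {value}")

# The directories this interpreter searches for imports.
for entry in sys.path:
    print(entry)
```

If `executable` points outside the venv, the kernel and `!pip` are looking at different environments.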


Edit: following the answer below, this makes the error go away:

datasets_dir=r"/home/yoga/venvs/text_embeddings/lib/python3.8/site-packages/datasets"

import sys
sys.path.append(datasets_dir)

import datasets

But is there a way to make this work without setting the path explicitly? (Or can someone explain why this is necessary?)
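One commonly used approach, assuming the root cause is that the kernel runs a different interpreter than the venv (the thread does not confirm this), is either to install the package with the kernel's own interpreter or to register the venv as a separate Jupyter kernel. The sketch below only builds and prints the commands. The helper names, the interpreter path `/home/yoga/venvs/text_embeddings/bin/python`, and the kernel name `text_embeddings` are assumptions derived from the paths in the pip output, and the second command needs the ipykernel package installed in that venv.

```python
import sys


def pip_install_command(package):
    """Build a pip command tied to the interpreter that runs this kernel."""
    return [sys.executable, "-m", "pip", "install", package]


def register_kernel_command(venv_python, kernel_name):
    """Build the command that registers a venv's interpreter as a Jupyter kernel.

    Assumes the ipykernel package is installed in that venv.
    """
    return [venv_python, "-m", "ipykernel", "install", "--user", "--name", kernel_name]


# Option 1: install into whatever environment the kernel really uses.
print(" ".join(pip_install_command("datasets")))

# Option 2: expose the venv as its own kernel, then select it in Jupyter.
# Both the interpreter path and the kernel name are assumptions.
venv_python = "/home/yoga/venvs/text_embeddings/bin/python"
print(" ".join(register_kernel_command(venv_python, "text_embeddings")))

# To execute a command instead of printing it:
# import subprocess
# subprocess.check_call(pip_install_command("datasets"))
```

Using `sys.executable -m pip` rather than a bare `pip` ties the install to the running interpreter, so the package lands where the kernel will look for it.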

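Comments:

I just tested this in a virtual environment and could not reproduce the problem. What is the full traceback?

Thanks! I updated the question with the full traceback. As I wrote, the problem only occurs in the notebook, not in the interactive shell.

Try creating a new environment with conda: conda create -n py39_test_env python=3.9, then activate it with conda activate py39_test_env, then install with pip install datasets, then start Jupyter with jupyter notebook.

Thanks, I forgot to mention that I'm using standard Python, not conda. Does it only work with conda?

Answer 1: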

I ran into a similar problem with a different library, and this worked for me:

import sys
sys.path.append(r"path to datasets in python env")
import dataset_utils

The path in your case -> "/home/yoga/venvs/text_embeddings/lib/python3.8/site-packages/datasets"

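My guess is that the PYTHONPATH environment variable is not set correctly. PYTHONPATH is an environment variable whose entries are added to sys.path, the list of directories Python searches for modules. You can set it to whatever you like.

This should work!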
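To illustrate the PYTHONPATH mechanism described above, here is a minimal sketch that starts a child interpreter with PYTHONPATH pointing at a temporary directory and checks whether that directory appears in the child's sys.path. The function name `pythonpath_reaches_sys_path` is hypothetical, and the sketch shows the general mechanism only, not a diagnosis of the asker's setup.

```python
import os
import subprocess
import sys
import tempfile


def pythonpath_reaches_sys_path():
    """Return True if a directory listed in PYTHONPATH shows up in sys.path."""
    with tempfile.TemporaryDirectory() as extra_dir:
        # Copy the current environment and point PYTHONPATH at the temp dir.
        env = dict(os.environ, PYTHONPATH=extra_dir)
        result = subprocess.run(
            [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
        child_path = result.stdout.splitlines()
        # Compare resolved paths so symlinked temp directories still match.
        target = os.path.realpath(extra_dir)
        return any(os.path.realpath(p) == target for p in child_path if p)


print(pythonpath_reaches_sys_path())
```

Note that PYTHONPATH is read when the interpreter starts, so changing it from inside an already running kernel has no effect on that kernel's sys.path. That is why the workaround above edits sys.path directly.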

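Comments:

This works! Can you explain what is going on there? I can hardly believe this is the recommended way to use the datasets package...

My guess is that the PYTHONPATH environment variable is not set correctly. PYTHONPATH is an environment variable whose entries are added to sys.path, the list of directories Python searches for modules. You can set it to whatever you like.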
