AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute 'to_tensor'
【Title】: AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute 'to_tensor'
【Posted】: 2021-12-03 06:16:38
【Question】: I am fine-tuning a BERT model using the Hugging Face, Keras, and TensorFlow libraries.
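Since yesterday I have been getting this error when running my code in Google Colab. Strangely, the code used to run without any problem and then suddenly started throwing this error. Even more puzzling, the same code runs fine on my Apple M1 TensorFlow setup. I have not changed my code at all, yet it no longer runs in Google Colab.
Both environments have TensorFlow 2.6.0.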
I wrote the following code to reproduce the error. I hope you can shed some light on this.
!pip install transformers
!pip install datasets
import pandas as pd
import numpy as np
import tensorflow as tf
from transformers import AutoTokenizer
from datasets import Dataset
# dummy sentences
sentences = ['the house is blue and big', 'this is fun stuff','what a horrible thing to say']
# create a pandas DataFrame and convert it to a Hugging Face dataset
df = pd.DataFrame({'Text': sentences})
dataset = Dataset.from_pandas(df)
#download bert tokenizer
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
# tokenize each sentence in dataset
dataset_tok = dataset.map(lambda x: tokenizer(x['Text'], truncation=True, padding=True, max_length=10), batched=True)
# remove original text column and set format
dataset_tok = dataset_tok.remove_columns(['Text']).with_format('tensorflow')
# extract features (the .to_tensor() call is what raises the AttributeError)
features = {x: dataset_tok[x].to_tensor() for x in tokenizer.model_input_names}
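【Comments】:
Is the TensorFlow version the same in both environments?
Yes. Both environments have TensorFlow 2.6.0.
Thanks to .with_format('tensorflow'), your dataset is already filled with tf tensors. If you want tensors, why not just remove .to_tensor(), or remove .with_format('tensorflow') and use tf.convert_to_tensor(dataset_tok[x])?
Thanks @HaroldG. I removed to_tensor() and it works fine. I now see that the call was redundant. Although this is the procedure suggested in the official Hugging Face documentation (huggingface.co/transformers/training.html), TensorFlow had not thrown an error until now. Anyway, I'm glad it is running now. Thanks!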
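The notebook installs transformers and datasets with an unpinned pip install, so the two environments may differ in those packages even though TensorFlow matches. That is an assumption, not something confirmed in this thread. A minimal sketch for comparing the installed versions in each environment (the helper name installed_versions is hypothetical; it uses only the standard library):

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("tensorflow", "tensorflow-macos", "transformers", "datasets")):
    """Map each package name to its installed version, or None if it is not installed."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed under this name in this environment.
            found[name] = None
    return found


# Run this in both environments and compare the two dictionaries.
print(installed_versions())
```

On Apple Silicon, TensorFlow is commonly installed under the distribution name tensorflow-macos, which is why both names are queried here.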
【Answer 1】:
After removing to_tensor(), the given code works, as suggested by @Harold G.
!pip install transformers
!pip install datasets
import pandas as pd
import numpy as np
import tensorflow as tf
from transformers import AutoTokenizer
from datasets import Dataset
# dummy sentences
sentences = ['the house is blue and big', 'this is fun stuff','what a horrible thing to say']
# create a pandas DataFrame and convert it to a Hugging Face dataset
df = pd.DataFrame({'Text': sentences})
dataset = Dataset.from_pandas(df)
#download bert tokenizer
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
# tokenize each sentence in dataset
dataset_tok = dataset.map(lambda x: tokenizer(x['Text'], truncation=True, padding=True, max_length=10), batched=True)
# remove original text column and set format
dataset_tok = dataset_tok.remove_columns(['Text']).with_format('tensorflow')
# extract features (the columns are already tensors, so no .to_tensor() call is needed)
features = {x: dataset_tok[x] for x in tokenizer.model_input_names}
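If the same notebook has to run in environments where a column may come back either as a tf.RaggedTensor (which has a to_tensor() method) or as a dense tensor (which does not), a duck-typed helper avoids the AttributeError in both cases. This is only a sketch: the name to_dense is hypothetical, and whether a given datasets version returns ragged or dense tensors is an assumption to check in your own environment.

```python
def to_dense(value):
    """Return value.to_tensor() when that method exists, otherwise return value unchanged."""
    to_tensor = getattr(value, "to_tensor", None)
    if callable(to_tensor):
        # Objects such as tf.RaggedTensor expose to_tensor() to produce a dense tensor.
        return to_tensor()
    # Dense tensors (for example an EagerTensor) have no to_tensor(); use them as they are.
    return value


# Usage with the names from the code above (not executed here):
# features = {x: to_dense(dataset_tok[x]) for x in tokenizer.model_input_names}
```

The helper does not import TensorFlow itself; it only checks for the method, so it works with whatever object the dataset returns.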