TensorFlow ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type list)

Posted: 2020-12-08 09:36:36

Question:

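I am trying to run this code in Colab. Interestingly, I ran the same code in Colab a few days ago, but now it no longer works. The code also works in a Kaggle kernel. I tried changing the TensorFlow version, but each version gives a different error. Why do you think I can't run this code? Here is the Colab notebook in case you need more information. Thanks in advance!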

import numpy as np
import tensorflow as tf
from sklearn.model_selection import StratifiedKFold
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import SGD, Adam


class DisasterDetector:

    def __init__(self, tokenizer, bert_layer, max_len=30, lr=0.0001,
                 epochs=15, batch_size=32, dtype=tf.int32,
                 activation='sigmoid', optimizer='SGD',
                 beta_1=0.9, beta_2=0.999, epsilon=1e-07,
                 metrics='accuracy', loss='binary_crossentropy'):

        self.lr = lr
        self.epochs = epochs
        self.max_len = max_len
        self.batch_size = batch_size
        self.tokenizer = tokenizer
        self.bert_layer = bert_layer
        self.models = []

        self.activation = activation
        self.optimizer = optimizer
        self.dtype = dtype

        self.beta_1 = beta_1
        self.beta_2 = beta_2
        self.epsilon = epsilon

        self.metrics = metrics
        self.loss = loss

    def encode(self, texts):
        all_tokens = []
        masks = []
        segments = []

        for text in texts:
            # Wrap the word pieces in [CLS] ... [SEP] and map them to ids.
            tokenized = self.tokenizer.convert_tokens_to_ids(
                ['[CLS]'] + self.tokenizer.tokenize(text) + ['[SEP]'])

            # Number of zeros needed to pad this text up to max_len.
            len_zeros = self.max_len - len(tokenized)

            padded = tokenized + [0] * len_zeros
            mask = [1] * len(tokenized) + [0] * len_zeros
            segment = [0] * self.max_len

            all_tokens.append(padded)
            masks.append(mask)
            segments.append(segment)

        print(len(all_tokens[0]))
        return np.array(all_tokens), np.array(masks), np.array(segments)

    def make_model(self):
        input_word_ids = Input(shape=(self.max_len,), dtype=tf.int32,
                               name='input_word_ids')
        input_mask = Input(shape=(self.max_len,), dtype=tf.int32,
                           name='input_mask')
        segment_ids = Input(shape=(self.max_len,), dtype=tf.int32,
                            name='segment_ids')

        # The BERT layer returns the pooled output and the per-token
        # sequence output.
        pooled_output, sequence_output = self.bert_layer([input_word_ids,
                                                          input_mask,
                                                          segment_ids])

        # Use the output at the [CLS] position as the sentence representation.
        clf_output = sequence_output[:, 0, :]
        out = tf.keras.layers.Dense(1, activation=self.activation)(clf_output)

        model = Model(inputs=[input_word_ids, input_mask, segment_ids],
                      outputs=out)

        # Compare strings with ==, not `is`, and pass the configured
        # optimizer object to compile() so that self.lr is actually used.
        if self.optimizer == 'SGD':
            optimizer = SGD(learning_rate=self.lr)
        elif self.optimizer == 'Adam':
            optimizer = Adam(learning_rate=self.lr, beta_1=self.beta_1,
                             beta_2=self.beta_2, epsilon=self.epsilon)
        else:
            raise ValueError('unsupported optimizer: %r' % (self.optimizer,))

        model.compile(loss=self.loss, optimizer=optimizer,
                      metrics=[self.metrics])

        return model

    def train(self, x, k=3):
        kfold = StratifiedKFold(n_splits=k, shuffle=True)

        for fold, (train_idx, val_idx) in enumerate(
                kfold.split(x['cleaned_text'], x['target'])):
            print('fold: ', fold)

            x_trn = self.encode(x.loc[train_idx, 'cleaned_text'])
            x_val = self.encode(x.loc[val_idx, 'cleaned_text'])
            y_trn = np.array(x.loc[train_idx, 'target'], dtype=np.uint8)
            y_val = np.array(x.loc[val_idx, 'target'], dtype=np.uint8)
            print('the data type of y train: ', type(y_trn))
            print('x_val shape', x_val[0].shape)
            print('x_trn shape', x_trn[0].shape)

            model = self.make_model()
            print('model made.')
            model.fit(x_trn, tf.convert_to_tensor(y_trn),
                      validation_data=(x_val, tf.convert_to_tensor(y_val)),
                      batch_size=self.batch_size, epochs=self.epochs)

            self.models.append(model)

I get the error after calling the class's train function.

classifier = DisasterDetector(tokenizer = tokenizer, bert_layer = bert_layer, max_len = max_len, lr = 0.0001,
                  epochs = 10,  activation = 'sigmoid',
                batch_size = 32,optimizer = 'SGD',
                beta_1=0.9, beta_2=0.999, epsilon=1e-07)
classifier.train(train_cleaned)

This is the error:

ValueError                                Traceback (most recent call last)
<ipython-input-10-106c756f2e47> in <module>()
----> 1 classifier.train(train_cleaned)

8 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/constant_op.py in convert_to_eager_tensor(value, ctx, dtype)
     96       dtype = dtypes.as_dtype(dtype).as_datatype_enum
     97   ctx.ensure_initialized()
---> 98   return ops.EagerTensor(value, ctx.device_name, dtype)
     99 
    100 

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type list).

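Comments:

Please provide the code, not just a link. Going by the title, you are trying to convert a list rather than a NumPy array.

The code is rather long, so I thought it was better to provide a link. But here it is.

See here: ***.com/help/minimal-reproducible-example

Thanks for the link, I will read it. For now I have added the code that gives me the error to the main question.

Could you set dtype=np.float32 instead of np.uint8 for y_trn and y_val?

Answer 1: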

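Well, it turns out that TensorFlow throws this error because the maximum sequence length was not set appropriately. After changing the max_len variable to 54, my program ran without any problems. So the issue was not the input type or the NumPy arrays.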
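
A likely mechanism, going by the encode method in the question: it pads short texts but never truncates long ones. When a tokenized text is longer than max_len, len_zeros is negative, [0] * len_zeros is an empty list, and that row stays longer than the rest. The rows then cannot form a rectangular integer array, so they end up as Python lists inside an object array, which TensorFlow refuses to convert. Below is a minimal sketch of padding with truncation. The helper names pad_or_truncate and encode_ids are hypothetical, the token ids are made up for illustration, and sep_id=102 is only an assumed id for [SEP].

```python
import numpy as np


def pad_or_truncate(token_ids, max_len, sep_id=None, pad_id=0):
    """Return (ids, mask), each exactly max_len items long."""
    ids = list(token_ids[:max_len])
    # If the sequence was cut, restore the closing [SEP] when its id is known.
    if sep_id is not None and len(token_ids) > max_len:
        ids[-1] = sep_id
    n_real = len(ids)
    n_pad = max_len - n_real
    return ids + [pad_id] * n_pad, [1] * n_real + [0] * n_pad


def encode_ids(batch_of_token_ids, max_len, sep_id=None):
    """Build rectangular int32 arrays for token ids, masks and segments."""
    tokens, masks = [], []
    for token_ids in batch_of_token_ids:
        ids, mask = pad_or_truncate(token_ids, max_len, sep_id)
        tokens.append(ids)
        masks.append(mask)
    tokens = np.array(tokens, dtype=np.int32)
    masks = np.array(masks, dtype=np.int32)
    segments = np.zeros_like(tokens)  # single-sentence input: all zeros
    return tokens, masks, segments


# Made-up ids: one short row and one row longer than max_len.
batch = [[101, 7, 8, 102], [101, 1, 2, 3, 4, 5, 6, 102]]
tokens, masks, segments = encode_ids(batch, max_len=6, sep_id=102)
print(tokens.shape, tokens.dtype)
```

Passing dtype=np.int32 explicitly has a side benefit: if any row still had the wrong length, NumPy would raise at the np.array call instead of TensorFlow failing later inside model.fit.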

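The answer does not say how 54 was chosen; presumably it is at least the length of the longest tokenized text in that dataset. One way to pick max_len is to measure the tokenized lengths before training. The sketch below is only an illustration: tokenized_lengths is a hypothetical helper, the whitespace split stands in for the real tokenizer (with BERT you would pass something like tokenizer.tokenize), and the sample sentences are made up.

```python
import numpy as np


def tokenized_lengths(texts, tokenize=str.split, n_special=2):
    """Length of each text after tokenization, counting [CLS] and [SEP]."""
    return np.array([len(tokenize(text)) + n_special for text in texts])


# Made-up sample sentences; replace with the real text column.
texts = [
    "forest fire near la ronge",
    "all residents asked to shelter in place are being notified by officers",
]
lengths = tokenized_lengths(texts)
max_len = 10

print("longest tokenized text:", int(lengths.max()))
print("rows longer than max_len:", int((lengths > max_len).sum()))
```

If the second number is not zero, either raise max_len or truncate those rows before building the arrays.
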
Discussion:

Could you please help with ***.com/questions/68141489/…
