Batch size not passed to tf.keras model: "Error when checking input: expected input1 to have 3 dimensions, but got array with shape (a,b)"

Posted 2023-02-16

【Title】: Batch size not passed to tf.keras model: "Error when checking input: expected input1 to have 3 dimensions, but got array with shape (a,b)" 【Published】: 2021-10-03 07:06:20 【Question】:

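I am new to TensorFlow (v2.4.1), so this may be trivial, but I can't figure it out on my own. I am passing 2-dimensional (30, 1024) tensors to my two-input tf.keras model through a tf.data.Dataset. After batching, the dataset prints as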

<BatchDataset shapes: ({sentence1: (None, 30, 1024), sentence2: (None, 30, 1024)}, (None, 1)), types: ({sentence1: tf.float32, sentence2: tf.float32}, tf.int64)>

The relevant part of the model is:

shared_model = tf.keras.Sequential([
                layers.Masking(),
                layers.GlobalAveragePooling1D()])

input_1 = tf.keras.Input(shape=(30,1024), dtype=tf.float32, name='sentence1')
input_2 = tf.keras.Input(shape=(30,1024), dtype=tf.float32, name='sentence2')

encoder1 = shared_model(input_1)
encoder2 = shared_model(input_2)
...
model = tf.keras.Model(inputs=[input_1,input_2], outputs=final_layer)

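However, when I call model.fit(), I get the error "Error when checking input: expected sentence1 to have 3 dimensions, but got array with shape (30, 1024)". That is, the batch size is not passed to the model.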

I tried reshaping the tensors to (1, 30, 1024). The dataset then becomes

<BatchDataset shapes: ({sentence1: (None, 1, 30, 1024), sentence2: (None, 1, 30, 1024)}, (None, 1)), types: ({sentence1: tf.float32, sentence2: tf.float32}, tf.int64)>

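However, now I get the error "Error when checking input: expected sentence1 to have 3 dimensions, but got array with shape (None, 1, 30, 1024)". So now the batch size is suddenly passed to the model. Any idea why this happens? Thanks a million.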
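To make the two error messages easier to compare, here is a small pure-Python sketch of the rank rule they describe. It is a simplified illustration, not Keras' actual validation code, and `check_input_rank` is a hypothetical helper name. The assumption is that an input declared with shape=(30, 1024) expects one extra leading batch axis, so each element the model receives must have 3 dimensions.

```python
def check_input_rank(input_shape, element_shape):
    """Simplified stand-in for the rank check behind the error message.

    input_shape:   the shape passed to keras.Input, without the batch axis.
    element_shape: the shape of what the model actually receives.
    """
    expected_rank = len(input_shape) + 1  # one extra axis for the batch
    return len(element_shape) == expected_rank


declared = (30, 1024)

results = {
    "unbatched": check_input_rank(declared, (30, 1024)),           # rank 2
    "batched": check_input_rank(declared, (None, 30, 1024)),       # rank 3
    "reshaped": check_input_rank(declared, (None, 1, 30, 1024)),   # rank 4
}
print(results)
```

Under this reading, the first error means the model saw unbatched (30, 1024) elements, and the second means the extra axis from the reshape pushed the rank to 4. Only the (None, 30, 1024) case satisfies the check.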

Edit: I think the problem lies in how the dataset is generated in the first place. I obtain it from a TFRecord file through these helper functions:

def load_dataset(filename):
    raw_dataset = tf.data.TFRecordDataset([filename])
    dataset = raw_dataset.map(prepare_dataset_for_training)
    return dataset

def prepare_dataset_for_training(example):
    # Scalar label stored in the context part of the SequenceExample.
    context_features = {
        'label': tf.io.FixedLenFeature([], tf.int64)}
    # Variable-length embeddings stored as feature lists.
    sequence_features = {
        'embeddings1': tf.io.VarLenFeature(tf.float32),
        'embeddings2': tf.io.VarLenFeature(tf.float32)}
    parsed_context, parsed_feature_lists = tf.io.parse_single_sequence_example(
        example,
        context_features=context_features,
        sequence_features=sequence_features)
    emb1 = tf.RaggedTensor.from_sparse(parsed_feature_lists['embeddings1'])
    emb1 = tf.reshape(emb1.to_tensor(), shape=(30,1024))
    emb2 = tf.RaggedTensor.from_sparse(parsed_feature_lists['embeddings2'])
    emb2 = tf.reshape(emb2.to_tensor(), shape=(30,1024))
    label = tf.expand_dims(parsed_context['label'], axis=0)
    # Keys match the names of the model's Input layers.
    return ({'sentence1': emb1, 'sentence2': emb2}, label)
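The helpers above return an unbatched dataset, so batching has to happen somewhere else before model.fit(). One possibility consistent with the first error message, though not confirmed anywhere in this thread, is that the dataset passed to fit() was not the batched one: Dataset.batch() returns a new dataset and leaves the original unchanged. The sketch below uses synthetic zero tensors in place of the TFRecord file, and `make_unbatched_dataset` is a hypothetical stand-in for load_dataset. It shows how element_spec can be used to check which dataset is which.

```python
import tensorflow as tf


def make_unbatched_dataset(num_examples=8):
    # Synthetic stand-in for load_dataset(); elements have the same
    # structure as those returned by prepare_dataset_for_training.
    emb1 = tf.zeros((num_examples, 30, 1024), dtype=tf.float32)
    emb2 = tf.zeros((num_examples, 30, 1024), dtype=tf.float32)
    labels = tf.zeros((num_examples, 1), dtype=tf.int64)
    return tf.data.Dataset.from_tensor_slices(
        ({'sentence1': emb1, 'sentence2': emb2}, labels))


unbatched = make_unbatched_dataset()
batched = unbatched.batch(4)  # returns a new dataset; `unbatched` is unchanged

# element_spec describes one element of each dataset.
unbatched_shape = tuple(unbatched.element_spec[0]['sentence1'].shape.as_list())
batched_shape = tuple(batched.element_spec[0]['sentence1'].shape.as_list())

print(unbatched_shape)
print(batched_shape)
```

If the shape reported for the dataset handed to fit() has no leading batch axis, the model receives 2-dimensional elements, which matches the "got array with shape (30, 1024)" message.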

【Comments on the question】:

【Answer 1】:

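I'm not quite sure what the problem might be, because I can't reproduce it. Perhaps there is a typo in your model.fit call, so that it trains on 2D inputs instead of 3D inputs?

Here is the code I ran to try to reproduce the result: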

from tensorflow.keras import layers
from tensorflow import keras
import numpy.random as npr
import tensorflow as tf

shared_model = keras.Sequential([
                layers.Masking(),
                layers.GlobalAveragePooling1D()])

input_1 = keras.Input(shape=(30,1024), dtype=tf.float32, name='sentence1')
input_2 = keras.Input(shape=(30,1024), dtype=tf.float32, name='sentence2')
x = tf.concat((input_1, input_2), axis=1)
x = layers.GlobalAveragePooling1D()(x)
x = layers.Dense(8)(x)

model = keras.Model(inputs=[input_1,input_2], outputs=x)

m = 40
BATCH_SIZE = 4
inp_1 = npr.randn(m, 30, 1024)
inp_2 = npr.randn(m, 30, 1024)
y = npr.uniform(size=(m, 8))
dataset = tf.data.Dataset.from_tensor_slices(({'sentence1': inp_1, 'sentence2': inp_2}, y)).batch(BATCH_SIZE) # shape is (None, 30, 1024), (None, 30, 1024)
model.compile('adam', loss='mse')
model.fit(dataset, epochs=100)
pred = model.predict([inp_1, inp_2])[0]

【Comments】:

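Thank you very much for your thoughtful comment. Based on your feedback, I now think the problem may actually be more complicated than I first thought, because my dataset is obtained from a TFRecord file via tf.data.TFRecordDataset. There must be some conflict with the batching. I have added the details to the question in an edit.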