TensorFlow TextVectorization brings None shape in model.summary()

Posted 2023-05-07

Asked: 2021-05-13 14:30:45

Question:

I am using an encoder that is an instance of the TextVectorization class. I then adapt it on my training data like this:

encoder = tf.keras.layers.experimental.preprocessing.TextVectorization(max_tokens=1000)
encoder.adapt(dataset_all.map(lambda text, label: text))

Then I want to run a simple neural network that has only dense layers. This is my model:

model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,), dtype=tf.string),
    encoder,
    tf.keras.layers.Embedding(len(encoder.get_vocabulary()) + 1,
                              output_dim=64, mask_zero=True),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(4, activation='softmax')
])

When I print the summary, I get the following:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
text_vectorization (TextVect multiple                  0         
_________________________________________________________________
embedding_39 (Embedding)     (None, None, 64)          6142528   
_________________________________________________________________
dense_74 (Dense)             (None, None, 64)          4160      
_________________________________________________________________
dense_75 (Dense)             (None, None, 4)           260       
=================================================================
Total params: 6,146,948
Trainable params: 6,146,948
Non-trainable params: 0

I don't understand what the second None in each output shape represents. Also, when I try to fit the model, I get the following error (I am using the SparseCategoricalCrossentropy loss):

assertion failed: [Condition x == y did not hold element-wise:] [x (sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/Shape_1:0) = ] [64 1] [y (sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/strided_slice:0) = ] [64 69]

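Since I need a 2D output from the final dense layer, I tried adding a Flatten layer after the embedding layer, but that does not work because the second dimension of the dense layer's input is not specified.

If I add an RNN layer after the embedding layer, the network trains correctly, since the embedding layer's output is 3D. However, I don't understand how to make this work with only dense layers.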

Comments:

How about adding GlobalAveragePooling1D after the embedding?

Answer 1:

This happens because you did not specify output_sequence_length, the argument that determines the output shape of the encoder.

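output_sequence_length: If set, the output will have its time dimension padded or truncated to exactly output_sequence_length values, resulting in a tensor of shape [batch_size, output_sequence_length] regardless of how many tokens resulted from the splitting step. Defaults to None.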
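To make the "padded or truncated" wording concrete, here is a minimal pure-Python sketch of that behaviour. The helper name pad_or_truncate and the sample token ids are hypothetical and only for illustration; the sketch assumes padding is appended on the right with id 0, which is how TextVectorization appears to behave in int mode, and it is not the layer's actual implementation.

```python
def pad_or_truncate(token_ids, output_sequence_length, pad_id=0):
    """Cut token_ids, or right-pad it with pad_id, to exactly output_sequence_length items."""
    clipped = list(token_ids[:output_sequence_length])
    return clipped + [pad_id] * (output_sequence_length - len(clipped))


# Three "sentences" of different lengths, already mapped to made-up token ids.
batch = [[5, 9, 2], [7], [4, 4, 8, 3, 6, 1]]

# Every row now has length 4, so the batch has a fixed shape of [3, 4].
fixed = [pad_or_truncate(row, 4) for row in batch]
print(fixed)  # [[5, 9, 2, 0], [7, 0, 0, 0], [4, 4, 8, 3]]
```

Without a fixed length, each batch is only padded to its own longest sentence, so the time dimension cannot be known when the model is built and shows up as None.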

If you set it to a number, you will see that the layer's output shape becomes defined:

encoder = tf.keras.layers.experimental.preprocessing.TextVectorization(
    max_tokens=1000, 
    output_sequence_length=200
)

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
text_vectorization_3 (TextVe (None, 200)               0         
_________________________________________________________________
embedding_2 (Embedding)      (None, 200, 64)           448       
_________________________________________________________________
dense_4 (Dense)              (None, 200, 64)           4160      
_________________________________________________________________
dense_5 (Dense)              (None, 200, 4)            260       
=================================================================
Total params: 4,868
Trainable params: 4,868
Non-trainable params: 0
_________________________________________________________________

After that, you can use a GlobalAveragePooling1D layer to get a 2D output.
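Putting both suggestions together, a dense-only model might look like the sketch below. It is an illustration under assumptions, not the asker's exact setup: the toy corpus stands in for dataset_all, the pooling layer is placed directly after the embedding, and the layer is imported from tf.keras.layers.TextVectorization, which is where it lives in TensorFlow 2.6 and later (older versions use the experimental.preprocessing path shown in the question).

```python
import tensorflow as tf

# Toy corpus standing in for the asker's dataset_all (an assumption for this sketch).
texts = tf.data.Dataset.from_tensor_slices(
    ["the cat sat on the mat", "dogs bark", "birds sing at dawn"]
).batch(2)

# Fixing output_sequence_length gives the encoder a static output shape of (None, 200).
encoder = tf.keras.layers.TextVectorization(max_tokens=1000, output_sequence_length=200)
encoder.adapt(texts)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,), dtype=tf.string),
    encoder,
    # get_vocabulary() already includes the padding and unknown tokens.
    tf.keras.layers.Embedding(len(encoder.get_vocabulary()), output_dim=64, mask_zero=True),
    # Averages over the time axis: (None, 200, 64) -> (None, 64).
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(4, activation="softmax"),
])
model.summary()
```

With the pooling layer in place, the final Dense layer emits one 4-way prediction per example, which is the shape a sparse categorical cross-entropy loss expects for one integer label per example.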

See the docs.
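The assertion error in the question, [64 1] versus [64 69], can be read as a shape mismatch between the labels and the per-token predictions. The NumPy sketch below reproduces only the shapes. It assumes a batch of 64 examples whose longest sentence has 69 tokens (inferred from the error message), and it uses a random matrix as a stand-in for a Dense kernel; none of the values come from the original model.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, seq_len, embed_dim, num_classes = 64, 69, 64, 4

# Shape of the Embedding output for one batch: (batch, time, features).
embedded = rng.normal(size=(batch_size, seq_len, embed_dim))
# Stand-in for a Dense kernel mapping features to class scores.
kernel = rng.normal(size=(embed_dim, num_classes))

# Dense acts on the last axis only, so the time axis survives:
# one prediction per token, 64 * 69 in total, against only 64 labels.
per_token = embedded @ kernel
print(per_token.shape)  # (64, 69, 4)

# Averaging over the time axis first leaves one vector per example.
pooled = embedded.mean(axis=1)
per_example = pooled @ kernel
print(per_example.shape)  # (64, 4)
```

Note that a plain mean also averages over padded positions, whereas GlobalAveragePooling1D skips them when it receives the mask produced by mask_zero=True.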
