How to feed multiple sequences into a CNN in TensorFlow?


I'm new to TensorFlow and have been working through an example similar to the text classification guide here.

I have a working model for a single sequence, but I'm trying to figure out how to feed two different sequences per observation into model training. I could concatenate the two sequences and pass them as a single input, but I want to keep each sequence distinct.

Looking at the documentation for the input shape parameter led me to try passing the training data as a tuple of the two sequences' shapes (see below), but that doesn't seem to work:

x_train = [x_trainfirst, x_trainlast]
x_val = [x_valfirst, x_vallast]
shape_param = tuple([i.shape[1] for i in x_train])

ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 2 arrays: [array([[ 0,  0,  0, ..., 21,  5,  4],
   [ 0,  0,  0, ..., 10,  1, 11],
   [ 0,  0,  0, ..., 26,  8,  7],
   ..., 
   [ 0,  0,  0, ..., 10,  2,  3],
   [ 0,  0,  0, ...,  8,  7,  ...
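
The error arises because a Keras Sequential model defines exactly one input tensor, so model.fit expects a single array; a list of two arrays only matches a model declared with two Input layers via the functional API, as sketched in the answer below.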

Any advice or pointers to examples/resources for feeding multiple sequences into a CNN would be much appreciated!

Edit: showing the suggested architecture and dimensions:

def sepcnn_model(blocks,
                 filters,
                 kernel_size,
                 embedding_dim,
                 dropout_rate,
                 pool_size,
                 input_shape,
                 num_classes,
                 num_features,
                 use_pretrained_embedding=False,
                 is_embedding_trainable=False,
                 embedding_matrix=None):
    """Creates an instance of a separable CNN model.

    # Arguments
        blocks: int, number of pairs of sepCNN and pooling blocks in the model.
        filters: int, output dimension of the layers.
        kernel_size: int, length of the convolution window.
        embedding_dim: int, dimension of the embedding vectors.
        dropout_rate: float, percentage of input to drop at Dropout layers.
        pool_size: int, factor by which to downscale input at MaxPooling layer.
        input_shape: tuple, shape of input to the model.
        num_classes: int, number of output classes.
        num_features: int, number of words (embedding input dimension).
        use_pretrained_embedding: bool, true if pre-trained embedding is on.
        is_embedding_trainable: bool, true if embedding layer is trainable.
        embedding_matrix: dict, dictionary with embedding coefficients.

    # Returns
        A sepCNN model instance.
    """
    op_units, op_activation = _get_last_layer_units_and_activation(num_classes)
    model = models.Sequential()

    # Add embedding layer. If pre-trained embedding is used add weights to the
    # embeddings layer and set trainable to input is_embedding_trainable flag.
    if use_pretrained_embedding:
        model.add(Embedding(input_dim=num_features,
                            output_dim=embedding_dim,
                            input_length=input_shape[0],
                            weights=[embedding_matrix],
                            trainable=is_embedding_trainable))
    else:
        model.add(Embedding(input_dim=num_features,
                            output_dim=embedding_dim,
                            input_length=input_shape[0]))

    for _ in range(blocks-1):
        model.add(Dropout(rate=dropout_rate))
        model.add(SeparableConv1D(filters=filters,
                                  kernel_size=kernel_size,
                                  activation='relu',
                                  bias_initializer='random_uniform',
                                  depthwise_initializer='random_uniform',
                                  padding='same'))
        model.add(SeparableConv1D(filters=filters,
                                  kernel_size=kernel_size,
                                  activation='relu',
                                  bias_initializer='random_uniform',
                                  depthwise_initializer='random_uniform',
                                  padding='same'))
        model.add(MaxPooling1D(pool_size=pool_size))

    model.add(SeparableConv1D(filters=filters * 2,
                              kernel_size=kernel_size,
                              activation='relu',
                              bias_initializer='random_uniform',
                              depthwise_initializer='random_uniform',
                              padding='same'))
    model.add(SeparableConv1D(filters=filters * 2,
                              kernel_size=kernel_size,
                              activation='relu',
                              bias_initializer='random_uniform',
                              depthwise_initializer='random_uniform',
                              padding='same'))
    model.add(GlobalAveragePooling1D())
    model.add(Dropout(rate=dropout_rate))
    model.add(Dense(op_units, activation=op_activation))
    return model

learning_rate=1e-3
epochs=1000
batch_size=128
blocks=2
filters=64
dropout_rate=0.2
embedding_dim=200
kernel_size=3
pool_size=3

# trying to get multiple features
(training_data_fname, training_data_lname, training_labels), (val_data_fname, val_data_lname, val_labels) = tupleData #data

# Verify that validation labels are in the same range as training labels.
num_classes = get_num_classes(training_labels)
unexpected_labels = [v for v in val_labels if v not in range(num_classes)]

# if len(unexpected_labels):
#     raise ValueError('Unexpected label values found in the validation set:'
#                      ' {unexpected_labels}. Please make sure that the '
#                      'labels in the validation set are in the same range '
#                      'as training labels.'.format(
#                          unexpected_labels=unexpected_labels))

# Vectorize texts.
x_trainfirst, x_valfirst, word_index_first = sequence_vectorize(
        training_data_fname, val_data_fname)

x_trainlast, x_vallast, word_index_last = sequence_vectorize(
        training_data_lname, val_data_lname)
# Number of features will be the embedding input dimension. Add 1 for the
# reserved index 0.
num_features = min(len(word_index_first) + 1, len(word_index_last) + 1, TOP_K)

x_train = [x_trainfirst, x_trainlast]
x_val = [x_valfirst, x_vallast]

shape_param = tuple([i.shape[1] for i in x_train])


# Create model instance.
model = sepcnn_model(blocks=blocks,
                     filters=filters,
                     kernel_size=kernel_size,
                     embedding_dim=embedding_dim,
                     dropout_rate=dropout_rate,
                     pool_size=pool_size,
                     input_shape=shape_param,  # x_train.shape[1:]
                     num_classes=num_classes,
                     num_features=num_features)

# Compile model with learning parameters.
if num_classes == 2:
    loss = 'binary_crossentropy'
else:
    loss = 'sparse_categorical_crossentropy'
optimizer = tf.keras.optimizers.Adam(lr=learning_rate)
model.compile(optimizer=optimizer, loss=loss, metrics=['acc'])

# Create callback for early stopping on validation loss. If the loss does
# not decrease in two consecutive tries, stop training.
callbacks = [tf.keras.callbacks.EarlyStopping(
    monitor='val_loss', patience=2)]

# Train and validate model.
history = model.fit(
        x_train,
        training_labels,
        epochs=epochs,
        callbacks=callbacks,
        validation_data=(x_val, val_labels),
        verbose=2,  # Logs once per epoch.
        batch_size=batch_size)

# Print results.
history = history.history
print('Validation accuracy: {acc}, loss: {loss}'.format(
        acc=history['val_acc'][-1], loss=history['val_loss'][-1]))

# Save model.
model.save('sepcnn_model.h5')
#return history['val_acc'][-1], history['val_loss'][-1], model

tupleData = (training_data_fname, training_data_lname, training_labels), (val_data_fname, val_data_lname, val_labels)

shape_param = tuple([i.shape[1] for i in x_train])
shape_param
(17, 22)

print(x_trainfirst.shape)
(50000, 17)
print(x_trainlast.shape)
(50000, 22)

x_trainfirst[0]
array([ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0, 18,  2, 21,  5,  4], dtype=int32)

x_trainlast[0]
array([ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0, 10,
        2,  3,  7,  5,  4], dtype=int32)
Answer

You could try the following.

  1. Create a network whose input is a sequence and whose output is a vector of some dimension, let's say 32. These 32 dims represent the features of the sequence.
  2. Pass the other sequence through the same network.
  3. Now you have a 32-dim vector for each sequence. You can concatenate these features and use the resulting 64-dim vector as the input to another model (it can be a simple feed-forward network) whose output is your target (e.g. whether the two sequences are the same); see the sketch below.

This way you can capture the relationship between the two sequences.

Note: before feeding the sequences into the network, make sure both have the same dims.
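
Below is a minimal sketch of those three steps using the Keras functional API. It assumes both sequences are padded to one common length and indexed against a single shared vocabulary (the original code vectorizes first and last names separately, so you would either merge the word indices or give each input its own Embedding layer); all sizes and variable names here are illustrative, not from the original post.

import tensorflow as tf
from tensorflow.keras import layers, models

# Illustrative sizes; adjust to your data.
num_features = 20000   # shared vocabulary size (assumption)
seq_len = 22           # both inputs padded to this length (assumption)
encoder_dim = 32       # per-sequence feature size from step 1

# Step 1: a network that maps one sequence to a 32-dim feature vector.
encoder = models.Sequential([
    layers.Embedding(input_dim=num_features, output_dim=200),
    layers.SeparableConv1D(64, 3, activation='relu', padding='same'),
    layers.GlobalAveragePooling1D(),
    layers.Dense(encoder_dim, activation='relu'),
])

# Step 2: pass both sequences through the *same* network so the
# weights (and therefore the learned features) are shared.
input_first = layers.Input(shape=(seq_len,), name='first_name')
input_last = layers.Input(shape=(seq_len,), name='last_name')
feat_first = encoder(input_first)
feat_last = encoder(input_last)

# Step 3: concatenate the two 32-dim vectors and feed the resulting
# 64-dim vector to a simple feed-forward head that predicts the target.
merged = layers.concatenate([feat_first, feat_last])
hidden = layers.Dense(64, activation='relu')(merged)
output = layers.Dense(1, activation='sigmoid')(hidden)

model = models.Model(inputs=[input_first, input_last], outputs=output)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])

Because this model declares two Input layers, model.fit([x_trainfirst, x_trainlast], training_labels, ...) now matches the two arrays the model expects, which is exactly what the Sequential model in the question rejected.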

You can read more here; the same idea is used to find similarity between faces.
