How do I feed multiple sequences into a CNN in TensorFlow?
I'm new to TensorFlow and am working through an example similar to the text classification guide here.
I have a working model for a single sequence, but I'm trying to figure out how to feed two different sequences per observation into model training. I could concatenate the two sequences and pass them in as one input, but I want each sequence to stay distinct.
I looked at the documentation for the input shape parameter, which led me to try feeding the training data in as a tuple of each sequence's shape (see below), but that doesn't seem to do it:
x_train = [x_trainfirst, x_trainlast]
x_val = [x_valfirst, x_vallast]
shape_param = tuple([i.shape[1] for i in x_train])
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 2 arrays: [array([[ 0, 0, 0, ..., 21, 5, 4],
[ 0, 0, 0, ..., 10, 1, 11],
[ 0, 0, 0, ..., 26, 8, 7],
...,
[ 0, 0, 0, ..., 10, 2, 3],
[ 0, 0, 0, ..., 8, 7, ...
Any suggestions or pointers to examples/resources for feeding multiple sequences into a CNN would be much appreciated!
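For context, models.Sequential() defines a model with exactly one input tensor, which is why fit() rejects the two-array list with the "Expected to see 1 array(s)" error above. Taking two sequences requires two Input layers, i.e. the Keras functional API rather than Sequential. A minimal sketch under that assumption (the vocabulary size, embedding width, and branch layers are illustrative placeholders, not from the guide):

from tensorflow.keras import layers, models

first_in = layers.Input(shape=(17,), dtype='int32')  # columns of x_trainfirst
last_in = layers.Input(shape=(22,), dtype='int32')   # columns of x_trainlast

def branch(x):
    # Toy per-sequence branch: embed, then pool to a fixed-size vector.
    # The vocabulary size (1000) and embedding width (8) are placeholders.
    x = layers.Embedding(input_dim=1000, output_dim=8)(x)
    return layers.GlobalAveragePooling1D()(x)

merged = layers.concatenate([branch(first_in), branch(last_in)])
output = layers.Dense(1, activation='sigmoid')(merged)
model = models.Model(inputs=[first_in, last_in], outputs=output)
# model.fit([x_trainfirst, x_trainlast], train_labels, ...) now accepts the list.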
Edit: showing the suggested architecture and dimensions:
def sepcnn_model(blocks,
                 filters,
                 kernel_size,
                 embedding_dim,
                 dropout_rate,
                 pool_size,
                 input_shape,
                 num_classes,
                 num_features,
                 use_pretrained_embedding=False,
                 is_embedding_trainable=False,
                 embedding_matrix=None):
    """Creates an instance of a separable CNN model.

    # Arguments
        blocks: int, number of pairs of sepCNN and pooling blocks in the model.
        filters: int, output dimension of the layers.
        kernel_size: int, length of the convolution window.
        embedding_dim: int, dimension of the embedding vectors.
        dropout_rate: float, percentage of input to drop at Dropout layers.
        pool_size: int, factor by which to downscale input at MaxPooling layer.
        input_shape: tuple, shape of input to the model.
        num_classes: int, number of output classes.
        num_features: int, number of words (embedding input dimension).
        use_pretrained_embedding: bool, true if pre-trained embedding is on.
        is_embedding_trainable: bool, true if embedding layer is trainable.
        embedding_matrix: dict, dictionary with embedding coefficients.

    # Returns
        A sepCNN model instance.
    """
    op_units, op_activation = _get_last_layer_units_and_activation(num_classes)
    model = models.Sequential()

    # Add embedding layer. If a pre-trained embedding is used, add its weights
    # to the embedding layer and set trainable per the is_embedding_trainable flag.
    if use_pretrained_embedding:
        model.add(Embedding(input_dim=num_features,
                            output_dim=embedding_dim,
                            input_length=input_shape[0],
                            weights=[embedding_matrix],
                            trainable=is_embedding_trainable))
    else:
        model.add(Embedding(input_dim=num_features,
                            output_dim=embedding_dim,
                            input_length=input_shape[0]))

    # Add (blocks - 1) pairs of SeparableConv1D layers followed by max pooling.
    for _ in range(blocks - 1):
        model.add(Dropout(rate=dropout_rate))
        model.add(SeparableConv1D(filters=filters,
                                  kernel_size=kernel_size,
                                  activation='relu',
                                  bias_initializer='random_uniform',
                                  depthwise_initializer='random_uniform',
                                  padding='same'))
        model.add(SeparableConv1D(filters=filters,
                                  kernel_size=kernel_size,
                                  activation='relu',
                                  bias_initializer='random_uniform',
                                  depthwise_initializer='random_uniform',
                                  padding='same'))
        model.add(MaxPooling1D(pool_size=pool_size))

    # Final pair of wider conv layers, global pooling, and the output layer.
    model.add(SeparableConv1D(filters=filters * 2,
                              kernel_size=kernel_size,
                              activation='relu',
                              bias_initializer='random_uniform',
                              depthwise_initializer='random_uniform',
                              padding='same'))
    model.add(SeparableConv1D(filters=filters * 2,
                              kernel_size=kernel_size,
                              activation='relu',
                              bias_initializer='random_uniform',
                              depthwise_initializer='random_uniform',
                              padding='same'))
    model.add(GlobalAveragePooling1D())
    model.add(Dropout(rate=dropout_rate))
    model.add(Dense(op_units, activation=op_activation))
    return model
learning_rate=1e-3
epochs=1000
batch_size=128
blocks=2
filters=64
dropout_rate=0.2
embedding_dim=200
kernel_size=3
pool_size=3
# Trying to get multiple features per observation.
(training_data_fname, training_data_lname, train_labels), (val_data_fname, val_data_lname, val_labels) = tupleData  # data

# Verify that validation labels are in the same range as training labels.
num_classes = get_num_classes(train_labels)
unexpected_labels = [v for v in val_labels if v not in range(num_classes)]
# if len(unexpected_labels):
#     raise ValueError('Unexpected label values found in the validation set:'
#                      ' {unexpected_labels}. Please make sure that the '
#                      'labels in the validation set are in the same range '
#                      'as training labels.'.format(
#                          unexpected_labels=unexpected_labels))

# Vectorize texts.
x_trainfirst, x_valfirst, word_index_first = sequence_vectorize(
    training_data_fname, val_data_fname)
x_trainlast, x_vallast, word_index_last = sequence_vectorize(
    training_data_lname, val_data_lname)

# Number of features will be the embedding input dimension. Add 1 for the
# reserved index 0.
num_features = min(len(word_index_first) + 1, len(word_index_last) + 1, TOP_K)

x_train = [x_trainfirst, x_trainlast]
x_val = [x_valfirst, x_vallast]
# Note: this packs both sequence lengths into one tuple, e.g. (17, 22), but
# sepcnn_model only reads input_shape[0], and the Sequential model still
# expects a single input array.
shape_param = tuple([i.shape[1] for i in x_train])

# Create model instance.
model = sepcnn_model(blocks=blocks,
                     filters=filters,
                     kernel_size=kernel_size,
                     embedding_dim=embedding_dim,
                     dropout_rate=dropout_rate,
                     pool_size=pool_size,
                     input_shape=shape_param,  # x_train.shape[1:]
                     num_classes=num_classes,
                     num_features=num_features)

# Compile model with learning parameters.
if num_classes == 2:
    loss = 'binary_crossentropy'
else:
    loss = 'sparse_categorical_crossentropy'
optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
model.compile(optimizer=optimizer, loss=loss, metrics=['acc'])

# Create callback for early stopping on validation loss. If the loss does
# not decrease in two consecutive tries, stop training.
callbacks = [tf.keras.callbacks.EarlyStopping(
    monitor='val_loss', patience=2)]

# Train and validate model.
history = model.fit(
    x_train,
    train_labels,
    epochs=epochs,
    callbacks=callbacks,
    validation_data=(x_val, val_labels),
    verbose=2,  # Logs once per epoch.
    batch_size=batch_size)

# Print results.
history = history.history
print('Validation accuracy: {acc}, loss: {loss}'.format(
    acc=history['val_acc'][-1], loss=history['val_loss'][-1]))

# Save model.
model.save('sepcnn_model.h5')
# return history['val_acc'][-1], history['val_loss'][-1], model

tupleData = (training_data_fname, training_data_lname, train_labels), (val_data_fname, val_data_lname, val_labels)
shape_param = tuple([i.shape[1] for i in x_train])
shape_param
(17, 22)
print(x_trainfirst.shape)
(50000, 17)
print(x_trainlast.shape)
(50000, 22)
x_trainfirst[0]
array([ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 18, 2, 21, 5, 4], dtype=int32)
x_trainlast[0]
array([ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10,
2, 3, 7, 5, 4], dtype=int32)
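One detail worth noting from the shapes above: the two vectorized sequences are padded to different lengths (17 vs. 22). If you follow the shared-network suggestion in the answer below, both would first need a common length; a hypothetical sketch using pad_sequences:

from tensorflow.keras.preprocessing.sequence import pad_sequences

# Pre-pad both arrays with zeros to the longer of the two lengths (22 here).
maxlen = max(x_trainfirst.shape[1], x_trainlast.shape[1])
x_trainfirst = pad_sequences(x_trainfirst, maxlen=maxlen)
x_trainlast = pad_sequences(x_trainlast, maxlen=maxlen)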
Answer
You could try the following (see the sketch after this answer):
- Create a network whose input is a sequence and whose output is some fixed number of dims, let's say 32. These 32 dims represent the features of the sequence.
- Pass the other sequence through the same network.
- Now you have a 32-dim vector for each sequence. You can concatenate these features and use the resulting 64-dim vector as input to another model (it can be a simple feed-forward network) whose output is the target (e.g., whether the two sequences match).
This way you can capture the relationship between the two sequences.
Note: before feeding the sequences into the network, make sure both have the same dims.
You can read more here; the same approach is used to find similarity between faces.
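A minimal sketch of that suggestion using the Keras functional API, assuming both sequences are already padded to the same length. The encoder layers and the seq_len/vocab_size values are illustrative placeholders; the 32- and 64-dim sizes mirror the numbers in the answer:

import tensorflow as tf
from tensorflow.keras import layers, models

seq_len = 22       # pad both sequences to the same length first
vocab_size = 1000  # embedding input dimension (num_features); placeholder value

# Shared encoder: maps one integer sequence to a 32-dim feature vector.
encoder = models.Sequential([
    layers.Embedding(input_dim=vocab_size, output_dim=64, input_length=seq_len),
    layers.SeparableConv1D(64, 3, activation='relu', padding='same'),
    layers.GlobalAveragePooling1D(),
    layers.Dense(32, activation='relu'),
])

first_in = layers.Input(shape=(seq_len,), dtype='int32')
last_in = layers.Input(shape=(seq_len,), dtype='int32')

# Passing both inputs through the SAME encoder shares its weights.
merged = layers.concatenate([encoder(first_in), encoder(last_in)])  # 64 dims

# Simple feed-forward head on top of the concatenated features.
x = layers.Dense(32, activation='relu')(merged)
output = layers.Dense(1, activation='sigmoid')(x)

siamese = models.Model(inputs=[first_in, last_in], outputs=output)
siamese.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])
# siamese.fit([x_trainfirst, x_trainlast], train_labels, ...)

Sharing one encoder keeps the two sequences distinct at the input while forcing both into the same feature space, which is what lets the head learn a relationship between them.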