Multi-layer neural network in TensorFlow

I am trying to train a 4-layer neural network in TensorFlow to recognize letters of the alphabet, but my accuracy is only about 10%, whereas a 3-layer network on the same dataset reaches about 90%. The loss also becomes nan on some iterations, and I cannot find the problem. Below is the code that builds the computation graph.

batch_size = 128

beta = 0.01
inputs = image_size * image_size
hidden_neurons = [inputs, 1024, 512, 256]  # intended layer sizes (this list is not used below)
graph = tf.Graph()
with graph.as_default():
  global_step = tf.Variable(0, trainable=False)  # step counter for the learning-rate decay
  # Input data. For the training data, we use a placeholder that will be fed
  # at run time with a training minibatch.
  tf_train_dataset = tf.placeholder(tf.float32,
                                    shape=(batch_size, image_size * image_size))
  tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
  tf_valid_dataset = tf.constant(valid_dataset)
  tf_test_dataset = tf.constant(test_dataset)

  # Hidden layer sizes, as plain Python ints rather than tf.constant,
  # so the variable shapes are fully defined at graph-construction time.
  hidden_neurons_1 = 1024
  hidden_neurons_2 = 512

  weights_1 = tf.Variable(
      tf.truncated_normal([image_size * image_size, hidden_neurons_1]))
  biases_1 = tf.Variable(tf.zeros([hidden_neurons_1]))
  weights_2 = tf.Variable(
      tf.truncated_normal([hidden_neurons_1, hidden_neurons_2]))
  biases_2 = tf.Variable(tf.zeros([hidden_neurons_2]))
  weights_3 = tf.Variable(
      tf.truncated_normal([hidden_neurons_2, num_labels]))
  biases_3 = tf.Variable(tf.zeros([num_labels]))


  # Training computation: two ReLU hidden layers followed by a linear output layer.
  relu_activations_1 = tf.nn.relu(tf.matmul(tf_train_dataset, weights_1) + biases_1)
  relu_activations_2 = tf.nn.relu(tf.matmul(relu_activations_1, weights_2) + biases_2)
  logits = tf.matmul(relu_activations_2, weights_3) + biases_3
  loss = tf.reduce_mean(
      tf.nn.softmax_cross_entropy_with_logits(labels=tf_train_labels, logits=logits))

  # Add L2 regularization; loss is already a scalar, so no further reduce_mean is needed.
  regularization_term = (tf.nn.l2_loss(weights_1) + tf.nn.l2_loss(weights_2)
                         + tf.nn.l2_loss(weights_3))
  loss = loss + beta * regularization_term

  # Optimizer with an exponentially decaying learning rate.
  learning_rate = tf.train.exponential_decay(0.3, global_step, 100000, 0.7, staircase=True)
  optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)

  # Predictions for the training, validation, and test data. The validation and
  # test inputs must pass through all three layers so the output has num_labels columns.
  train_prediction = tf.nn.softmax(logits)
  valid_hidden_1 = tf.nn.relu(tf.matmul(tf_valid_dataset, weights_1) + biases_1)
  valid_hidden_2 = tf.nn.relu(tf.matmul(valid_hidden_1, weights_2) + biases_2)
  valid_prediction = tf.nn.softmax(tf.matmul(valid_hidden_2, weights_3) + biases_3)
  test_hidden_1 = tf.nn.relu(tf.matmul(tf_test_dataset, weights_1) + biases_1)
  test_hidden_2 = tf.nn.relu(tf.matmul(test_hidden_1, weights_2) + biases_2)
  test_prediction = tf.nn.softmax(tf.matmul(test_hidden_2, weights_3) + biases_3)

Any help would be greatly appreciated.

Answer

I assume you are familiar with the vanishing and exploding gradient problems.

In most cases you get nan when the gradients either become too small or blow up. Clip the gradients: in TensorFlow that means clipping the gradient tensors in the graph with tf.clip_by_value or tf.clip_by_global_norm (np.clip works on NumPy arrays, not on graph tensors). Once the gradients are clipped you should not get any nan, and we can move on to your low-accuracy problem.
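A minimal sketch of what that clipping looks like in TensorFlow 1.x graph code, assuming the loss, learning_rate, and global_step defined in the question (the clip norm of 5.0 is just an illustrative value):

# minimize() is compute_gradients() followed by apply_gradients(),
# so the gradients can be clipped in between the two steps.
opt = tf.train.GradientDescentOptimizer(learning_rate)
grads_and_vars = opt.compute_gradients(loss)
grads, variables = zip(*grads_and_vars)
# Rescale all gradients together whenever their global norm exceeds 5.0.
clipped_grads, _ = tf.clip_by_global_norm(grads, clip_norm=5.0)
optimizer = opt.apply_gradients(zip(clipped_grads, variables),
                                global_step=global_step)

Smaller initial weights also help keep the gradients in range: tf.truncated_normal defaults to stddev=1.0, which is very large for layers this wide, and something like stddev=sqrt(2.0/fan_in) is a common choice for ReLU layers.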
