Tensorflow keras fit - 准确性和损失都急剧增加
Posted
技术标签:
【中文标题】Tensorflow keras fit - 准确性和损失都急剧增加【英文标题】:Tensorflow keras fit - accuracy and loss both increasing drastically 【发布时间】:2020-09-11 00:13:56 【问题描述】:ubuntu - 20.04
张量流 2.2
使用的数据集 = MNIST
我正在测试 tensorflow,我注意到验证 sparse_categorical_accuracy
(准确性)和验证 SparseCategoricalCrossentropy
(损失)都在增加,这对我来说没有意义。我认为随着训练的进行,验证损失应该会下降并且验证准确度会增加。或者,在过度拟合的情况下,验证损失增加并且验证准确度下降。但是,随着训练的进行,验证损失和验证准确率都在增加。然而,训练计划正在按照预期进行,即训练损失下降,训练准确率上升
这是代码和输出:
#testing without preprocess monsoon
import tensorflow as tf
from tensorflow import keras as k
from tensorflow.keras import layers as l
import tensorflow_addons as tfa
mnist = tf.keras.datasets.mnist
(x_t,y_t),(x_te,y_te) = mnist.load_data()
x_t = x_t.reshape(60000,-1)
x_te = x_te.reshape(10000,-1)
d_x_t = tf.data.Dataset.from_tensor_slices(x_t)
d_y_t = tf.data.Dataset.from_tensor_slices(y_t)
dataset = tf.data.Dataset.zip((d_x_t,d_y_t)).shuffle(1000).batch(32)
d_x_te = tf.data.Dataset.from_tensor_slices(x_te)
d_y_te = tf.data.Dataset.from_tensor_slices(y_te)
dataset_test = tf.data.Dataset.zip((d_x_te,d_y_te)).shuffle(1000,seed=42).batch(32)
inp = k.Input((784,))
x = l.BatchNormalization()(inp)
x1 = l.Dense(1024,activation='relu',name='dense_1')(x)
x1=l.Dropout(0.5)(x1)
x1 = l.BatchNormalization()(x1)
x2 = l.Dense(512,activation='relu',name='dense_2')(x1)
x3 = l.Dense(512,activation='relu',name='dense_3')(x)
x = x3+x2
x=l.Dropout(0.5)(x)
x = l.BatchNormalization()(x)
x = l.Dense(10,activation='relu',name='dense_4')(x)
predictions = l.Dense(10,activation=None,name='preds')(x)
model = k.Model(inputs=inp,outputs=predictions)
opt=tfa.optimizers.MovingAverage(
k.optimizers.Adam(),
True,
0.99,
None,
'MovingAverage',
clipnorm=5
)
model.compile(optimizer=opt,
loss=k.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=['sparse_categorical_accuracy'])
print('# Fit model on training data')
history = model.fit(dataset,
epochs=30,
steps_per_epoch=1875,
validation_data = dataset_test,
validation_steps = 313)
print('\nhistory dict:', history.history)
model.evaluate(dataset_test,batch_size=32,steps=331)
我得到的学习进化是:
# Fit model on training data
Epoch 1/30
WARNING:tensorflow:From /home/nitin/anaconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py:1817: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
1875/1875 [==============================] - 49s 26ms/step - loss: 0.3614 - sparse_categorical_accuracy: 0.8913 - val_loss: 0.3355 - val_sparse_categorical_accuracy: 0.9548
Epoch 2/30
1875/1875 [==============================] - 49s 26ms/step - loss: 0.1899 - sparse_categorical_accuracy: 0.9427 - val_loss: 1.2028 - val_sparse_categorical_accuracy: 0.9641
Epoch 3/30
1875/1875 [==============================] - 51s 27ms/step - loss: 0.1546 - sparse_categorical_accuracy: 0.9521 - val_loss: 1.6385 - val_sparse_categorical_accuracy: 0.9673
Epoch 4/30
1875/1875 [==============================] - 38s 20ms/step - loss: 0.1357 - sparse_categorical_accuracy: 0.9585 - val_loss: 2.8285 - val_sparse_categorical_accuracy: 0.9697
Epoch 5/30
1875/1875 [==============================] - 38s 20ms/step - loss: 0.1253 - sparse_categorical_accuracy: 0.9608 - val_loss: 3.8489 - val_sparse_categorical_accuracy: 0.9697
Epoch 6/30
1875/1875 [==============================] - 29s 16ms/step - loss: 0.1149 - sparse_categorical_accuracy: 0.9646 - val_loss: 2.1872 - val_sparse_categorical_accuracy: 0.9699
Epoch 7/30
1875/1875 [==============================] - 29s 16ms/step - loss: 0.1094 - sparse_categorical_accuracy: 0.9646 - val_loss: 2.9429 - val_sparse_categorical_accuracy: 0.9695
Epoch 8/30
1875/1875 [==============================] - 29s 16ms/step - loss: 0.1066 - sparse_categorical_accuracy: 0.9667 - val_loss: 5.6166 - val_sparse_categorical_accuracy: 0.9710
Epoch 9/30
1875/1875 [==============================] - 30s 16ms/step - loss: 0.0991 - sparse_categorical_accuracy: 0.9688 - val_loss: 3.9547 - val_sparse_categorical_accuracy: 0.9710
Epoch 10/30
1875/1875 [==============================] - 29s 16ms/step - loss: 0.0948 - sparse_categorical_accuracy: 0.9701 - val_loss: 4.8149 - val_sparse_categorical_accuracy: 0.9713
Epoch 11/30
1875/1875 [==============================] - 29s 16ms/step - loss: 0.0850 - sparse_categorical_accuracy: 0.9727 - val_loss: 7.4974 - val_sparse_categorical_accuracy: 0.9712
Epoch 12/30
1875/1875 [==============================] - 29s 16ms/step - loss: 0.0879 - sparse_categorical_accuracy: 0.9719 - val_loss: 4.3669 - val_sparse_categorical_accuracy: 0.9714
Epoch 13/30
1875/1875 [==============================] - 30s 16ms/step - loss: 0.0817 - sparse_categorical_accuracy: 0.9743 - val_loss: 9.2499 - val_sparse_categorical_accuracy: 0.9725
Epoch 14/30
1875/1875 [==============================] - 30s 16ms/step - loss: 0.0805 - sparse_categorical_accuracy: 0.9737 - val_loss: 7.5436 - val_sparse_categorical_accuracy: 0.9716
Epoch 15/30
1875/1875 [==============================] - 30s 16ms/step - loss: 0.0798 - sparse_categorical_accuracy: 0.9751 - val_loss: 14.2331 - val_sparse_categorical_accuracy: 0.9712
Epoch 16/30
1875/1875 [==============================] - 29s 16ms/step - loss: 0.0745 - sparse_categorical_accuracy: 0.9757 - val_loss: 7.9517 - val_sparse_categorical_accuracy: 0.9715
Epoch 17/30
1875/1875 [==============================] - 30s 16ms/step - loss: 0.0745 - sparse_categorical_accuracy: 0.9761 - val_loss: 7.9719 - val_sparse_categorical_accuracy: 0.9702
Epoch 18/30
1875/1875 [==============================] - 30s 16ms/step - loss: 0.0741 - sparse_categorical_accuracy: 0.9763 - val_loss: 13.8696 - val_sparse_categorical_accuracy: 0.9665
Epoch 19/30
1875/1875 [==============================] - 30s 16ms/step - loss: 0.0728 - sparse_categorical_accuracy: 0.9760 - val_loss: 20.2949 - val_sparse_categorical_accuracy: 0.9688
Epoch 20/30
1875/1875 [==============================] - 45s 24ms/step - loss: 0.0699 - sparse_categorical_accuracy: 0.9775 - val_loss: 8.8696 - val_sparse_categorical_accuracy: 0.9713
Epoch 21/30
1875/1875 [==============================] - 29s 16ms/step - loss: 0.0699 - sparse_categorical_accuracy: 0.9777 - val_loss: 12.9682 - val_sparse_categorical_accuracy: 0.9723
Epoch 22/30
1875/1875 [==============================] - 30s 16ms/step - loss: 0.0674 - sparse_categorical_accuracy: 0.9781 - val_loss: 61.1677 - val_sparse_categorical_accuracy: 0.9692
Epoch 23/30
1875/1875 [==============================] - 30s 16ms/step - loss: 0.0651 - sparse_categorical_accuracy: 0.9798 - val_loss: 21.3270 - val_sparse_categorical_accuracy: 0.9697
Epoch 24/30
1875/1875 [==============================] - 31s 16ms/step - loss: 0.0624 - sparse_categorical_accuracy: 0.9800 - val_loss: 62.2778 - val_sparse_categorical_accuracy: 0.9685
Epoch 25/30
1875/1875 [==============================] - 30s 16ms/step - loss: 0.0665 - sparse_categorical_accuracy: 0.9792 - val_loss: 24.9327 - val_sparse_categorical_accuracy: 0.9687
Epoch 26/30
1875/1875 [==============================] - 46s 24ms/step - loss: 0.0605 - sparse_categorical_accuracy: 0.9805 - val_loss: 42.0141 - val_sparse_categorical_accuracy: 0.9700
Epoch 27/30
1875/1875 [==============================] - 29s 16ms/step - loss: 0.0601 - sparse_categorical_accuracy: 0.9806 - val_loss: 54.8586 - val_sparse_categorical_accuracy: 0.9695
Epoch 28/30
1875/1875 [==============================] - 30s 16ms/step - loss: 0.0583 - sparse_categorical_accuracy: 0.9811 - val_loss: 25.3613 - val_sparse_categorical_accuracy: 0.9680
Epoch 29/30
1875/1875 [==============================] - 29s 16ms/step - loss: 0.0576 - sparse_categorical_accuracy: 0.9811 - val_loss: 23.2299 - val_sparse_categorical_accuracy: 0.9710
Epoch 30/30
1875/1875 [==============================] - 30s 16ms/step - loss: 0.0566 - sparse_categorical_accuracy: 0.9817 - val_loss: 16.5671 - val_sparse_categorical_accuracy: 0.9728
history dict: 'loss': [0.36135926842689514, 0.1898646354675293, 0.15456895530223846, 0.13569727540016174, 0.12525275349617004, 0.1148592159152031, 0.10943067818880081, 0.1066298857331276, 0.09912335127592087, 0.09476170688867569, 0.08501157909631729, 0.0879492461681366, 0.08170024305582047, 0.08047273010015488, 0.07976552098989487, 0.07453753799200058, 0.07450901716947556, 0.07413797080516815, 0.07278618961572647, 0.0698995441198349, 0.06988336145877838, 0.06740442663431168, 0.06507138162851334, 0.06242847815155983, 0.0665266141295433, 0.06050613150000572, 0.06005210056900978, 0.05830719694495201, 0.05763527378439903, 0.05664650723338127], 'sparse_categorical_accuracy': [0.8913000226020813, 0.9427499771118164, 0.9521499872207642, 0.9585333466529846, 0.9607999920845032, 0.9645500183105469, 0.9645666480064392, 0.9666833281517029, 0.9687666893005371, 0.9701166749000549, 0.9726999998092651, 0.9719499945640564, 0.9742666482925415, 0.9736999869346619, 0.9750999808311462, 0.9757000207901001, 0.9760833382606506, 0.9763166904449463, 0.9759833216667175, 0.977483332157135, 0.9777166843414307, 0.9780833125114441, 0.9798333048820496, 0.9800000190734863, 0.9792333245277405, 0.9805499911308289, 0.9805999994277954, 0.9810666441917419, 0.9810666441917419, 0.9816833138465881], 'val_loss': [0.33551061153411865, 1.2028071880340576, 1.6384732723236084, 2.828489065170288, 3.8488738536834717, 2.187160015106201, 2.9428975582122803, 5.6166462898254395, 3.954725503921509, 4.814915657043457, 7.4974141120910645, 4.366909503936768, 9.24986457824707, 7.543578147888184, 14.233136177062988, 7.951717853546143, 7.971870422363281, 13.869564056396484, 20.29490089416504, 8.869643211364746, 12.968180656433105, 61.167701721191406, 21.327049255371094, 62.27778625488281, 24.932708740234375, 42.01411437988281, 54.85857009887695, 25.361297607421875, 23.229896545410156, 16.56712532043457], 'val_sparse_categorical_accuracy': [0.954800009727478, 0.9641000032424927, 0.9672999978065491, 0.9696999788284302, 0.9696999788284302, 0.9699000120162964, 0.9695000052452087, 0.9710000157356262, 0.9710000157356262, 0.9713000059127808, 0.9711999893188477, 0.9714000225067139, 0.9725000262260437, 0.9715999960899353, 0.9711999893188477, 0.9714999794960022, 0.9702000021934509, 0.9664999842643738, 0.9688000082969666, 0.9713000059127808, 0.9722999930381775, 0.9692000150680542, 0.9696999788284302, 0.968500018119812, 0.9686999917030334, 0.9700000286102295, 0.9695000052452087, 0.9679999947547913, 0.9710000157356262, 0.9728000164031982]
302/331 [==========================>...] - ETA: 0s - loss: 17.1192 - sparse_categorical_accuracy: 0.9725WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 331 batches). You may need to use the repeat() function when building your dataset.
313/331 [===========================>..] - 1s 3ms/step - loss: 16.5671 - sparse_categorical_accuracy: 0.9728
[16.567113876342773, 0.9728000164031982]
【问题讨论】:
为什么最后一个密集层没有激活函数。 softmax 不是最好的选择吗? 损失计算不需要softmax应用。只是逻辑。这是 softmax 是最佳选择的主要原因之一。k.losses.SparseCategoricalCrossentropy(from_logits=True)
【参考方案1】:
如果训练损失在减少而验证损失在增加,则很可能您已经过度拟合了模型。
我也对这条线有些疑惑:x = x3+x2
据我了解,您想创建短连接。但在 keras 中,您应该使用 Add
层来执行此操作。
【讨论】:
我明白过度拟合的重点。让我失望的是验证准确性的提高。我什至想不出开始调试的理由。我将尝试通过将x=x2+x3
更改为图层来尝试,但它可能会导致什么不同?
将x=x2+x3
更改为x=tf.keras.layers.Add()([x2,x3])
并没有缓解问题。验证准确率仍在上升,验证损失也在上升
当我在更改模型架构的情况下运行代码时,测试结果为 [0.11449366807937622, 0.9829999804496765]。我所做的是删除第一个 batchnorm 层并删除层级联。在整个训练过程中发现验证损失在 0.1 范围内。以上是关于Tensorflow keras fit - 准确性和损失都急剧增加的主要内容,如果未能解决你的问题,请参考以下文章
为啥 fit_generator 的准确性与 Keras 中的 evaluate_generator 的准确性不同?
为啥在 Keras 中 fit_generator 的准确度、evaluate_generator 的准确度和自定义准确度不同?
使用 Keras 构建了一个模型,该模型报告了良好的准确性,但随后无法进行预测
Keras model.fit log 和 Sklearn.metrics.confusion_matrix 报告的验证准确度指标不匹配