Why does my learning rate decrease, even when loss is improving?
[Posted]: 2020-10-08 09:31:59 [Question]: I am training my Keras model on a Google Colab TPU, as shown below:
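# Assumes tf.keras; model, PSNRLoss, SSIMLoss, traindb and validdb are defined elsewhere.
from tensorflow.keras.callbacks import CSVLogger, ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras.optimizers import Adam

adam = Adam(learning_rate=0.002)
model.compile(loss='mse', metrics=[PSNRLoss, SSIMLoss], optimizer=adam)
# Save a checkpoint whenever the training loss reaches a new minimum.
checkpoint = ModelCheckpoint("model_{epoch:02d}.hdf5", monitor='loss', verbose=1,
                             save_best_only=True, mode='min')
# Halve the learning rate after 5 epochs without improvement in the monitored loss.
reduce_lr = ReduceLROnPlateau(monitor='loss', factor=0.5,
                              patience=5, min_lr=0.00002)
csv_logger = CSVLogger('history.log')
callbacks_list = [checkpoint, reduce_lr, csv_logger]
model.fit(traindb, batch_size=1024, callbacks=callbacks_list, shuffle=True,
          epochs=100, verbose=2, validation_data=validdb)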
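During training, my learning rate is reduced by a factor of 0.5 even though the loss is still improving at the current learning rate. As you can see in the snippet below, the learning rate drops from 0.0020 to 0.0010 and then to 0.0005.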
Epoch 00011: loss improved from 0.00647 to 0.00646, saving model to ./models_x4/no_noise/dcscn_x2_11.hdf5
1939/1939 - 109s - PSNRLoss: 23.7280 - loss: 0.0065 - SSIMLoss: 0.3329 - val_PSNRLoss: 23.9022 - val_loss: 0.0066 - val_SSIMLoss: 0.3815 - lr: 0.0020
Epoch 12/100
Epoch 00012: loss improved from 0.00646 to 0.00645, saving model to ./models_x4/no_noise/dcscn_x2_12.hdf5
1939/1939 - 111s - PSNRLoss: 23.7245 - loss: 0.0065 - SSIMLoss: 0.3331 - val_PSNRLoss: 23.9397 - val_loss: 0.0066 - val_SSIMLoss: 0.3705 - lr: 0.0020
Epoch 13/100
Epoch 00013: loss improved from 0.00645 to 0.00644, saving model to ./models_x4/no_noise/dcscn_x2_13.hdf5
1939/1939 - 110s - PSNRLoss: 23.7300 - loss: 0.0064 - SSIMLoss: 0.3321 - val_PSNRLoss: 23.9827 - val_loss: 0.0065 - val_SSIMLoss: 0.3745 - lr: 0.0020
Epoch 14/100
Epoch 00014: loss improved from 0.00644 to 0.00643, saving model to ./models_x4/no_noise/dcscn_x2_14.hdf5
1939/1939 - 111s - PSNRLoss: 23.7279 - loss: 0.0064 - SSIMLoss: 0.3376 - val_PSNRLoss: 23.9079 - val_loss: 0.0066 - val_SSIMLoss: 0.3959 - lr: 0.0020
Epoch 15/100
Epoch 00015: loss improved from 0.00643 to 0.00634, saving model to ./models_x4/no_noise/dcscn_x2_15.hdf5
1939/1939 - 110s - PSNRLoss: 23.8356 - loss: 0.0063 - SSIMLoss: 0.3408 - val_PSNRLoss: 23.7063 - val_loss: 0.0067 - val_SSIMLoss: 0.3799 - lr: 0.0010
Epoch 16/100
Epoch 00016: loss did not improve from 0.00634
1939/1939 - 107s - PSNRLoss: 23.8173 - loss: 0.0063 - SSIMLoss: 0.3398 - val_PSNRLoss: 23.7282 - val_loss: 0.0067 - val_SSIMLoss: 0.3853 - lr: 0.0010
Epoch 17/100
Epoch 00017: loss did not improve from 0.00634
1939/1939 - 110s - PSNRLoss: 23.8199 - loss: 0.0063 - SSIMLoss: 0.3426 - val_PSNRLoss: 23.7202 - val_loss: 0.0067 - val_SSIMLoss: 0.4082 - lr: 0.0010
Epoch 18/100
Epoch 00018: loss did not improve from 0.00634
1939/1939 - 110s - PSNRLoss: 23.8138 - loss: 0.0063 - SSIMLoss: 0.3393 - val_PSNRLoss: 23.7523 - val_loss: 0.0066 - val_SSIMLoss: 0.4037 - lr: 0.0010
Epoch 19/100
Epoch 00019: loss improved from 0.00634 to 0.00634, saving model to ./models_x4/no_noise/dcscn_x2_19.hdf5
1939/1939 - 110s - PSNRLoss: 23.8189 - loss: 0.0063 - SSIMLoss: 0.3406 - val_PSNRLoss: 23.7188 - val_loss: 0.0067 - val_SSIMLoss: 0.4115 - lr: 0.0010
Epoch 20/100
Epoch 00020: loss improved from 0.00634 to 0.00634, saving model to ./models_x4/no_noise/dcscn_x2_20.hdf5
1939/1939 - 108s - PSNRLoss: 23.8176 - loss: 0.0063 - SSIMLoss: 0.3407 - val_PSNRLoss: 23.7692 - val_loss: 0.0066 - val_SSIMLoss: 0.3883 - lr: 0.0010
Epoch 21/100
Epoch 00021: loss improved from 0.00634 to 0.00627, saving model to ./models_x4/no_noise/dcscn_x2_21.hdf5
1939/1939 - 108s - PSNRLoss: 23.8889 - loss: 0.0063 - SSIMLoss: 0.3478 - val_PSNRLoss: 24.0306 - val_loss: 0.0064 - val_SSIMLoss: 0.3544 - lr: 5.0000e-04
Epoch 22/100
Epoch 00022: loss improved from 0.00627 to 0.00627, saving model to ./models_x4/no_noise/dcscn_x2_22.hdf5
1939/1939 - 109s - PSNRLoss: 23.8847 - loss: 0.0063 - SSIMLoss: 0.3466 - val_PSNRLoss: 24.0461 - val_loss: 0.0064 - val_SSIMLoss: 0.3679 - lr: 5.0000e-04
Thanks in advance :) Please suggest where I am going wrong. Should I monitor some other, more suitable value?
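[Answer 1]: The ReduceLROnPlateau callback has a parameter named min_delta, which is the threshold for deciding whether a new value counts as a new optimum. The default value of min_delta is 0.0001. So although your log output says the loss improved, any improvement smaller than min_delta is ignored by ReduceLROnPlateau. For example, the step from 0.00646 to 0.00645 in your log is an improvement of only 0.00001, ten times smaller than the default threshold. The "loss improved" messages come from ModelCheckpoint, which treats any decrease as an improvement. As a result, once patience epochs pass without an improvement larger than min_delta, the learning rate is reduced.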
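To make the threshold behaviour concrete, here is a minimal pure-Python sketch of the plateau rule described above. It is a simplified model for illustration, not the Keras source: the function name `simulate_reduce_lr_on_plateau` is hypothetical, the cooldown option is left out, and the loss values are made up to mimic the 0.00001-per-epoch improvements seen in the log.

```python
def simulate_reduce_lr_on_plateau(losses, lr, factor=0.5, patience=5,
                                  min_delta=1e-4, min_lr=0.0):
    """Return the learning rate used in each epoch (simplified, no cooldown)."""
    best = float("inf")
    wait = 0
    lr_per_epoch = []
    for loss in losses:
        lr_per_epoch.append(lr)
        # Only a drop larger than min_delta counts as an improvement.
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # Plateau detected: scale the rate, but never go below min_lr.
                lr = max(lr * factor, min_lr)
                wait = 0
    return lr_per_epoch


# Hypothetical losses that improve by 0.00001 every epoch.
losses = [0.00650, 0.00649, 0.00648, 0.00647,
          0.00646, 0.00645, 0.00644, 0.00643]

# Default threshold: the tiny improvements are ignored, so the rate is halved.
default_schedule = simulate_reduce_lr_on_plateau(losses, lr=0.002)

# Smaller threshold: every epoch counts as an improvement, so the rate stays put.
small_delta_schedule = simulate_reduce_lr_on_plateau(losses, lr=0.002,
                                                     min_delta=1e-6)

print(default_schedule)
print(small_delta_schedule)
```

In the question's setup, the equivalent change would be passing a smaller `min_delta` to `ReduceLROnPlateau` (for instance `min_delta=1e-6`; that particular value is an illustrative assumption and should be chosen relative to the scale of the loss).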