CS231n Backpropagation Notes (Part 5)

Unintuitive effects and their consequences. Notice that if one of the inputs to the multiply gate is very small and the other is very big, then the multiply gate will do something slightly unintuitive: it will assign a relatively huge gradient to the small input and a tiny gradient to the large input. Note that in linear classifiers where the weights are dot producted w^T x_i (multiplied) with the inputs, this implies that the scale of the data has an effect on the magnitude of the gradient for the weights. For example, if you multiplied all input data examples x_i by 1000 during preprocessing, then the gradient on the weights will be 1000 times larger, and you'd have to lower the learning rate by that factor to compensate. This is why preprocessing matters a lot, sometimes in subtle ways! And having intuitive understanding for how the gradients flow can help you debug some of these cases.
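The scaling effect described above can be sketched in a few lines of NumPy (the data here is made up for illustration, not from the original notes):

```python
import numpy as np

# For a linear score s = w . x, the local gradient ds/dw is just x.
# So scaling the inputs directly scales the gradient on the weights.
np.random.seed(0)
x = np.random.randn(3)        # one input example
w = np.random.randn(3)        # weights

grad_w = x                    # gradient on w for the unscaled data

# Scale all inputs by 1000, as in the preprocessing example above.
x_scaled = 1000 * x
grad_w_scaled = x_scaled      # gradient on w is now 1000x larger

# The learning rate would need to shrink by the same factor to compensate.
print(grad_w_scaled / grad_w)
```

The ratio printed is 1000 in every component, which is exactly why the learning rate has to drop by that factor after such preprocessing.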

 

On vectorized gradients: Erik Learned-Miller has also written up a longer related document on taking matrix/vector derivatives which you might find helpful. Find it here.
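As a minimal sketch of the vectorized case those matrix-derivative references cover, here is backpropagation through a matrix multiply `D = W.dot(X)` in NumPy (shapes are illustrative, not from the original notes; the shape-matching trick is the standard way to recover these formulas without memorizing them):

```python
import numpy as np

np.random.seed(1)
W = np.random.randn(5, 10)       # weights
X = np.random.randn(10, 3)       # minibatch of 3 inputs
D = W.dot(X)                     # forward pass, shape (5, 3)

dD = np.random.randn(*D.shape)   # upstream gradient flowing into D

# The only expressions whose shapes match W and X are the correct
# vectorized gradients:
dW = dD.dot(X.T)                 # shape (5, 10), same as W
dX = W.T.dot(dD)                 # shape (10, 3), same as X
```

A quick sanity check is that `dW` must always have the same shape as `W` and `dX` the same shape as `X`; any other arrangement of the transposes fails to even broadcast.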
