Understanding backpropagation through time

Posted 2023-02-16


[Original title]: Back propagation through time, understanding [Posted]: 2017-01-05 23:27:22 [Question]:

Back_Propagation_Through_Time(a, y)   // a[t] is the input at time t. y[t] is the output
Unfold the network to contain k instances of f
do until stopping criteria is met:
    x = the zero-magnitude vector;// x is the current context
    for t from 0 to n - 1         // t is time. n is the length of the training sequence
        Set the network inputs to x, a[t], a[t+1], ..., a[t+k-1]
        p = forward-propagate the inputs over the whole unfolded network
        e = y[t+k] - p;           // error = target - prediction
        Back-propagate the error, e, back across the whole unfolded network
        Update all the weights in the network
        Average the weights in each instance of f together, so that each f is identical
        x = f(x);                 // compute the context for the next time-step

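Hey,

I don't understand the idea behind the algorithm above. Are we creating k instances (k copies) of the neural network f, and then passing in a[t] and x as inputs? And what is x = f(x)?

Thanks for your help.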


[Answer 1]:

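Are we creating k instances of the neural network?

Kind of. In a recurrent neural network, the output for some input x_i depends on every input x_{i-j} that came before it. So when you call the network with an input sequence of length k, it can effectively be "unrolled" into k copies of the network, each one feeding into the next in order.

Once unrolled, the recurrent network looks a lot like a traditional neural network with hidden layers. We can then use the ordinary backpropagation algorithm to assign the error and update the weights.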
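To make the unrolling concrete, here is a minimal sketch in plain Python. It assumes a hypothetical scalar cell f(x, a) = tanh(w_x * a + w_h * x) and a squared-error loss on the final output; neither comes from the question's pseudocode, and the names `forward_unrolled` and `bptt_gradients` are made up for illustration. Instead of keeping k separate weight sets and averaging them, this version keeps one shared pair of weights and sums the gradient contribution from every copy, which has the same effect up to a factor of k in the step size.

```python
import math

def forward_unrolled(w_x, w_h, x0, inputs):
    """Apply the same cell once per input; return every context, x0 first."""
    contexts = [x0]
    for a_t in inputs:
        # Each loop iteration is one "copy" of f in the unrolled network.
        contexts.append(math.tanh(w_x * a_t + w_h * contexts[-1]))
    return contexts

def bptt_gradients(w_x, w_h, x0, inputs, target):
    """Gradients of 0.5 * (target - p)**2 with respect to the shared weights."""
    contexts = forward_unrolled(w_x, w_h, x0, inputs)
    p = contexts[-1]                 # prediction from the last copy
    d_ctx = p - target               # dLoss/dp
    grad_w_x = 0.0
    grad_w_h = 0.0
    # Walk backwards through the k copies, from the last one to the first.
    for t in reversed(range(len(inputs))):
        d_pre = d_ctx * (1.0 - contexts[t + 1] ** 2)  # through tanh
        grad_w_x += d_pre * inputs[t]     # this copy's share for w_x
        grad_w_h += d_pre * contexts[t]   # this copy's share for w_h
        d_ctx = d_pre * w_h               # error passed to the previous copy
    return grad_w_x, grad_w_h

if __name__ == "__main__":
    g_x, g_h = bptt_gradients(0.5, -0.3, 0.0, [1.0, -2.0, 0.5], target=0.8)
    print("dL/dw_x =", g_x, "dL/dw_h =", g_h)
```

The backward loop is just ordinary backpropagation applied to the unrolled chain; the only recurrent-specific part is that every copy adds to the same two gradient accumulators.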

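What is x = f(x)?

The "context", or x, is like the "memory" of the neural network. You can think of it as an output that changes over time, depending on which iteration of the recurrent network you are in. It is initialized to all zeros at the start because there is no memory yet. We compute it with x = f(x) because the output of the previous layer forms part of the input to the next layer (the other part is a[t_i]).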
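The role of the context can be shown with a small sketch. It assumes a hypothetical scalar cell with fixed, made-up weights, and it writes the update as x = f(x, a_t) so that the input the pseudocode leaves implicit is visible; `f` and `run` are illustrative names only.

```python
import math

def f(x, a_t, w_x=0.5, w_h=0.9):
    """One step of a hypothetical scalar recurrent cell (weights are made up)."""
    return math.tanh(w_x * a_t + w_h * x)

def run(inputs):
    """Feed a sequence through the cell and return the final context."""
    x = 0.0  # zero context: the network has no memory yet
    for a_t in inputs:
        x = f(x, a_t)  # the new context depends on the old one and on a_t
    return x

if __name__ == "__main__":
    # Both sequences end with the same input, 0.0, but the histories differ.
    print(run([0.0, 0.0]))
    print(run([1.0, 0.0]))
```

Both sequences end with the same input, yet the final contexts differ, because the second one still carries a trace of the earlier 1.0. That carried-over value is what the answer calls memory.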

