Deep RL Bootcamp Lecture 7: SVG, DDPG, and Stochastic Computation Graphs
Posted ecoflex
^ is the square root of epsilon
a simplified version of the hard version
a smoother way to find the correct solution
the first term is the REINFORCE term, and the second term is the grad log probability of our loss
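To make the two-term structure concrete, here is a minimal sketch on a toy problem that is not taken from the lecture. It assumes a single Gaussian stochastic node x ~ N(theta, 1) and a hypothetical loss f(x, theta) = (x - 1)^2 + theta^2 that also depends on theta directly. Under those assumptions, the gradient of the expected loss splits into a REINFORCE (score function) term and a term for the direct dependence of the loss on theta. The function name `estimate_gradient` and the choice of loss are illustrative only.

```python
import random


def estimate_gradient(theta, n_samples=200_000, seed=0):
    """Monte Carlo estimate of d/dtheta E_{x ~ N(theta, 1)}[f(x, theta)].

    Toy loss (an assumption for illustration): f(x, theta) = (x - 1)^2 + theta^2.
    Returns the two terms of the estimator separately.
    """
    rng = random.Random(seed)
    score_term = 0.0   # REINFORCE term: f(x, theta) * d/dtheta log p(x; theta)
    direct_term = 0.0  # direct term: partial derivative of f with respect to theta
    for _ in range(n_samples):
        x = rng.gauss(theta, 1.0)             # sample the stochastic node
        f = (x - 1.0) ** 2 + theta ** 2       # loss evaluated at the sample
        grad_log_p = x - theta                # score of a unit-variance Gaussian
        df_dtheta = 2.0 * theta               # holds x fixed, differentiates f in theta
        score_term += f * grad_log_p
        direct_term += df_dtheta
    return score_term / n_samples, direct_term / n_samples


if __name__ == "__main__":
    score, direct = estimate_gradient(2.0)
    # Analytically, E[f] = (theta - 1)^2 + 1 + theta^2, so the exact gradient
    # at theta = 2 is 2 * (theta - 1) + 2 * theta = 6.
    print("REINFORCE term:", score)
    print("direct term:", direct)
    print("total gradient estimate:", score + direct)
```

For this toy loss the exact gradient is 2(theta - 1) + 2 theta, so the sum of the two estimated terms should land close to that value, with the REINFORCE term carrying all of the sampling noise.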
b is a stochastic node
Further formula derivations are omitted.