Computing gradients on a graph in TensorFlow gives a NoneType fetching error

Posted 2023-03-12

【Originally posted】: 2018-02-06 09:40:45 【Question】:

I am trying to compute gradients on the following graph (this is a class method):

def __define_likelihood_computation(self):

    self.__lik_graph = tf.Graph()
    lik_graph = self.__lik_graph

    r = self.__C(self.__th).shape[1]
    m = self.__H(self.__th).shape[0]
    n = self.__F(self.__th).shape[1]
    p = self.__G(self.__th).shape[1]

    x0_mean = self.__x0_mean
    x0_cov = self.__x0_cov

    with lik_graph.as_default():
        # FIXME: Don't Repeat Yourself (in simulation and here)
        th = tf.placeholder(tf.float64, shape=[None], name='th')
        u = tf.placeholder(tf.float64, shape=[r, None], name='u')
        t = tf.placeholder(tf.float64, shape=[None], name='t')
        y = tf.placeholder(tf.float64, shape=[m, None], name='y')

        N = tf.stack([tf.shape(t)[0]])
        N = tf.reshape(N, ())

        F = tf.py_func(self.__F, [th], tf.float64, name='F')
        F.set_shape([n, n])

        C = tf.py_func(self.__C, [th], tf.float64, name='C')
        C.set_shape([n, r])

        G = tf.py_func(self.__G, [th], tf.float64, name='G')
        G.set_shape([n, p])

        H = tf.py_func(self.__H, [th], tf.float64, name='H')
        H.set_shape([m, n])

        x0_mean = tf.py_func(x0_mean, [th], tf.float64, name='x0_mean')
        x0_mean.set_shape([n, 1])

        P_0 = tf.py_func(x0_cov, [th], tf.float64, name='x0_cov')
        P_0.set_shape([n, n])

        Q = tf.py_func(self.__w_cov, [th], tf.float64, name='w_cov')
        Q.set_shape([p, p])

        R = tf.py_func(self.__v_cov, [th], tf.float64, name='v_cov')
        R.set_shape([m, m])

        I = tf.eye(n, n, dtype=tf.float64)

        def lik_loop_cond(k, P, S, t, u, x, y):
            return tf.less(k, N-1)

        def lik_loop_body(k, P, S, t, u, x, y):

            # TODO: this should be function of time
            u_t_k = tf.slice(u, [0, k], [r, 1])

            # k+1, cause zeroth measurement should not be taken into account
            y_k = tf.slice(y, [0, k+1], [m, 1])

            t_k = tf.slice(t, [k], [2], 't_k')

            # TODO: extract Kalman filter to a separate class
            def state_predict(x, t):
                Fx = tf.matmul(F, x, name='Fx')
                Cu = tf.matmul(C, u_t_k, name='Cu')
                dx = Fx + Cu
                return dx

            def covariance_predict(P, t):
                GQtG = tf.matmul(G @ Q, G, transpose_b=True)
                PtF = tf.matmul(P, F, transpose_b=True)
                dP = tf.matmul(F, P) + PtF + GQtG
                return dP

            x = tf.contrib.integrate.odeint(state_predict, x, t_k,
                                            name='state_predict')
            x = x[-1]

            P = tf.contrib.integrate.odeint(covariance_predict, P, t_k,
                                            name='covariance_predict')
            P = P[-1]

            E = y_k - tf.matmul(H, x)

            B = tf.matmul(H @ P, H, transpose_b=True) + R
            invB = tf.matrix_inverse(B)

            K = tf.matmul(P, H, transpose_b=True) @ invB

            S_k = tf.matmul(E, invB @ E, transpose_a=True)
            S_k = 0.5 * (S_k + tf.log(tf.matrix_determinant(B)))

            S = S + S_k

            # state update
            x = x + tf.matmul(K, E)

            # covariance update
            P = (I - K @ H) @ P

            k = k + 1

            return k, P, S, t, u, x, y

        k = tf.constant(0, name='k')
        P = P_0
        S = tf.constant(0.0, dtype=tf.float64, shape=[1, 1], name='S')
        x = x0_mean

        # TODO: make a named tuple of named list
        lik_loop = tf.while_loop(lik_loop_cond, lik_loop_body,
                                 [k, P, S, t, u, x, y], name='lik_loop')

        dS = tf.gradients(lik_loop[2], th)

        self.__lik_loop_op = lik_loop
        self.__dS = dS

The gradient evaluation itself looks like this:

def dL(self, t, u, y, th=None):
    if th is None:
        th = self.__th

    self.__validate(th)
    g = self.__lik_graph

    if t.shape[0] != u.shape[1]:
        raise Exception('''t.shape[0] != u.shape[1]''')

    # run lik graph
    with tf.Session(graph=g) as sess:
        t_ph = g.get_tensor_by_name('t:0')
        th_ph = g.get_tensor_by_name('th:0')
        u_ph = g.get_tensor_by_name('u:0')
        y_ph = g.get_tensor_by_name('y:0')
        rez = sess.run(self.__dS, {th_ph: th, t_ph: t, u_ph: u, y_ph: y})

    return rez

The likelihood computation itself does work. Here it is:

def lik(self, t, u, y, th=None):
    if th is None:
        th = self.__th

    self.__validate(th)
    g = self.__lik_graph

    if t.shape[0] != u.shape[1]:
        raise Exception('''t.shape[0] != u.shape[1]''')

    # run lik graph
    with tf.Session(graph=g) as sess:
        t_ph = g.get_tensor_by_name('t:0')
        th_ph = g.get_tensor_by_name('th:0')
        u_ph = g.get_tensor_by_name('u:0')
        y_ph = g.get_tensor_by_name('y:0')
        rez = sess.run(self.__lik_loop_op, {th_ph: th, t_ph: t, u_ph: u,
                                            y_ph: y})

    N = len(t)
    m = y.shape[0]
    S = rez[2]
    S = S + N*m * 0.5 + np.log(2*math.pi)

    return S

When I try to compute the gradient (by calling dL), I get the following traceback:

Traceback (most recent call last):
  File "", line 1, in <module>
  File "/home/konstunn/study/research/prod-practice1/report/tf/model.py", line 446, in <module>
    dL = m.dL(t, u, y)
  File "/home/konstunn/study/research/prod-practice1/report/tf/model.py", line 415, in dL
    rez = sess.run(self.__dS, {th_ph: th, t_ph: t, u_ph: u, y_ph: y})
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1109, in _run
    self._graph, fetches, feed_dict_tensor, feed_handles=feed_handles)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 413, in __init__
    self._fetch_mapper = _FetchMapper.for_fetch(fetches)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 233, in for_fetch
    return _ListFetchMapper(fetch)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 340, in __init__
    self._mappers = [_FetchMapper.for_fetch(fetch) for fetch in fetches]
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 340, in <listcomp>
    self._mappers = [_FetchMapper.for_fetch(fetch) for fetch in fetches]
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 230, in for_fetch
    (fetch, type(fetch)))
TypeError: Fetch argument None has invalid type <class 'NoneType'>

What could be the reason?

Sorry for the long post.

【Comments on the question】:

【Answer 1】:

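I figured it out. The cause is that my graph contains tf.py_func() ops, and they are the first operations applied to th, the tensor with respect to which I am trying to compute the gradient. tf.gradients() cannot differentiate through them, so it returns None for th. This seems to be one more limitation of tf.py_func() that is not documented, probably because it is considered obvious.

Maybe I should file a bug report (or a feature request) and work around the problem for now.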
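To make the failure mode concrete: tf.gradients(ys, xs) returns a list with one entry per tensor in xs, and an entry is None when no differentiable path connects ys to that tensor. Session.run() then rejects the None as a fetch, which is the TypeError in the traceback. A small guard like the one below, called right after tf.gradients(), would surface the problem when the graph is built instead of when it is run. This is only a sketch: the helper names are made up for illustration, and it deliberately has no TensorFlow dependency.

```python
def missing_gradients(grads, wrt_names):
    """Return the labels of inputs whose gradient entry is None.

    grads     -- the list returned by a call such as tf.gradients(ys, xs)
    wrt_names -- human-readable labels for the tensors in xs (hypothetical)
    """
    if len(grads) != len(wrt_names):
        raise ValueError("grads and wrt_names must have the same length")
    return [name for g, name in zip(grads, wrt_names) if g is None]


def require_gradients(grads, wrt_names):
    """Raise early, with a readable message, if any gradient is undefined."""
    missing = missing_gradients(grads, wrt_names)
    if missing:
        raise ValueError(
            "no gradient path to: " + ", ".join(missing)
            + " (is a non-differentiable op such as tf.py_func in the way?)"
        )
    return grads
```

In the question's code this could be used as `dS = require_gradients(tf.gradients(lik_loop[2], th), ['th'])`.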
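One possible temporary workaround (my assumption, not something the answer spells out) is to skip tf.gradients() for now and approximate the gradient numerically, treating the working likelihood evaluation as a black box. The sketch below uses central differences in plain Python. The function name, the default step size, and the toy objective are all illustrative.

```python
def finite_difference_gradient(f, th, eps=1e-6):
    """Approximate df/dth by central differences.

    f   -- callable mapping a list of floats to a float (a black box)
    th  -- point at which to differentiate, a sequence of floats
    eps -- step size (an assumed default; tune it for your problem)
    """
    th = [float(v) for v in th]
    grad = []
    for i in range(len(th)):
        plus = list(th)
        minus = list(th)
        plus[i] += eps
        minus[i] -= eps
        grad.append((f(plus) - f(minus)) / (2.0 * eps))
    return grad


# Toy stand-in for the likelihood: f(th) = th0**2 + 3*th0*th1
def toy_objective(th):
    return th[0] ** 2 + 3.0 * th[0] * th[1]


g = finite_difference_gradient(toy_objective, [1.0, 2.0])
# The analytic gradient at (1, 2) is (2*1 + 3*2, 3*1) = (8, 3).
```

With the class from the question, f would be a thin wrapper around lik(t, u, y, th) that returns a plain float. The cost is two likelihood evaluations per component of th, so this only suits small parameter vectors.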

【Comments】:

Yes, py_func does not get a gradient automatically.
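Since the gradient is not derived automatically, it has to be supplied by hand. In TF 1.x the usual route, as far as I know, is to register a gradient function with tf.RegisterGradient and map the PyFunc op type to it via Graph.gradient_override_map(). The sketch below shows only what such a gradient function has to compute, in plain Python: given dL/dF for a matrix-valued F(th), apply the chain rule to obtain dL/dth. The matrix F here is a hypothetical 2x2 example, not the model from the question.

```python
def F(th):
    """Hypothetical parameter-dependent 2x2 matrix (not the question's model)."""
    return [[-th[0], 0.0],
            [th[1], -th[0] * th[1]]]


def F_jacobian(th):
    """dF/dth written by hand: one 2x2 matrix of partials per parameter."""
    dF_dth0 = [[-1.0, 0.0],
               [0.0, -th[1]]]
    dF_dth1 = [[0.0, 0.0],
               [1.0, -th[0]]]
    return [dF_dth0, dF_dth1]


def F_backward(th, grad_F):
    """Chain rule: turn dL/dF (same shape as F) into dL/dth."""
    return [
        sum(grad_F[i][j] * dF[i][j] for i in range(2) for j in range(2))
        for dF in F_jacobian(th)
    ]


def loss(th):
    """Toy scalar loss: sum of squared entries of F(th)."""
    return sum(v * v for row in F(th) for v in row)


th = [0.5, 2.0]
# For this toy loss, dL/dF is simply 2*F.
grad_F = [[2.0 * v for v in row] for row in F(th)]
grad_th = F_backward(th, grad_F)
```

In the question's setting, each of the py_func-wrapped callables (F, C, G, H and the covariance functions) would need its own backward function of this kind.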
