TensorFlow function for undistorting images

Posted 2023-02-22

[Title]: Tensorflow function for undistorting images [Posted]: 2018-01-15 13:06:29 [Question]:

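Does anyone know of any TensorFlow code for undistorting images (removing the fisheye effect)?

At the moment I am using OpenCV to undistort the images. However, I would like to move that code into the network. Is there any open-source code or a TensorFlow function that does this? I could not find anything through Google.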

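[Comments on the question]:

I don't think TensorFlow has anything like that. You may be able to implement the algorithm yourself, though. This question on the fisheye camera model in OpenCV may shed light on their specific implementation; it also mentions other models.

Yes. I mean, I had a look at the source code myself. I was just hoping I would not have to go down that road.

[Answer 1]: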

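Here you go. I hacked the spatial_transformer code. I think that if your camera matrix is not the identity camera matrix, you have to multiply the points by K^-1 (the inverse camera matrix) before running this code, and then multiply them by K again afterwards.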
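
To illustrate the K^-1 / K step, here is a minimal sketch in plain Python. It assumes a pinhole camera matrix with focal lengths fx, fy, principal point (cx, cy) and zero skew. The function names `normalize_point` and `denormalize_point` and the example intrinsics are hypothetical and are not part of the answer's code.

```python
def normalize_point(u, v, fx, fy, cx, cy):
    """Apply K^-1 to the pixel (u, v): pixel coords -> normalized camera coords."""
    return (u - cx) / fx, (v - cy) / fy


def denormalize_point(x, y, fx, fy, cx, cy):
    """Apply K to the normalized point (x, y): normalized coords -> pixel coords."""
    return fx * x + cx, fy * y + cy


# Hypothetical intrinsics for a 640x480 image (assumption, for illustration only).
fx, fy, cx, cy = 320.0, 320.0, 320.0, 240.0

x, y = normalize_point(480.0, 240.0, fx, fy, cx, cy)
print(x, y)  # 0.5 0.0
u, v = denormalize_point(x, y, fx, fy, cx, cy)
print(u, v)  # 480.0 240.0
```

The distortion model would be applied to (x, y) between these two calls.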

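    import tensorflow as tf  # uses the TensorFlow 1.x API (e.g. tf.variable_scope)

    def distort(images, d, name='distort'):
        # images: [batch, height, width, channels]; d: 4 distortion coefficients
        def _repeat(x, n_repeats):
            with tf.variable_scope('_repeat'):
                rep = tf.transpose(
                    tf.expand_dims(tf.ones(shape=tf.stack([n_repeats, ])), 1), [1, 0])
                rep = tf.cast(rep, 'int32')
                x = tf.matmul(tf.reshape(x, (-1, 1)), rep)
                return tf.reshape(x, [-1])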

        def _interpolate(im, x, y, out_size):
            with tf.variable_scope('_interpolate'):
                # constants
                num_batch = tf.shape(im)[0]
                height = tf.shape(im)[1]
                width = tf.shape(im)[2]
                channels = tf.shape(im)[3]

                x = tf.cast(x, 'float32')
                y = tf.cast(y, 'float32')
                height_f = tf.cast(height, 'float32')
                width_f = tf.cast(width, 'float32')
                out_height = out_size[0]
                out_width = out_size[1]
                zero = tf.zeros([], dtype='int32')
                max_y = tf.cast(tf.shape(im)[1] - 1, 'int32')
                max_x = tf.cast(tf.shape(im)[2] - 1, 'int32')

                # scale indices from [-1, 1] to [0, width/height]
                x = (x + 1.0)*(width_f) / 2.0
                y = (y + 1.0)*(height_f) / 2.0

                # do sampling
                x0 = tf.cast(tf.floor(x), 'int32')
                x1 = x0 + 1
                y0 = tf.cast(tf.floor(y), 'int32')
                y1 = y0 + 1

                x0 = tf.clip_by_value(x0, zero, max_x)
                x1 = tf.clip_by_value(x1, zero, max_x)
                y0 = tf.clip_by_value(y0, zero, max_y)
                y1 = tf.clip_by_value(y1, zero, max_y)
                dim2 = width
                dim1 = width*height
                base = _repeat(tf.range(num_batch)*dim1, out_height*out_width)
                base_y0 = base + y0*dim2
                base_y1 = base + y1*dim2
                idx_a = base_y0 + x0
                idx_b = base_y1 + x0
                idx_c = base_y0 + x1
                idx_d = base_y1 + x1

                # use indices to lookup pixels in the flat image and restore
                # channels dim
                im_flat = tf.reshape(im, tf.stack([-1, channels]))
                im_flat = tf.cast(im_flat, 'float32')
                Ia = tf.gather(im_flat, idx_a)
                Ib = tf.gather(im_flat, idx_b)
                Ic = tf.gather(im_flat, idx_c)
                Id = tf.gather(im_flat, idx_d)

                # and finally calculate interpolated values
                x0_f = tf.cast(x0, 'float32')
                x1_f = tf.cast(x1, 'float32')
                y0_f = tf.cast(y0, 'float32')
                y1_f = tf.cast(y1, 'float32')
                wa = tf.expand_dims(((x1_f-x) * (y1_f-y)), 1)
                wb = tf.expand_dims(((x1_f-x) * (y-y0_f)), 1)
                wc = tf.expand_dims(((x-x0_f) * (y1_f-y)), 1)
                wd = tf.expand_dims(((x-x0_f) * (y-y0_f)), 1)
                output = tf.add_n([wa*Ia, wb*Ib, wc*Ic, wd*Id])
                return output

        def _transform(images, d, out_size):
            with tf.variable_scope('_transform'):
                num_batch = tf.shape(images)[0]
                num_channels = images.get_shape()[3]

                out_height = out_size[0]
                out_width = out_size[1]

                # output sampling grid in normalized coordinates [-1, 1]
                x = tf.linspace(-1., 1., out_width)
                y = tf.linspace(-1., 1., out_height)
                x, y = tf.meshgrid(x, y)
                x = tf.tile(tf.reshape(x, [1, -1, 1]), [num_batch, 1, 1])
                y = tf.tile(tf.reshape(y, [1, -1, 1]), [num_batch, 1, 1])

                a = x
                b = y

                r2 = tf.square(a) + tf.square(b)
                r = tf.sqrt(r2)
                theta = tf.atan(r)
                # theta_d = theta * (1 + d0*theta^2 + d1*theta^4 + d2*theta^6 + d3*theta^8)
                theta_d = theta * (1.0 + tf.reduce_sum(
                    tf.reshape(d, [1, 1, 4]) * tf.concat(
                        [tf.square(theta), tf.pow(theta, 4),
                         tf.pow(theta, 6), tf.pow(theta, 8)], axis=-1),
                    axis=-1, keepdims=True))
                # theta_d / r is 0/0 at the grid center; its limit there is 1
                eps = 1e-8
                tdr = tf.where(r > eps, theta_d / tf.maximum(r, eps),
                               tf.ones_like(r))
                xd = a * tdr
                yd = b * tdr

                xd = tf.reshape(xd, [-1])
                yd = tf.reshape(yd, [-1])

                input_transformed = _interpolate(images, xd, yd, out_size)
                output = tf.reshape(
                    input_transformed,
                    tf.stack([num_batch, out_height, out_width, num_channels]))
                return output

        with tf.variable_scope(name):
            output = _transform(images, d, tf.shape(images)[1:3])
            return output
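
As a cross-check for the radial mapping in `_transform`, the same per-point formula can be written in plain Python. This is only a sketch: it assumes `d` holds four coefficients, as the reshape to `[1, 1, 4]` implies, and the name `distort_point` is hypothetical. It handles r = 0 explicitly, since theta_d / r is 0/0 at that point.

```python
import math


def distort_point(a, b, d):
    """Map a normalized point (a, b) through the radial model used in _transform.

    theta_d = theta * (1 + k1*theta^2 + k2*theta^4 + k3*theta^6 + k4*theta^8),
    with theta = atan(r) and r = sqrt(a^2 + b^2).
    """
    k1, k2, k3, k4 = d
    r = math.hypot(a, b)
    if r < 1e-8:
        # theta_d / r tends to 1 as r -> 0, so the center maps to itself.
        return a, b
    theta = math.atan(r)
    t2 = theta * theta
    theta_d = theta * (1.0 + k1 * t2 + k2 * t2**2 + k3 * t2**3 + k4 * t2**4)
    scale = theta_d / r
    return a * scale, b * scale


print(distort_point(0.0, 0.0, [0.1, 0.0, 0.0, 0.0]))  # (0.0, 0.0)
xd, yd = distort_point(1.0, 0.0, [0.0, 0.0, 0.0, 0.0])
print(abs(xd - math.pi / 4) < 1e-12, yd)  # True 0.0
```

Note that even with all four coefficients set to zero the mapping is not the identity, because theta = atan(r) is applied regardless.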

[Comments]:
