What does shared_x = theano.shared(numpy.asarray(data_x, dtype=theano.config.floatX)) mean?

Posted 2023-05-15


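What does shared_x = theano.shared(numpy.asarray(data_x, dtype=theano.config.floatX)) mean?
floatX is a data type, right? Why floatX rather than float?
What kind of function is asarray, and what does it do? array is a vector, so what does "as" stand for?
Could an expert explain in detail?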

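This line is almost identical to one I wrote myself, down to the variable names. Let me explain:
asarray is a function from Python's numpy library. Here data_x is presumably the input data for a machine learning model, held as a numpy array. theano.config.floatX is a configuration setting that names the float type Theano should use ('float32' or 'float64'), which is why the code says floatX rather than a fixed float type. To convert data_x to that type, the code calls asarray with the dtype argument set to it. In other words, almost everything inside the parentheses of theano.shared is plain numpy. shared then turns the result into a Theano shared variable, which behaves like a global variable that Theano functions can read and update.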
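The dtype conversion described above can be tried with numpy alone. The following is a minimal sketch, not the tutorial's code: FLOAT_X is an assumed stand-in for theano.config.floatX so that the example runs without Theano, and the values in data_x are made up.

```python
import numpy

# Assumption: FLOAT_X stands in for theano.config.floatX, which is a string
# naming a float type ('float32' or 'float64') chosen in Theano's configuration.
FLOAT_X = 'float32'

# Hypothetical input data, e.g. integer pixel values in a nested list.
data_x = [[0, 128, 255], [1, 2, 3]]

# asarray converts its input to an ndarray with the requested dtype.
as_float = numpy.asarray(data_x, dtype=FLOAT_X)

# If the input is already an ndarray of that dtype, asarray does not copy it;
# this is the difference between numpy.asarray and numpy.array.
same = numpy.asarray(as_float, dtype=FLOAT_X)

print(as_float.dtype, as_float.shape)
print(same is as_float)
```

With Theano installed, passing as_float to theano.shared would be the next step; that call is left out here so the sketch stays runnable with numpy only.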
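Note: an array is simply an array or matrix. asarray is a function; here it takes the original matrix and returns one whose dtype matches the dtype Theano expects.

Reference answer A:
This tutorial is not a course on machine learning, but we start with a brief review of the concepts involved so that we are all on the same page. You will also need to download a few datasets in order to run the programs in this tutorial.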
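Downloading and installing Theano
Each algorithm page asks you to download the files it needs. If you would rather download all the files at once, you can clone the repository: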
git clone git://github.com/lisa-lab/DeepLearningTutorials.git
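Datasets
MNIST dataset (mnist.pkl.gz)
The MNIST dataset consists of images of handwritten digits, split into 60,000 training examples and 10,000 test examples. In many papers, and in this tutorial, the official training set is further divided into 50,000 training examples and 10,000 validation examples, which are used for selecting model parameters. All images are size-normalized to 28 x 28 pixels. In the original data, each pixel is stored as an ordinary grayscale value between 0 and 255.
To make the dataset easy to use from Python, we pickled it. The pickled file contains three lists: the training set, the validation set and the test set. Each of them pairs the images with their corresponding labels. An image is a numpy array of 784 (28 x 28) values, and a label is a number between 0 and 9. The code below shows how to load the dataset.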
# Python 2 code: cPickle is the Python 2 module (Python 3 renamed it to pickle)
import cPickle, gzip, numpy

# Load the dataset
f = gzip.open('mnist.pkl.gz', 'rb')
train_set, valid_set, test_set = cPickle.load(f)
f.close()

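When using this dataset, we usually divide it into minibatches. We also encourage you to store the dataset in shared variables and access it by minibatch index. The reason is performance when running on the GPU: copying data into GPU memory carries a large overhead. If the data were copied on request, one minibatch at a time, rather than held in shared variables, the GPU code would be no faster than the CPU code. With Theano shared variables, Theano can copy the entire dataset to the GPU in a single call. (Translator's note: some of the explanation was left untranslated; I do not fully understand how the GPU works.)
So far the data is stored in one variable, and a minibatch is a slice of that variable. The most natural way to define a minibatch is by the position and size of its slice. In our setup the batch size is fixed, so a function only needs the slice position to access a minibatch. The code below shows how to store the data and how to access a minibatch.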
import theano
import theano.tensor as T

def shared_dataset(data_xy):
    """ Function that loads the dataset into shared variables

    The reason we store our dataset in shared variables is to allow
    Theano to copy it into the GPU memory (when code is run on GPU).
    Since copying data into the GPU is slow, copying a minibatch every time
    it is needed (the default behaviour if the data is not in a shared
    variable) would lead to a large decrease in performance.
    """
    data_x, data_y = data_xy
    shared_x = theano.shared(numpy.asarray(data_x, dtype=theano.config.floatX))
    shared_y = theano.shared(numpy.asarray(data_y, dtype=theano.config.floatX))
    # When storing data on the GPU it has to be stored as floats
    # therefore we will store the labels as ``floatX`` as well
    # (``shared_y`` does exactly that). But during our computations
    # we need them as ints (we use labels as index, and if they are
    # floats it doesn't make sense) therefore instead of returning
    # ``shared_y`` we will have to cast it to int. This little hack
    # lets us get around this issue
    return shared_x, T.cast(shared_y, 'int32')

test_set_x, test_set_y = shared_dataset(test_set)
valid_set_x, valid_set_y = shared_dataset(valid_set)
train_set_x, train_set_y = shared_dataset(train_set)

batch_size = 500    # size of the minibatch

# accessing the third minibatch of the training set
data = train_set_x[2 * batch_size: 3 * batch_size]
label = train_set_y[2 * batch_size: 3 * batch_size]
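The slicing at the end of that code generalizes to any minibatch index. Here is a minimal sketch of the arithmetic, assuming a plain Python list as a stand-in for the shared variable; the helper name minibatch and the sample data are hypothetical and not part of the tutorial.

```python
def minibatch(data, index, batch_size):
    """Return the minibatch at position `index` (0-based) as a slice of `data`.

    `minibatch` is a hypothetical helper; `data` can be any sliceable sequence.
    """
    start = index * batch_size
    return data[start:start + batch_size]

# Assumption: 2000 made-up samples stand in for train_set_x.
samples = list(range(2000))
batch_size = 500

# Index 2 is the third minibatch, matching train_set_x[2 * 500: 3 * 500].
third = minibatch(samples, 2, batch_size)

# Number of complete minibatches available in the data.
n_batches = len(samples) // batch_size
```

Because the batch size is fixed, the index alone identifies a minibatch, which is why the tutorial's functions can take just the index as input.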

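Notation
Dataset notation
We denote a dataset by `\mathbf{D}`. When the distinction matters, the training, validation and test sets are written `\mathbf{D}_{train}`, `\mathbf{D}_{valid}` and `\mathbf{D}_{test}` respectively.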

Will this do? It took me a long time to type, and I hope my answer helps!

Clarification in the Theano tutorial

[Title]: Clarification in the Theano tutorial [Posted]: 2014-10-11 13:50:09 [Question]:

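I am reading this tutorial, which is linked from the home page of the Theano documentation.

I am not sure about the code given in the gradient descent section.

I have a question about the for loop.

You initialize the 'param_update' variable to zero: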

param_update = theano.shared(param.get_value()*0., broadcastable=param.broadcastable)

and you don't update its value in the remaining two lines:

updates.append((param, param - learning_rate*param_update))
updates.append((param_update, momentum*param_update + (1. - momentum)*T.grad(cost, param)))

So why do we need it?

I think I am misunderstanding something here. Could you help me?

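[Comments]:

What does 'and you dun update its value in the remaining two lines' mean? Could you please add the code instead of a screenshot?

The gradient descent section is here: nbviewer.ipython.org/github/craffel/theano-tutorial/blob/master/…

I mean that param_update is initialized in the first line of code I gave, and it is not updated in the remaining two lines given above. I will try to add code next time!

[Answer 1]: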

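Initializing param_update with theano.shared(.) only tells Theano to reserve a variable that Theano functions will use. This initialization code is called only once; it is not run again later to reset param_update to 0.

The actual value of param_update is updated according to the last line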

updates.append((param_update, momentum*param_update + (1. - momentum)*T.grad(cost, param)))

when the train function is constructed with these updates as an argument (cell [23] in the tutorial):

train = theano.function([mlp_input, mlp_target], cost,
                        updates=gradient_updates_momentum(cost, mlp.params, learning_rate, momentum))

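Each time train is called, Theano computes the gradient of cost with respect to param and updates param_update to a new update direction according to the momentum rule. param is then updated by following the direction stored in param_update, scaled by learning_rate.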
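To see why the zero initialization is not lost, the two update rules can be run by hand in plain Python. This is only a sketch under stated assumptions: the toy cost param ** 2, its gradient function cost_gradient, and the starting values are made up for illustration, and ordinary floats stand in for Theano shared variables. It relies on the fact that Theano evaluates all update expressions of one call using the values the shared variables held before that call.

```python
def cost_gradient(param):
    # Gradient of the toy cost param ** 2 (an assumption for illustration).
    return 2.0 * param

learning_rate = 0.1
momentum = 0.9

param = 1.0
# Initialised once, like theano.shared(param.get_value() * 0.); never reset.
param_update = 0.0

history = []
for step in range(3):  # each iteration plays the role of one call to train
    grad = cost_gradient(param)
    # Both right-hand sides use the values from before this step.
    new_param = param - learning_rate * param_update
    new_param_update = momentum * param_update + (1.0 - momentum) * grad
    param, param_update = new_param, new_param_update
    history.append((param, param_update))

for step, (p, u) in enumerate(history, start=1):
    print(step, p, u)
```

On the first step param does not move, because param_update is still zero at that point; from the second step on, param follows the accumulated direction, which keeps growing out of the earlier gradients rather than starting from zero again.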

