UFLDL Tutorial Answers: Exercise: Convolution and Pooling
Posted by slim1017
Tutorial: http://deeplearning.stanford.edu/wiki/index.php/UFLDL%E6%95%99%E7%A8%8B
Exercise: http://deeplearning.stanford.edu/wiki/index.php/Exercise:Convolution_and_Pooling
1. Key points
Convolutional neural networks rest on four core ideas:
local connectivity, weight sharing, pooling, and multiple layers.
The structure of a convolutional network is described in the tutorial, or see
http://blog.csdn.net/zouxy09/article/details/8781543
(1) Local connectivity and weight sharing:
With the 28×28 MNIST images used earlier, fully connected layers are not too slow, but a natural image of roughly 100×100 pixels already has on the order of 10^4 input units; learning 100 features then requires about 10^6 parameters, and both forward and backward propagation become very slow.
The remedy: each hidden unit connects to only a part of the input, for example a small contiguous patch of the input image. This is local connectivity.
Local connectivity and weight sharing are not merely a way to cut computation; the design is inspired by the brain's visual system, for two reasons:
1. In natural images, local information is highly correlated, forming local motifs;
2. The local statistics of images are invariant to position: if a motif appears somewhere in an image, it can equally appear anywhere else. Weight sharing therefore lets the same pattern be detected at every location in the image (each feature map detects one pattern).
In the figure, green is the 5×5 input image, yellow is a 3×3 convolution kernel, and red is the resulting 3×3 feature map, i.e. the convolution of the image with that kernel (the features extracted from the image).
Note the sizes: the feature map is 3×3 = (5 − 3 + 1) × (5 − 3 + 1), where 5 is the side length of the green image and 3 that of the yellow kernel.
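The size relation above can be checked with a small sketch (pure Python rather than MATLAB, and the 5×5 image and 3×3 kernel values are made up for illustration):

```python
def conv2_valid(im, k):
    """'valid' 2-D sliding-window product-sum: slide k over im without zero-padding."""
    n, m = len(im), len(k)            # im is n*n, k is m*m
    out = n - m + 1                   # output side length: n - m + 1
    return [[sum(im[r + i][c + j] * k[i][j]
                 for i in range(m) for j in range(m))
             for c in range(out)]
            for r in range(out)]

image  = [[(r * 5 + c) % 7 for c in range(5)] for r in range(5)]  # 5x5 "green" image
kernel = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]                        # 3x3 "yellow" kernel
fmap = conv2_valid(image, kernel)
print(len(fmap), len(fmap[0]))  # 3 3 -> (5-3+1) x (5-3+1)
```

The output side length is always (input side − kernel side + 1), matching the note above.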
(2) Pooling:
Even though the steps above remove many parameters, the computation is still heavy and over-fitting remains a risk, so pooling is needed.
The usual choices are max pooling and mean pooling.
Pooling is the key to translation invariance: even after the image undergoes a small shift or deformation, the same (pooled) features are produced.
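A tiny 1-D sketch (with illustrative values of my own) of why pooling buys some translation invariance: shifting a response within a pooling window leaves the pooled output unchanged:

```python
def max_pool_1d(x, w):
    """Non-overlapping max pooling with window size w over a 1-D signal."""
    return [max(x[i:i + w]) for i in range(0, len(x), w)]

a = [9, 0, 0, 0, 5, 0]    # strong responses at positions 0 and 4
b = [0, 9, 0, 0, 0, 5]    # the same responses shifted right by 1
print(max_pool_1d(a, 2))  # [9, 0, 5]
print(max_pool_1d(b, 2))  # [9, 0, 5] -> identical after pooling
```

Shifts larger than the window can still change the pooled output, so the invariance is only to small translations.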
(3) Network structure:
Input layer: 64×64 RGB images (3 channels each), 2000 training examples in total.
Convolutional layer: 8×8 kernels, using W (400×192) and b (400×1) trained in the previous exercise, where 192 = 8×8×3 covers all elements of an 8×8 3-channel patch and 400 is the number of distinct kernels, so 400 different feature maps are produced (each 57×57, with 57 = 64 − 8 + 1).
Pooling layer: the 400 feature maps of size 57×57 are pooled down to 400 maps of size 3×3.
Output layer: a softmax layer with 400×3×3 inputs and 4 output classes (airplane, car, cat, dog).
2. Working through the exercise
Step 2a: Implement convolution
(1) Taking the preprocessing steps into account, the feature activations that you should compute are σ(W(T(x − x̄)) + b), where T is the whitening matrix and x̄ is the mean patch. Expanding this, you obtain σ((WT)x + (b − WT x̄)), which suggests that you should convolve the images with WT rather than W as earlier, and that you should add (b − WT x̄), rather than just b, to convolvedFeatures before finally applying the sigmoid function.
(2) im = squeeze(images(:, :, channel, imageNum));
This line shows what squeeze does: it removes singleton dimensions. Here the channel and imageNum dimensions both have size 1 after indexing, so they are dropped.
(3) conv2(im, feature, 'valid');
When conv2's third argument is shape = 'valid', no zero-padding is applied at the borders, so the output matrix has size (imageDim - patchDim + 1) × (imageDim - patchDim + 1).
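Why the template flips the kernel (the flipud(fliplr(...)) call in cnnConvolve.m below): conv2 implements true convolution, which is cross-correlation with a 180°-rotated kernel, so rotating the feature first makes conv2 compute a plain correlation with it. A sketch in Python with hand-picked toy values:

```python
def flip2(k):
    """Rotate a 2-D kernel by 180 degrees (flipud followed by fliplr)."""
    return [row[::-1] for row in k[::-1]]

def corr2_valid(im, k):
    """'valid' cross-correlation of im with k (no kernel flip)."""
    n, m = len(im), len(k)
    out = n - m + 1
    return [[sum(im[r + i][c + j] * k[i][j]
                 for i in range(m) for j in range(m))
             for c in range(out)]
            for r in range(out)]

def conv2_valid(im, k):
    """True 'valid' convolution = correlation with the 180-degree-rotated kernel."""
    return corr2_valid(im, flip2(k))

im = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
k  = [[1, 0], [0, 2]]
# Flipping first and then convolving recovers the plain correlation:
print(conv2_valid(im, flip2(k)) == corr2_valid(im, k))  # True
```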
cnnConvolve.m
function convolvedFeatures = cnnConvolve(patchDim, numFeatures, images, W, b, ZCAWhite, meanPatch)
%cnnConvolve Returns the convolution of the features given by W and b with
%the given images
%
% Parameters:
% patchDim - patch (feature) dimension
% numFeatures - number of features
% images - large images to convolve with, matrix in the form
% images(r, c, channel, image number)
% W, b - W, b for features from the sparse autoencoder
% ZCAWhite, meanPatch - ZCAWhitening and meanPatch matrices used for
% preprocessing
%
% Returns:
% convolvedFeatures - matrix of convolved features in the form
% convolvedFeatures(featureNum, imageNum, imageRow, imageCol)
numImages = size(images, 4);
imageDim = size(images, 1);
imageChannels = size(images, 3);
convolvedFeatures = zeros(numFeatures, numImages, imageDim - patchDim + 1, imageDim - patchDim + 1);
% Instructions:
% Convolve every feature with every large image here to produce the
% numFeatures x numImages x (imageDim - patchDim + 1) x (imageDim - patchDim + 1)
% matrix convolvedFeatures, such that
% convolvedFeatures(featureNum, imageNum, imageRow, imageCol) is the
% value of the convolved featureNum feature for the imageNum image over
% the region (imageRow, imageCol) to (imageRow + patchDim - 1, imageCol + patchDim - 1)
%
% Expected running times:
% Convolving with 100 images should take less than 3 minutes
% Convolving with 5000 images should take around an hour
% (So to save time when testing, you should convolve with less images, as
% described earlier)
% -------------------- YOUR CODE HERE --------------------
% Precompute the matrices that will be used during the convolution. Recall
% that you need to take into account the whitening and mean subtraction
% steps
WT=W*ZCAWhite;    % (400*192)*(192*192); 400 is hiddenSize, i.e. 400 different feature maps
B=b-WT*meanPatch; % (400*1); both lines follow the hint at the end of Step 2a of the tutorial
% --------------------------------------------------------
convolvedFeatures = zeros(numFeatures, numImages, imageDim - patchDim + 1, imageDim - patchDim + 1);
for imageNum = 1:numImages
for featureNum = 1:numFeatures
% convolution of image with feature matrix for each channel
convolvedImage = zeros(imageDim - patchDim + 1, imageDim - patchDim + 1);
for channel = 1:3
% Obtain the feature (patchDim x patchDim) needed during the convolution
% ---- YOUR CODE HERE ----
%feature = zeros(8,8); % You should replace this
temp=patchDim*patchDim;
feature=reshape(WT(featureNum,1+(channel-1)*temp:channel*temp),patchDim,patchDim);
% ------------------------
% Flip the feature matrix because of the definition of convolution, as explained later
feature = flipud(fliplr(squeeze(feature)));
% Obtain the image
im = squeeze(images(:, :, channel, imageNum)); % squeeze removes singleton dimensions
% Convolve "feature" with "im", adding the result to convolvedImage
% after convolving all 3 channels, sum the results to obtain convolvedImage
% be sure to do a 'valid' convolution
% ---- YOUR CODE HERE ----
convolvedChannel=conv2(im,feature,'valid'); % shape='valid': no zero-padding at the borders
convolvedImage=convolvedImage+convolvedChannel;
% ------------------------
end
% Subtract the bias unit (correcting for the mean subtraction as well)
% Then, apply the sigmoid function to get the hidden activation
% ---- YOUR CODE HERE ----
convolvedImage=sigmoid(convolvedImage+B(featureNum)); % per the hint at the end of Step 2a
% ------------------------
% The convolved feature is the sum of the convolved values for all channels
convolvedFeatures(featureNum, imageNum, :, :) = convolvedImage;
end
end
end
function sigm = sigmoid(x)
sigm = 1 ./ (1 + exp(-x));
end
Step 2b: Check your convolution
Running the provided test code prints: Congratulations! Your convolution code passed the test. The convolution code is correct!
Step 2c: Pooling
The pooling code is straightforward; loops similar to those in the convolution step suffice.
% same loop order as in cnnConvolve.m
pooledDim=floor(convolvedDim/poolDim);
pooledFeatures = zeros(numFeatures, numImages, pooledDim, pooledDim); % preallocate the output
for imageNum = 1:numImages
for featureNum = 1:numFeatures
for poolRow = 1:pooledDim
for poolCol = 1:pooledDim
poolMatrix = convolvedFeatures(featureNum, imageNum, (poolRow-1)*poolDim+1:poolRow*poolDim, (poolCol-1)*poolDim+1:poolCol*poolDim);
pooledFeatures(featureNum, imageNum, poolRow, poolCol) = mean(poolMatrix(:));
end
end
end
end
Test output: Congratulations! Your pooling code passed the test.
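The pooling loops above can be mirrored in a short sketch (pure Python; the 4×4 map and window size 2 are toy values I chose, not the exercise's 57×57 with window 19):

```python
def mean_pool(conv, pool_dim):
    """Non-overlapping mean pooling of a square 2-D feature map."""
    conv_dim = len(conv)
    pooled_dim = conv_dim // pool_dim      # floor(convolvedDim / poolDim)
    pooled = [[0.0] * pooled_dim for _ in range(pooled_dim)]
    for pr in range(pooled_dim):
        for pc in range(pooled_dim):
            # gather the pool_dim x pool_dim block and average it
            block = [conv[r][c]
                     for r in range(pr * pool_dim, (pr + 1) * pool_dim)
                     for c in range(pc * pool_dim, (pc + 1) * pool_dim)]
            pooled[pr][pc] = sum(block) / len(block)
    return pooled

fmap = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(mean_pool(fmap, 2))  # [[3.5, 5.5], [11.5, 13.5]]
```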
Results:
(1) The code for the remaining steps is already provided by Ng. stepSize = 50 is set to avoid running out of memory: the 400 features are processed in 400/50 = 8 batches of 50 feature maps each. If your machine struggles, lower it to something like 10 or 20 (it should still divide 400 evenly).
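The stepSize batching can be sketched as follows (Python; the names are my own and only the chunking is shown, not the actual convolution work done per batch):

```python
def feature_batches(num_features, step_size):
    """Split feature indices into batches of step_size to bound peak memory use."""
    assert num_features % step_size == 0, 'stepSize must divide the feature count'
    return [list(range(s, s + step_size))
            for s in range(0, num_features, step_size)]

batches = feature_batches(400, 50)
print(len(batches))     # 8 batches
print(len(batches[0]))  # 50 features each
```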
(2) B = permute(A, order) rearranges the dimensions of A into the order given by the vector order. B contains exactly the same elements as A, but because the dimensions have been rearranged, the subscripts used to access a given element differ between A and B. The elements of order must all be distinct. For example:
A = randn(13,5,7,2);
size(A)                   % ans = 13 5 7 2
B = permute(A,[3,4,2,1]);
size(B)                   % ans = 7 2 5 13
In other words, the dimensions are reordered, and the subscripts change accordingly when reading the data.
(3) Remember to copy the required files over. The full run takes quite a while, so save the intermediate variables:
save('cnnPooledFeatures.mat', 'pooledFeaturesTrain', 'pooledFeaturesTest');
(4) Results: