Calculate convolutional layer in CNN implementation
Posted: 2015-07-23 12:21:28

[Question] I am trying to train a convolutional neural network with a sparse autoencoder in order to compute the filters for the convolutional layer. I am using the UFLDL code to build the patches and to train the CNN. My code is as follows:
%%=========================================================================
imageDim = 30; % image dimension
imageChannels = 3; % number of channels (rgb, so 3)
patchDim = 10; % patch dimension
numPatches = 100000; % number of patches
visibleSize = patchDim * patchDim * imageChannels; % number of input units
outputSize = visibleSize; % number of output units
hiddenSize = 400; % number of hidden units
epsilon = 0.1; % epsilon for ZCA whitening
poolDim = 10; % dimension of pooling region
optTheta = zeros(2*hiddenSize*visibleSize+hiddenSize+visibleSize, 1);
ZCAWhite = zeros(visibleSize, visibleSize);
meanPatch = zeros(visibleSize, 1);
load patches_16_1
%%=========================================================================
% Display and check to see that the features look good
W = reshape(optTheta(1:visibleSize * hiddenSize), hiddenSize, visibleSize);
b = optTheta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);
displayColorNetwork( (W*ZCAWhite));
stepSize = 100;
assert(mod(hiddenSize, stepSize) == 0, 'stepSize should divide hiddenSize');
load train.mat % loads numTrainImages, trainImages, trainLabels
load train.mat % note: loads train.mat again; testImages is overwritten with trainImages below
% size 30x30x3x8862
numTestImages = 8862;
numTrainImages = 8862;
pooledFeaturesTrain = zeros(hiddenSize, numTrainImages, floor((imageDim - patchDim + 1) / poolDim), floor((imageDim - patchDim + 1) / poolDim) );
pooledFeaturesTest = zeros(hiddenSize, numTestImages, ...
floor((imageDim - patchDim + 1) / poolDim), ...
floor((imageDim - patchDim + 1) / poolDim) );
tic();
testImages = trainImages;
for convPart = 1:(hiddenSize / stepSize)
    featureStart = (convPart - 1) * stepSize + 1;
    featureEnd = convPart * stepSize;
    fprintf('Step %d: features %d to %d\n', convPart, featureStart, featureEnd);
    Wt = W(featureStart:featureEnd, :);
    bt = b(featureStart:featureEnd);

    fprintf('Convolving and pooling train images\n');
    convolvedFeaturesThis = cnnConvolve(patchDim, stepSize, ...
        trainImages, Wt, bt, ZCAWhite, meanPatch);
    pooledFeaturesThis = cnnPool(poolDim, convolvedFeaturesThis);
    pooledFeaturesTrain(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;
    toc();
    clear convolvedFeaturesThis pooledFeaturesThis;

    fprintf('Convolving and pooling test images\n');
    convolvedFeaturesThis = cnnConvolve(patchDim, stepSize, ...
        testImages, Wt, bt, ZCAWhite, meanPatch);
    pooledFeaturesThis = cnnPool(poolDim, convolvedFeaturesThis);
    pooledFeaturesTest(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;
    toc();
    clear convolvedFeaturesThis pooledFeaturesThis;
end
I am having trouble with the convolution and pooling step. At the line pooledFeaturesTrain(featureStart:featureEnd, :, :, :) = pooledFeaturesThis; I get "Subscripted assignment dimension mismatch". The patches themselves are computed correctly; they are:

I am trying to understand what exactly the convPart variable is doing and what pooledFeaturesThis is. Secondly, I noticed that my problem is the mismatch at this line: pooledFeaturesTrain(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;

That is where I get the dimension mismatch message. pooledFeaturesThis has size 100x3x2x2, whereas pooledFeaturesTrain has size 400x8862x2x2. What exactly does pooledFeaturesTrain represent? Is the result of each filter 2x2? cnnConvolve can be found here:
EDIT: I have made some changes to my code and it works now, but I am still a bit unsure about my understanding of it.
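As a quick sanity check, the sizes involved can be derived directly from the constants above. This is only an illustrative sketch based on the values given in the question, not part of the UFLDL code:

% Expected sizes, derived from the constants used above (illustrative sketch)
imageDim = 30; patchDim = 10; poolDim = 10;
hiddenSize = 400; stepSize = 100; numTrainImages = 8862;
convDim = imageDim - patchDim + 1; % 21: size of the valid-convolution output
pooledDim = floor(convDim / poolDim); % 2: each filter yields a 2x2 pooled map
fprintf('pooledFeaturesTrain should be %d x %d x %d x %d\n', hiddenSize, numTrainImages, pooledDim, pooledDim);
fprintf('pooledFeaturesThis should be %d x %d x %d x %d\n', stepSize, numTrainImages, pooledDim, pooledDim);

So each filter does produce a 2x2 pooled map per image, and pooledFeaturesTrain stores one such map for every (feature, image) pair. The fact that pooledFeaturesThis comes back as 100x3x2x2 (3 rather than 8862 in the image dimension) suggests that the channel dimension of the 30x30x3x8862 array is being treated as the image index somewhere inside the convolution step, but that is only a guess, since cnnConvolve is not shown here.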
[Comments]:

So the code is running now and you want to understand it better? Is that the question?

Basically, are pooledFeaturesTest and pooledFeaturesTrain the features I computed for the test and training sets?

[Answer 1]: OK, so in this line you are setting up the pooling region:
poolDim = 10; % dimension of pooling region
This means that for each kernel in each layer you take the image and pool over a 10x10 pixel region. From your code it looks like you are applying a mean function, i.e. it takes a patch, computes its mean, and passes that on to the next layer... in other words, taking the image from 100x100 down to 10x10. In your network you repeat convolution + pooling until, based on this output, you end up with a 2x2 image (which, by the way, is usually not good practice in my experience):

400x8862x2x2
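To make the pooling step concrete, here is a minimal mean-pooling sketch over non-overlapping poolDim x poolDim regions. This illustrates the idea only; it is not the actual cnnPool from the UFLDL exercise, and the variable values are just example numbers:

% Illustrative mean pooling over non-overlapping poolDim x poolDim regions
poolDim = 10;
convDim = 21; % imageDim - patchDim + 1
numFeatures = 100; numImages = 4; % small example values
convolvedFeatures = rand(numFeatures, numImages, convDim, convDim);
pooledDim = floor(convDim / poolDim); % 2
pooledFeatures = zeros(numFeatures, numImages, pooledDim, pooledDim);
for r = 1:pooledDim
    for c = 1:pooledDim
        rows = (r-1)*poolDim + 1 : r*poolDim;
        cols = (c-1)*poolDim + 1 : c*poolDim;
        region = convolvedFeatures(:, :, rows, cols);
        pooledFeatures(:, :, r, c) = mean(mean(region, 4), 3); % average each region
    end
end
size(pooledFeatures) % 100 x 4 x 2 x 2: one 2x2 map per (feature, image) pair

Note that with convDim = 21 and poolDim = 10, the last row and column of the convolved output are simply dropped, which is why the preallocation in the question uses floor().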
Anyway, back to your code. Note that at the start of training you do the following initialization:
pooledFeaturesTrain = zeros(hiddenSize, numTrainImages, floor((imageDim - patchDim + 1) / poolDim), floor((imageDim - patchDim + 1) / poolDim) );
So your error is quite simple and legitimate: the matrix that holds the convolution + pooling output does not have the size of the matrix you initialized.

The question now is how to fix it. I think the lazy way to fix it would be to take the initialization out. That will slow your code down considerably, and it is not guaranteed to work if you have more than one layer.

I would suggest that you instead make pooledFeaturesTrain a structure of 3-dimensional arrays, for example a cell array indexed by the layer. So instead of this
pooledFeaturesTrain(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;
you would do something more like this:
pooledFeaturesTrain{n}(:, :, :) = pooledFeaturesThis;
where n is the current layer.
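One way to read that suggestion is a cell array indexed by the layer, so that each layer can store an output of a different size without a fixed preallocation. A minimal sketch, where the sizes and variable names are only placeholders rather than the actual values from the code above:

% Illustrative per-layer storage in a cell array; sizes are placeholders
numLayers = 2;
pooledFeaturesTrain = cell(numLayers, 1);
for n = 1:numLayers
    % stand-in for the convolve + pool output of layer n (can differ per layer)
    pooledFeaturesThis = rand(100, 50, 2, 2);
    pooledFeaturesTrain{n} = pooledFeaturesThis; % no pre-set size to mismatch against
end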
CNN networks are not as easy as they are made out to be; even when they do not crash, getting them to train well is a feat in itself. I strongly recommend reading up on the theory behind CNNs; it will make both coding and debugging much easier.

Good luck! :)