torch7 size mismatch when feeding an image

Posted: 2017-09-26 10:41:23

Question:

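I am trying to do something with a neural network in torch7. However, when I run my code I get the error /home/thijser/torch/install/share/lua/5.1/nn/Linear.lua:57: size mismatch at /tmp/luarocks_cutorch-scm-1-6477/cutorch/lib/THC/generic/THCTensorMathBlas.cu:52

Here is the code (or at least a minimal example that reproduces the problem):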

require 'torch'
require 'nn'
require 'image'
require 'optim'
require 'cutorch'
require 'cunn'

require 'loadcaffe'
local cmd = torch.CmdLine()
local function main(params)
  cutorch.setDevice(1)
  local loadcaffe_backend = 'nn'
  local cnn = loadcaffe.load('models/VGG_ILSVRC_19_layers-deploy.prototxt', 'models/VGG_ILSVRC_19_layers.caffemodel', loadcaffe_backend):float()
  cnn:cuda()
  targetImage_caffe = image.load('tank.jpg', 3)
  targetImage_caffe = targetImage_caffe:cuda()
  netimage = cnn:forward(targetImage_caffe)

end

local params = cmd:parse(arg)
main(params)

And the full error log:

/home/thijser/torch/install/bin/luajit: /home/thijser/torch/install/share/lua/5.1/nn/Container.lua:67: 
In 39 module of nn.Sequential:
/home/thijser/torch/install/share/lua/5.1/nn/Linear.lua:57: size mismatch at /tmp/luarocks_cutorch-scm-1-6477/cutorch/lib/THC/generic/THCTensorMathBlas.cu:52
stack traceback:
    [C]: in function 'addmv'
    /home/thijser/torch/install/share/lua/5.1/nn/Linear.lua:57: in function </home/thijser/torch/install/share/lua/5.1/nn/Linear.lua:53>
    [C]: in function 'xpcall'
    /home/thijser/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
    /home/thijser/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    temp.lua:24: in function 'main'
    temp.lua:37: in main chunk
    [C]: in function 'dofile'
    ...jser/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x5599d0cfa470

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
    [C]: in function 'error'
    /home/thijser/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
    /home/thijser/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    temp.lua:24: in function 'main'
    temp.lua:37: in main chunk
    [C]: in function 'dofile'
    ...jser/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x5599d0cfa470

The models can be downloaded with:

cd models
wget -c https://gist.githubusercontent.com/ksimonyan/3785162f95cd2d5fee77/raw/bb2b4fe0a9bb0669211cf3d0bc949dfdda173e9e/VGG_ILSVRC_19_layers_deploy.prototxt
wget -c --no-check-certificate https://bethgelab.org/media/uploads/deeptextures/vgg_normalised.caffemodel
wget -c http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_19_layers.caffemodel
cd ..

The output of print(cnn) is:

nn.Sequential 
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> (25) -> (26) -> (27) -> (28) -> (29) -> (30) -> (31) -> (32) -> (33) -> (34) -> (35) -> (36) -> (37) -> (38) -> (39) -> (40) -> (41) -> (42) -> (43) -> (44) -> (45) -> (46) -> output]
  (1): nn.SpatialConvolution(3 -> 64, 3x3, 1,1, 1,1)
  (2): nn.ReLU
  (3): nn.SpatialConvolution(64 -> 64, 3x3, 1,1, 1,1)
  (4): nn.ReLU
  (5): nn.SpatialMaxPooling(2x2, 2,2)
  (6): nn.SpatialConvolution(64 -> 128, 3x3, 1,1, 1,1)
  (7): nn.ReLU
  (8): nn.SpatialConvolution(128 -> 128, 3x3, 1,1, 1,1)
  (9): nn.ReLU
  (10): nn.SpatialMaxPooling(2x2, 2,2)
  (11): nn.SpatialConvolution(128 -> 256, 3x3, 1,1, 1,1)
  (12): nn.ReLU
  (13): nn.SpatialConvolution(256 -> 256, 3x3, 1,1, 1,1)
  (14): nn.ReLU
  (15): nn.SpatialConvolution(256 -> 256, 3x3, 1,1, 1,1)
  (16): nn.ReLU
  (17): nn.SpatialConvolution(256 -> 256, 3x3, 1,1, 1,1)
  (18): nn.ReLU
  (19): nn.SpatialMaxPooling(2x2, 2,2)
  (20): nn.SpatialConvolution(256 -> 512, 3x3, 1,1, 1,1)
  (21): nn.ReLU
  (22): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (23): nn.ReLU
  (24): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (25): nn.ReLU
  (26): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (27): nn.ReLU
  (28): nn.SpatialMaxPooling(2x2, 2,2)
  (29): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (30): nn.ReLU
  (31): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (32): nn.ReLU
  (33): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (34): nn.ReLU
  (35): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (36): nn.ReLU
  (37): nn.SpatialMaxPooling(2x2, 2,2)
  (38): nn.View(-1)
  (39): nn.Linear(25088 -> 4096)
  (40): nn.ReLU
  (41): nn.Dropout(0.500000)
  (42): nn.Linear(4096 -> 4096)
  (43): nn.ReLU
  (44): nn.Dropout(0.500000)
  (45): nn.Linear(4096 -> 1000)
  (46): nn.SoftMax

while print(targetImage_caffe:size()) gives me:

    3
  660
 1045
[torch.LongStorage of size 3]

Does anyone know how to fix this, or what I am doing wrong?

Comments:

Can you give us the output of print(cnn) and print(targetImage_caffe:size())?

@fonfonx I have added the information.

Answer 1:

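The problem is that you are using VGG19, which is designed to take 224 x 224 images as input. You are feeding it a 660 x 1045 image (which is unusual in itself, since most convnets use square images), so the error occurs at module 39, as the stack trace shows. That module is a Linear layer expecting 25088 inputs, but the tensor reaching it now holds about 327680 values: each pooling layer divides the number of pixels by roughly 4, and you have 512 feature maps, which leaves 20 x 32 x 512 values after the five pooling layers.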
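The element counts quoted here can be checked with a few lines of plain Lua. This is a minimal sketch, assuming that the 3x3 convolutions (stride 1, padding 1) leave the spatial size unchanged and that each 2x2 max-pooling layer halves it, rounding down. flattened_size is a hypothetical helper, not part of torch or nn.

```lua
-- Hypothetical helper (not part of torch/nn): number of values that reach
-- nn.View(-1) in the VGG-19 printed in the question, for a given input size.
-- Assumes each of the 5 pooling layers halves height and width, rounding down,
-- and that the convolutions keep the spatial size unchanged.
local function flattened_size(height, width)
  local feature_maps = 512   -- output planes of the last convolution
  local pooling_layers = 5   -- modules 5, 10, 19, 28 and 37
  for _ = 1, pooling_layers do
    height = math.floor(height / 2)
    width = math.floor(width / 2)
  end
  return feature_maps * height * width
end

print(flattened_size(224, 224))   -- 7 * 7 * 512 = 25088
print(flattened_size(660, 1045))  -- 20 * 32 * 512 = 327680
```

Only the first result matches the 25088 inputs that nn.Linear(25088 -> 4096) expects.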

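The solution is therefore to use a 224 x 224 image. After the 5 pooling layers you then get a tensor of size (224 / 2^5) x (224 / 2^5) x 512 = 7 x 7 x 512 = 25088, which is exactly what the Linear module expects.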
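A minimal sketch of that fix, assuming it is acceptable to rescale the image without preserving its aspect ratio. It uses image.scale from the image package that the question already requires. to_vgg_input is a hypothetical helper name, and a random tensor stands in for tank.jpg so that the snippet runs without the model files or a GPU.

```lua
require 'torch'
require 'image'

-- Hypothetical helper: rescale a 3 x H x W image to the 3 x 224 x 224 input
-- implied by the Linear(25088 -> 4096) layer. The aspect ratio is not kept.
local function to_vgg_input(img)
  return image.scale(img, 224, 224)  -- arguments are (src, width, height)
end

-- Stand-in for image.load('tank.jpg', 3), which was 3 x 660 x 1045.
local img = torch.rand(3, 660, 1045):float()
local resized = to_vgg_input(img)
print(resized:size())
```

In the question's script this would amount to calling the helper on targetImage_caffe before it is moved to the GPU with :cuda(), and then passing the result to cnn:forward. Cropping to a square first would avoid distorting the image, at the cost of discarding part of it.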
