Caffe: Unknown bottom blob
[Posted]: 2016-08-02 21:42:52
[Question]:
I am using the Caffe framework and want to train the network listed below.
When I run the following command:
caffe train --solver solver.prototxt
it fails with this error:
F0802 14:31:54.506695 28038 insert_splits.cpp:29] Unknown bottom blob 'image' (layer 'conv1', bottom index 0)
*** Check failure stack trace: ***
@ 0x7ff2941c3f9d google::LogMessage::Fail()
@ 0x7ff2941c5e03 google::LogMessage::SendToLog()
@ 0x7ff2941c3b2b google::LogMessage::Flush()
@ 0x7ff2941c67ee google::LogMessageFatal::~LogMessageFatal()
@ 0x7ff2947cedbe caffe::InsertSplits()
@ 0x7ff2948306de caffe::Net<>::Init()
@ 0x7ff294833a81 caffe::Net<>::Net()
@ 0x7ff29480ce6a caffe::Solver<>::InitTestNets()
@ 0x7ff29480ee85 caffe::Solver<>::Init()
@ 0x7ff29480f19a caffe::Solver<>::Solver()
@ 0x7ff2947f4343 caffe::Creator_SGDSolver<>()
@ 0x40b1a0 (unknown)
@ 0x407373 (unknown)
@ 0x7ff292e40741 __libc_start_main
@ 0x407b79 (unknown)
Abortado (`core' generado)
The network definition (train2.prototxt):
name: "xxxxxx"
layer {
  name: "image"
  type: "HDF5Data"
  top: "image"
  top: "label"
  hdf5_data_param {
    source: "h5a.train.h5.txt"
    batch_size: 64
  }
  include {
    phase: TRAIN
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "image"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "norm1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool1"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv3"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "improd3"
  type: "InnerProduct"
  bottom: "pool2"
  top: "improd3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "improd3"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "improd3"
  bottom: "label"
  top: "loss"
}
solver.prototxt:
net: "train2.prototxt"
test_iter: 100
test_interval: 1000
# lr for fine-tuning should be lower than when starting from scratch
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
# stepsize should also be lower, as we're closer to being done
stepsize: 20000
display: 20
max_iter: 100000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "caffe"
solver_mode: CPU
Because of this error I am stuck and cannot start training the network.
[Comments]:
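What keys are stored in the hdf5 data? If you don't know, do the following: your *h5.txt file should contain only one line. Open a Python console, copy that line and assign it to myPath. Run import h5py. Then execute: with h5py.File(myPath,'r') as hf: print hf.keys(). Let me know what output you get. You should see image in the output.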
@ParagS.Chandakkar Alternatively, you can run the shell command h5ls on the .h5 file to get the keys and more information.
@ParagS.Chandakkar It returns: [u'image', u'label']
@lennin92 Then try giving the full path in the source field instead of just "h5a.train.h5.txt".
@ParagS.Chandakkar I fixed it. The problem was that in train2.prototxt the "image" layer contains include { phase: TRAIN }. For some reason, when I tried to train, Caffe was using the TEST phase.
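The path suggestion in the comments can be checked with a short script before launching training. This is a minimal sketch, assuming that the HDF5Data source file is a plain-text list with one HDF5 file path per line and that relative entries are resolved against the directory the training command is started from. The helper name missing_hdf5_files is hypothetical and not part of Caffe.

```python
import os
from typing import List


def missing_hdf5_files(list_file: str) -> List[str]:
    """Return the entries of list_file that do not exist on disk.

    Assumption: list_file holds one HDF5 file path per line. Relative
    paths are checked against the current working directory.
    """
    missing = []
    with open(list_file, "r") as handle:
        for line in handle:
            path = line.strip()
            if not path:
                continue  # ignore blank lines
            if not os.path.isfile(path):
                missing.append(path)
    return missing
```

Calling missing_hdf5_files("h5a.train.h5.txt") from the directory where caffe train is run should return an empty list if every listed file can be found. In this thread the paths turned out not to be the cause, so the check only rules out one possibility.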
[Answer 1]:
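This happens because, even when you only intend to run the TRAIN phase, a TEST-phase network is also built for validation. The only data layer in train2.prototxt is restricted to phase: TRAIN, so the TEST network has no input data layer and conv1 cannot find its input blob image. The TEST phase is invoked because the solver defines the test_* parameters (test_iter, test_interval) and some layers in train2.prototxt are marked phase: TEST. Removing those parameters from the solver, together with the layers that exist only for the TEST phase, lets training start.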
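The explanation above can be illustrated with a small simulation. This is a simplified sketch, not Caffe's actual implementation: the NET table is a hand-transcribed summary of train2.prototxt, and the helpers layer and unknown_bottoms are hypothetical names. It keeps only the layers whose include phase matches, then reports every bottom blob that no earlier layer has produced.

```python
from typing import Dict, List, Optional, Tuple


def layer(name: str, bottoms: List[str], tops: List[str],
          phase: Optional[str] = None) -> Dict:
    """Describe one layer; phase=None means the layer is used in every phase."""
    return {"name": name, "bottoms": bottoms, "tops": tops, "phase": phase}


# Hand-transcribed from train2.prototxt (names, bottoms, tops, include phase).
NET = [
    layer("image", [], ["image", "label"], phase="TRAIN"),
    layer("conv1", ["image"], ["conv1"]),
    layer("norm1", ["conv1"], ["norm1"]),
    layer("pool1", ["norm1"], ["pool1"]),
    layer("norm2", ["pool1"], ["norm2"]),
    layer("conv3", ["norm2"], ["conv3"]),
    layer("pool2", ["conv3"], ["pool2"]),
    layer("improd3", ["pool2"], ["improd3"]),
    layer("accuracy", ["improd3", "label"], ["accuracy"], phase="TEST"),
    layer("loss", ["improd3", "label"], ["loss"]),
]


def unknown_bottoms(net: List[Dict], phase: str) -> List[Tuple[str, str]]:
    """List (layer, bottom) pairs whose bottom blob is never produced in `phase`."""
    available = set()
    missing = []
    for entry in net:
        if entry["phase"] not in (None, phase):
            continue  # layer is excluded from this phase
        for bottom in entry["bottoms"]:
            if bottom not in available:
                missing.append((entry["name"], bottom))
        available.update(entry["tops"])
    return missing


print(unknown_bottoms(NET, "TRAIN"))  # []
print(unknown_bottoms(NET, "TEST"))
# [('conv1', 'image'), ('accuracy', 'label'), ('loss', 'label')]
```

The first TEST-phase entry, conv1 missing image, matches the fatal message in the log. Caffe aborts at that first unknown blob, which is why the missing label blob is never reported.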