Check fail: how to use hdf5 data layer in deep layer?

Posted: 2016-03-30 23:40:26

Question:

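My training data and labels are stored in data.mat. (I have 200 training samples with 6000 features each; the labels are (-1, +1) and are also saved in data.mat.)

I am trying to convert my data (train and test) to hdf5 and run Caffe, using the following: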

load input.mat
hdf5write('my_data.h5', '/new_train_x', single( permute(reshape(new_train_x,[200, 6000, 1, 1]),[4:-1:1] ) ));
hdf5write('my_data.h5', '/label_train', single( permute(reshape(label_train,[200, 1, 1, 1]), [4:-1:1] ) ) , 'WriteMode', 'append' );
hdf5write('my_data_test.h5', '/test_x', single( permute(reshape(test_x,[77, 6000, 1, 1]),[4:-1:1] ) ));
hdf5write('my_data_test.h5', '/label_test', single( permute(reshape(label_test,[77, 1, 1, 1]), [4:-1:1] ) ) , 'WriteMode', 'append' );

(For converting mat files to hdf5 in Matlab, see this thread.)

My train_val.prototxt is:

layer {
  type: "HDF5Data"
  name: "data"
  top: "new_train_x"     # note: same name as in HDF5
  top: "label_train"     #
  hdf5_data_param {
    source: "file.txt"
    batch_size: 20
  }
  include { phase: TRAIN }
}
layer {
  type: "HDF5Data"
  name: "data"
  top: "test_x"     # note: same name as in HDF5
  top: "label_test"     #
  hdf5_data_param {
    source: "file_test.txt"
    batch_size: 20
  }
  include { phase: TEST }
}

layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "new_train_x"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 30
    weight_filler {
      type: "gaussian" # initialize the filters from a Gaussian
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "tanh1"
  type: "TanH"
  bottom: "ip1"
  top: "tanh1"
}

layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "tanh1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 1
    weight_filler {
      type: "gaussian" # initialize the filters from a Gaussian
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "loss"
  type: "TanH"
  bottom: "ip2"
  bottom: "label_train"
  top: "loss"
}

But I have a problem. It seems that Caffe cannot read my input data.

I1227 10:27:21.880826  7186 layer_factory.hpp:76] Creating layer data
I1227 10:27:21.880851  7186 net.cpp:110] Creating Layer data
I1227 10:27:21.880866  7186 net.cpp:433] data -> new_train_x
I1227 10:27:21.880893  7186 net.cpp:433] data -> label_train
I1227 10:27:21.880915  7186 hdf5_data_layer.cpp:81] Loading list of HDF5 filenames from: file.txt
I1227 10:27:21.880965  7186 hdf5_data_layer.cpp:95] Number of HDF5 files: 1
I1227 10:27:21.962596  7186 net.cpp:155] Setting up data
I1227 10:27:21.962702  7186 net.cpp:163] Top shape: 20 6000 1 1 (120000)
I1227 10:27:21.962738  7186 net.cpp:163] Top shape: 20 1 1 1 (20)
I1227 10:27:21.962772  7186 layer_factory.hpp:76] Creating layer ip1
I1227 10:27:21.962838  7186 net.cpp:110] Creating Layer ip1
I1227 10:27:21.962873  7186 net.cpp:477] ip1 <- new_train_x
I1227 10:27:21.962918  7186 net.cpp:433] ip1 -> ip1
I1227 10:27:21.979375  7186 net.cpp:155] Setting up ip1
I1227 10:27:21.979434  7186 net.cpp:163] Top shape: 20 30 (600)
I1227 10:27:21.979478  7186 layer_factory.hpp:76] Creating layer tanh1
I1227 10:27:21.979529  7186 net.cpp:110] Creating Layer tanh1
I1227 10:27:21.979557  7186 net.cpp:477] tanh1 <- ip1
I1227 10:27:21.979583  7186 net.cpp:433] tanh1 -> tanh1
I1227 10:27:21.979620  7186 net.cpp:155] Setting up tanh1
I1227 10:27:21.979650  7186 net.cpp:163] Top shape: 20 30 (600)
I1227 10:27:21.979670  7186 layer_factory.hpp:76] Creating layer ip2
I1227 10:27:21.979696  7186 net.cpp:110] Creating Layer ip2
I1227 10:27:21.979720  7186 net.cpp:477] ip2 <- tanh1
I1227 10:27:21.979746  7186 net.cpp:433] ip2 -> ip2
I1227 10:27:21.979796  7186 net.cpp:155] Setting up ip2
I1227 10:27:21.979825  7186 net.cpp:163] Top shape: 20 1 (20)
I1227 10:27:21.979854  7186 layer_factory.hpp:76] Creating layer loss
I1227 10:27:21.979881  7186 net.cpp:110] Creating Layer loss
I1227 10:27:21.979909  7186 net.cpp:477] loss <- ip2
I1227 10:27:21.979931  7186 net.cpp:477] loss <- label_train
I1227 10:27:21.979962  7186 net.cpp:433] loss -> loss
F1227 10:27:21.980006  7186 layer.hpp:374] Check failed: ExactNumBottomBlobs() == bottom.size() (1 vs. 2) TanH Layer takes 1 bottom blob(s) as input.
*** Check failure stack trace: ***
    @     0x7f44cbc68ea4  (unknown)
    @     0x7f44cbc68deb  (unknown)
    @     0x7f44cbc687bf  (unknown)
    @     0x7f44cbc6ba35  (unknown)
    @     0x7f44cbfd0ba8  caffe::Layer<>::CheckBlobCounts()
    @     0x7f44cbfed9da  caffe::Net<>::Init()
    @     0x7f44cbfef108  caffe::Net<>::Net()
    @     0x7f44cc03f71a  caffe::Solver<>::InitTrainNet()
    @     0x7f44cc040a51  caffe::Solver<>::Init()
    @     0x7f44cc040db9  caffe::Solver<>::Solver()
    @           0x41222d  caffe::GetSolver<>()
    @           0x408ed9  train()
    @           0x406741  main
    @     0x7f44ca997a40  (unknown)
    @           0x406f69  _start
    @              (nil)  (unknown)
Aborted (core dumped)

Now, if I change the loss layer like this:

layer {
  name: "loss"
  type: "TanH"
  bottom: "ip2"
  top: "loss"
}

I get this error:

F1227 10:53:17.884419  9102 insert_splits.cpp:35] Unknown bottom blob 'new_train_x' (layer 'ip1', bottom index 0)
*** Check failure stack trace: ***
    @     0x7f502ab5dea4  (unknown)
    @     0x7f502ab5ddeb  (unknown)
    @     0x7f502ab5d7bf  (unknown)
    @     0x7f502ab60a35  (unknown)
    @     0x7f502af1d75b  caffe::InsertSplits()
    @     0x7f502aee19e9  caffe::Net<>::Init()
    @     0x7f502aee4108  caffe::Net<>::Net()
    @     0x7f502af35172  caffe::Solver<>::InitTestNets()
    @     0x7f502af35abd  caffe::Solver<>::Init()
    @     0x7f502af35db9  caffe::Solver<>::Solver()
    @           0x41222d  caffe::GetSolver<>()
    @           0x408ed9  train()
    @           0x406741  main
    @     0x7f502988ca40  (unknown)
    @           0x406f69  _start
    @              (nil)  (unknown)
Aborted (core dumped)

Thank you very much! Any advice would be greatly appreciated!

Answer 1:

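Your data layer is defined only for phase: TRAIN. I believe the error occurs when caffe tries to construct the test-time net (i.e., the phase: TEST net). You should have an additional layer with the test data: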

layer {
  type: "HDF5Data"
  name: "data"
  top: "new_train_x"     # note: same name as in HDF5
  top: "label_train"     #
  hdf5_data_param {
    source: "test_file.txt"
    batch_size: 20
  }
  include { phase: TEST }  # do not forget TEST phase
}

By the way, if you do not want to test your net during training, you can switch this option off. See this thread for more information.


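Update: Please forgive me for being blunt, but you have made quite a mess.

    1. "TanH" is not a loss layer; it is a neuron/activation layer. It serves as the non-linearity applied to a linear layer (conv/inner-product). As such, it accepts a single input (bottom blob) and outputs a single blob (top).
    2. A loss layer computes a scalar loss value and usually requires two inputs: the prediction and the ground truth to compare it against.
    3. You did change your net and added an "HDF5Data" layer for the TEST phase, but that layer outputs top: "test_x", and no layer in your net expects bottom: "test_x". You only have layers expecting bottom: "new_train_x". The same goes for "label_test".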

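To illustrate the first two points, here is a minimal sketch of how the end of the net could look with an actual loss layer. The choice of "EuclideanLoss" and the layer name "tanh2" are assumptions made for illustration only; the answer does not name a specific loss, and a different one may suit ±1 labels better. The blob name "label_train" is the one used in the question.

```prototxt
# Assumption: squash ip2 into (-1, +1) so it is comparable to the ±1 labels.
layer {
  name: "tanh2"            # hypothetical name, not in the question
  type: "TanH"
  bottom: "ip2"            # activation layer: exactly one bottom
  top: "tanh2"
}
# Assumption: EuclideanLoss is one possible loss, chosen only as an example.
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "tanh2"          # prediction
  bottom: "label_train"    # ground truth, as named in the question
  top: "loss"
}
```

The point of the sketch is the shape of the layers: the activation takes one bottom, while the loss takes two (prediction and ground truth).
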
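I suggest you rewrite your hdf5 files with more generic names (e.g., x and label) for both train and test, and simply use different file names to tell them apart. That way your net works with "x" and "label" in both phases and only loads the appropriate dataset according to the phase.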
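
A rough sketch of the two data layers under that naming scheme might look as follows. It assumes the hdf5 files have been rewritten so that each contains datasets named /x and /label, and that file.txt and file_test.txt (the list files from the question) point to the train and test hdf5 files respectively.

```prototxt
# Assumption: both HDF5 files store their datasets as /x and /label.
layer {
  name: "data"
  type: "HDF5Data"
  top: "x"
  top: "label"
  hdf5_data_param {
    source: "file.txt"        # lists the training HDF5 file(s)
    batch_size: 20
  }
  include { phase: TRAIN }
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "x"                    # same blob names as in the TRAIN layer
  top: "label"
  hdf5_data_param {
    source: "file_test.txt"   # lists the test HDF5 file(s)
    batch_size: 20
  }
  include { phase: TEST }
}
# Downstream layers then use bottom: "x" (e.g., "ip1")
# and bottom: "label" (the loss layer) in both phases.
```

Because both layers produce the same top names, the rest of the net does not need any phase-specific layers.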

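Comments:

I defined the TEST phase, but it did not work either; I still have the same problem. :(

@ahmadnavidghanizadeh Please post your prototxt and the entire log.

Thank you very much for your time and attention, dear shai.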
