Classifying your own data with Caffe

Posted hansjorn


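This post walks through a single example: training AlexNet on your own data.

Running AlexNet on your own data
Reference 1: http://blog.csdn.net/gybheroin/article/details/54095399
Reference 2: http://www.cnblogs.com/alexcai/p/5469436.html
1. Prepare the data
Create a new folder under the data directory in the Caffe root. Any name will do; I called mine food. Inside food, create two folders, train and val, to hold the training and validation data.
Inside train, create one folder per class (toast, pizza, and so on), as many folders as there are classes, and put the corresponding images into each. (You can also keep all images in a single folder, as long as the label file gives the class of each image.)
./data/food/train/ (pizza, sandwich, etc.)
./data/food/val/   (pizza, sandwich, etc.)
Then create train.txt, val.txt and category.txt in the food directory.
--- train.txt and val.txt look like this (image path relative to the image root, then the integer class label):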
toast/62.jpg 0
toast/107.jpg 0
toast/172.jpg 0
pizza/62.jpg 1
pizza/107.jpg 1
pizza/172.jpg 1
--- category.txt looks like this:
0 toast
1 pizza
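
The three list files can be written by hand, but for thousands of images a small script is easier. The following is a minimal sketch, assuming the folder layout described above (food/train/<class>/ and food/val/<class>/). The function names build_listing and write_lists are illustrative and not part of Caffe, and the order of class_names decides which integer label each class receives.

```python
import os

IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".bmp")


def build_listing(split_dir, class_names):
    """Return 'class/file label' lines for every image under split_dir."""
    lines = []
    for label, name in enumerate(class_names):
        class_dir = os.path.join(split_dir, name)
        if not os.path.isdir(class_dir):
            continue  # a class may be missing from one split
        for fname in sorted(os.listdir(class_dir)):
            if fname.lower().endswith(IMAGE_EXTS):
                lines.append("%s/%s %d" % (name, fname, label))
    return lines


def write_lists(food_dir, class_names):
    """Write train.txt, val.txt and category.txt into food_dir."""
    for split in ("train", "val"):
        lines = build_listing(os.path.join(food_dir, split), class_names)
        with open(os.path.join(food_dir, split + ".txt"), "w") as f:
            f.write("\n".join(lines) + "\n")
    with open(os.path.join(food_dir, "category.txt"), "w") as f:
        for label, name in enumerate(class_names):
            f.write("%d %s\n" % (label, name))
```

Calling write_lists("data/food", ["toast", "pizza"]) would give toast the label 0 and pizza the label 1, matching the listing above.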


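Note: the images need to be split into two sets, a training set (train) and a test set (test, the val folder above). The training set should generally be at least about 5 times the size of the test set, and no class should have too few images. Here I used roughly 5000 training images and 1000 test images per class.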
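
One way to get the roughly 5:1 split for a class is to shuffle its file names and cut off one sixth for testing. This is only a sketch: split_train_val, the fixed seed and the generated file names are assumptions for illustration, not something the original workflow prescribes.

```python
import random


def split_train_val(filenames, val_fraction=1.0 / 6, seed=0):
    """Shuffle filenames and split them so that train:val is about 5:1."""
    files = sorted(filenames)              # sort first so the result is reproducible
    random.Random(seed).shuffle(files)     # seeded shuffle, independent of global state
    n_val = int(round(len(files) * val_fraction))
    return files[n_val:], files[:n_val]


# 6000 hypothetical images of one class -> 5000 for training, 1000 for testing
train_files, val_files = split_train_val(["%d.jpg" % i for i in range(6000)])
print(len(train_files), len(val_files))
```

Splitting per class, rather than over the whole pool at once, keeps the class proportions the same in both sets.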

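2. Build the LMDB (optional: to train directly on image files, skip the LMDB and set the data layer type in the training prototxt to type: "ImageData")
A successfully built Caffe ships a convert_imageset tool for converting the data (convert_imageset.exe under bin in the Caffe root on Windows; the script below uses build/tools/convert_imageset). Create a script create_foodnet.sh in the food folder, modeled on examples/imagenet/create_imagenet.sh: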

#!/usr/bin/env sh
# Create the food lmdb inputs
# N.B. set the paths to the train + val image dirs
set -e

EXAMPLE=data/food  # where the generated lmdb data is written
DATA=data/food  # where train.txt and val.txt live
TOOLS=build/tools

TRAIN_DATA_ROOT=/path/to/imagenet/train/    # set to your training image root, e.g. data/food/train/
VAL_DATA_ROOT=/path/to/imagenet/val/        # set to your validation image root, e.g. data/food/val/

# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=false
if $RESIZE; then
  RESIZE_HEIGHT=256
  RESIZE_WIDTH=256
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_foodnet.sh to the path" \
       "where the training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_foodnet.sh to the path" \
       "where the validation data is stored."
  exit 1
fi

echo "Creating train lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/food_train_lmdb   # path of the generated lmdb

echo "Creating val lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $VAL_DATA_ROOT \
    $DATA/val.txt \
    $EXAMPLE/food_val_lmdb     # path of the generated lmdb

echo "Done."


3. Generate the mean file (binaryproto)

Next, use the lmdb to generate the mean file used during training:
EXAMPLE=data/food
DATA=data/food
TOOLS=build/tools
$TOOLS/compute_image_mean $EXAMPLE/food_train_lmdb $DATA/foodnet_mean.binaryproto
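
To make clear what the mean file holds: compute_image_mean averages the training images position by position, and the data layer later subtracts that mean image from each input. The pure-Python sketch below reproduces only the arithmetic on tiny nested lists. It does not read an lmdb or write a binaryproto, and the names mean_image and subtract_mean are illustrative.

```python
def mean_image(images):
    """Per-pixel mean of equally sized images stored as [channel][row][col] lists."""
    count = float(len(images))
    channels = len(images[0])
    height = len(images[0][0])
    width = len(images[0][0][0])
    mean = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for img in images:
        for c in range(channels):
            for y in range(height):
                for x in range(width):
                    mean[c][y][x] += img[c][y][x] / count
    return mean


def subtract_mean(img, mean):
    """Subtract the mean image from one image of the same shape."""
    return [[[img[c][y][x] - mean[c][y][x]
              for x in range(len(img[c][y]))]
             for y in range(len(img[c]))]
            for c in range(len(img))]
```

Because the mean is taken per position, all images in the lmdb need the same size, which is one reason to resize them when building the lmdb.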

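4. Modify the solver and the training network

------ Solver.prototxt explained:
# Number of test iterations. Each test iteration runs one batch (batch_size
# images) through the network, so to test every image in the validation set
# once, this value times the TEST batch_size should equal the number of
# validation images: test_iter * batch_size = val_num.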
test_iter: 299  

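# How many training iterations to run between tests. One iteration is one
# batch of images going through a full forward and backward pass. With 224
# here, the network's accuracy is validated every 224 iterations. Usually we
# want one full pass over the training set before each test, so this value
# times the TRAIN data layer's batch_size should equal the number of training
# images: test_interval * batch_size = train_num.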
test_interval: 224

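# Base learning rate. Too high a learning rate can leave the loss stuck at
# 87.3365, or keep the loss from converging at all. Too low a learning rate
# makes the network converge slowly and can leave the updates vanishingly small.
# 0.01 is a common setting.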
base_lr: 0.01  
display: 20  
max_iter: 6720  
lr_policy: "step"  
gamma: 0.1  
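momentum: 0.9   # momentum: weight given to the previous parameter update
weight_decay: 0.0001  
stepsize: 2218  # multiply the learning rate by gamma every stepsize iterations
snapshot: 224   # save the training state (the .caffemodel) every this many iterations
snapshot_prefix: "food/food_net/food_alex_snapshot"     # snapshot path and filename prefix
solver_mode: GPU  
net: "train_val.prototxt"  # path to the network definition file
solver_type: SGD  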
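
The iteration counts in the solver follow from the dataset and batch sizes, so they can be computed rather than guessed. The sketch below is a worked example under stated assumptions: 57344 training images and 14950 validation images are back-calculated from the values above (224 × 256 and 299 × 50) and are not stated in the original, and 30 epochs is likewise inferred from max_iter = 6720. step_lr follows Caffe's "step" policy, base_lr * gamma ^ floor(iter / stepsize).

```python
import math


def solver_counts(train_num, val_num, train_batch, val_batch, epochs):
    """Return (test_iter, test_interval, max_iter) for the given sizes."""
    test_iter = int(math.ceil(val_num / float(val_batch)))          # batches to cover val once
    test_interval = int(math.ceil(train_num / float(train_batch)))  # iterations per epoch
    max_iter = test_interval * epochs
    return test_iter, test_interval, max_iter


def step_lr(base_lr, gamma, stepsize, iteration):
    """Learning rate at a given iteration under the "step" policy."""
    return base_lr * gamma ** (iteration // stepsize)


# Assumed dataset sizes, chosen to reproduce the solver values above
print(solver_counts(57344, 14950, 256, 50, 30))  # (299, 224, 6720)
```

With stepsize 2218 and max_iter 6720, the learning rate is lowered three times over the run, roughly once per third of training.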

----- Changes to train_val.prototxt
###### Data layer reading raw images (the main difference is the data layer: image files are the input)
layer {
  name: "data"
  type: "ImageData" ### note: ImageData trains directly from image files
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  image_data_param { ###
    source: "examples/finetune_myself/train.txt"  ###
    batch_size: 50
    new_height: 256 ###
    new_width: 256 ###
  }
}

##### Data layer reading lmdb (the generated lmdb is the input)
layer {
  name: "data"
  type: "Data" ### Data here: trains from the images converted to lmdb
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  data_param {  ###
    source: "examples/imagenet/car_train_lmdb" ###
    batch_size: 256 
    backend: LMDB ###
  }
}

The complete network definition:
name: "AlexNet"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "mimg_mean.binaryproto" # mean file
  }
  data_param {
    source: "mtrainldb"  # training data
    batch_size: 256
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mirror: false
    crop_size: 227
    mean_file: "mimg_mean.binaryproto"  # mean file
  }
  data_param {
    source: "mvaldb"   # validation data
    batch_size: 50
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "norm1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "conv2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "norm2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2       # note: change this to the number of classes you want to classify
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}
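
To check that a 227×227 crop fits the layer parameters above, the spatial size can be traced through the network. This is a sketch of the arithmetic only: it assumes Caffe's usual rounding (floor for convolution, ceil for pooling), and the helper names are illustrative.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Output width/height of a convolution layer (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Output width/height of a pooling layer (rounds up)."""
    return int(math.ceil((size - kernel) / float(stride))) + 1


def alexnet_fc6_inputs(crop_size=227):
    """Number of values fed into fc6 for the network listed above."""
    s = conv_out(crop_size, 11, stride=4)  # conv1: 227 -> 55
    s = pool_out(s, 3, 2)                  # pool1: 55 -> 27
    s = conv_out(s, 5, pad=2)              # conv2: 27 -> 27
    s = pool_out(s, 3, 2)                  # pool2: 27 -> 13
    for _ in range(3):                     # conv3, conv4, conv5 keep 13
        s = conv_out(s, 3, pad=1)
    s = pool_out(s, 3, 2)                  # pool5: 13 -> 6
    return 256 * s * s                     # conv5 has 256 output channels


print(alexnet_fc6_inputs())  # 9216
```

ReLU and LRN layers do not change the spatial size, so they are left out of the trace.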

Run the following script to train:
#!/usr/bin/env sh
set -e

./build/tools/caffe train \
    --solver=food/food_alexnet/solver.prototxt
    
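5. Testing
Testing also needs the class label file, category.txt, with the same content as above. Edit deploy.prototxt, then run the test, passing the label file before the image:
./bin/classification "food/foodnet/deploy.prototxt" "food/foodnet/food_iter_100000.caffemodel" "ming_mean.binaryproto" "category.txt" "test001.jpg"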
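
The label file is what turns the network's output vector back into class names. The sketch below shows that mapping in isolation, assuming category.txt lines of the form "<label> <name>" as above. load_categories and top_k are illustrative names, and the probabilities are made up for the example.

```python
def load_categories(lines):
    """Parse '<label> <name>' lines into a {label: name} dict."""
    names = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        label, name = line.split(None, 1)
        names[int(label)] = name
    return names


def top_k(probs, names, k=1):
    """Return the k most probable (name, probability) pairs, best first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(names[i], probs[i]) for i in order[:k]]


names = load_categories(["0 toast", "1 pizza"])
print(top_k([0.2, 0.8], names))  # [('pizza', 0.8)]
```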

------------------------------------    
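---------------- Fine-tuning:
http://www.cnblogs.com/denny402/p/5074212.html
http://www.cnblogs.com/alexcai/p/5469478.html
1. When fine-tuning, rename the last fully connected layer and change its number of outputs to your number of classes. Give that layer a relatively large learning rate, because only its weights are trained from scratch; all the other layers start from already-trained weights.
2. When starting training, pass the model to be fine-tuned as the last argument:
./build/tools/caffe train -solver examples/money_test/fine_tune/solver.prototxt -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel
Here -weights points to the trained CaffeNet model.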
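
The reason for renaming the last layer is that pretrained weights are matched to layers by name. A layer whose name appears in the pretrained model receives its weights, while a layer with a new name keeps its fresh initialization. The toy sketch below imitates that matching with plain dicts of weight shapes. It is not Caffe code, and the layer name fc8_food is hypothetical.

```python
def plan_finetune(pretrained_shapes, new_shapes):
    """Return (copied, fresh) layer names, matching layers purely by name."""
    copied, fresh = [], []
    for name, shape in new_shapes.items():
        if name not in pretrained_shapes:
            fresh.append(name)   # new name: trained from scratch
        elif pretrained_shapes[name] != shape:
            # same name but different size: the weights cannot be copied
            raise ValueError("shape mismatch for layer %r" % name)
        else:
            copied.append(name)  # same name and shape: weights are reused
    return sorted(copied), sorted(fresh)


pretrained = {"fc7": (4096, 4096), "fc8": (1000, 4096)}
finetune = {"fc7": (4096, 4096), "fc8_food": (2, 4096)}  # fc8 renamed, 2 classes
print(plan_finetune(pretrained, finetune))  # (['fc7'], ['fc8_food'])
```

Keeping the name fc8 while changing num_output to 2 would hit the mismatch branch, which is why the rename and the class count change go together.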

 
