Classifying your own data with Caffe
Posted hansjorn
Only one example is given here: training the AlexNet network on your own data.
Running your own data with AlexNet

Reference 1: http://blog.csdn.net/gybheroin/article/details/54095399
Reference 2: http://www.cnblogs.com/alexcai/p/5469436.html

1. Prepare the data

Under the data folder in the Caffe root directory, create a new folder with any name you like (I called mine food). Inside food, create two folders, train and val, to hold the training and validation data. Inside train, create one folder per class (toast, pizza, and so on) and put the corresponding images into each. (You can also put all images into a single folder, as long as the label file marks the class of each image.)

    ./data/food/
        train/    (pizza, sandwich, ...)
        val/      (pizza, sandwich, ...)

Then create train.txt, val.txt and category.txt in the food directory.

train.txt and val.txt look like this:

    toast/62.jpg 0
    toast/107.jpg 0
    toast/172.jpg 0
    pizza/62.jpg 1
    pizza/107.jpg 1
    pizza/172.jpg 1

category.txt looks like this:

    0 toast
    1 pizza

Note: the images need to be split into two sets, a training set (train) and a test set (the val folder here). The ratio of training to test images is usually about 5:1 or higher, and no class should have too few images. I used roughly 5000 training images plus 1000 test images per class.

2. Create the LMDB

(You can also skip the LMDB conversion. In that case, change the type of the data layer in the training configuration to type: "ImageData", which trains directly from the image files.)

Once Caffe has been built, the bin folder under its root directory contains convert_imageset.exe (the script below calls it as build/tools/convert_imageset), which converts the data. In the food folder, create a script create_foodnet.sh, modeled on examples/imagenet/create_imagenet.sh:

    #!/usr/bin/env sh
    # Create the imagenet lmdb inputs
    # N.B. set the path to the imagenet train + val data dirs
    # Run from the Caffe root directory, since the paths below are relative.
    set -e

    EXAMPLE=data/food    # the path of generated lmdb data
    DATA=data/food       # the txt path of train and test data
    TOOLS=build/tools

    # Set these to your own image folders, e.g. data/food/train/ and
    # data/food/val/. Keep the trailing slash: the entries in train.txt
    # and val.txt are appended to these paths.
    TRAIN_DATA_ROOT=/path/to/imagenet/train/
    VAL_DATA_ROOT=/path/to/imagenet/val/

    # Set RESIZE=true to resize the images to 256x256. Leave as false if images have
    # already been resized using another tool.
    RESIZE=false
    if $RESIZE; then
      RESIZE_HEIGHT=256
      RESIZE_WIDTH=256
    else
      RESIZE_HEIGHT=0
      RESIZE_WIDTH=0
    fi

    if [ ! -d "$TRAIN_DATA_ROOT" ]; then
      echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
      echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
           "where the ImageNet training data is stored."
      exit 1
    fi

    if [ ! -d "$VAL_DATA_ROOT" ]; then
      echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
      echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
           "where the ImageNet validation data is stored."
      exit 1
    fi

    echo "Creating train lmdb..."

    GLOG_logtostderr=1 $TOOLS/convert_imageset \
        --resize_height=$RESIZE_HEIGHT \
        --resize_width=$RESIZE_WIDTH \
        --shuffle \
        $TRAIN_DATA_ROOT \
        $DATA/train.txt \
        $EXAMPLE/food_train_lmdb    # path of the generated lmdb

    echo "Creating val lmdb..."

    GLOG_logtostderr=1 $TOOLS/convert_imageset \
        --resize_height=$RESIZE_HEIGHT \
        --resize_width=$RESIZE_WIDTH \
        --shuffle \
        $VAL_DATA_ROOT \
        $DATA/val.txt \
        $EXAMPLE/food_val_lmdb      # path of the generated lmdb

    echo "Done."

3. Generate the mean file

Next, use the LMDB to generate the mean file used for training:

    EXAMPLE=data/food
    DATA=data/food
    TOOLS=build/tools

    $TOOLS/compute_image_mean $EXAMPLE/food_train_lmdb $DATA/foodnet_mean.binaryproto

4. Modify the solver and the training network

------
solver.prototxt explained:

    # Number of test iterations. Each test iteration runs one batch
    # (batch_size images) through the network, so to test every image in the
    # validation set once, this value times the TEST batch_size should equal
    # the number of validation images: test_iter * batch_size = val_num.
    test_iter: 299
    # Number of training iterations between two tests. One iteration is the
    # full forward and backward pass of one batch through the network. With
    # the value 224 used here, accuracy is validated every 224 iterations.
    # Usually all training images should pass through the network once before
    # accuracy is tested, so this value times the TRAIN batch_size of the data
    # layer should equal the number of training images:
    # test_interval * batch_size = train_num.
    test_interval: 224
    # Base learning rate. A rate that is too high can leave the loss stuck at
    # 87.3365 or keep it from converging. A rate that is too low makes the
    # network converge slowly and may also cause vanishing gradients.
    # It is usually set to 0.01.
    base_lr: 0.01
    display: 20
    max_iter: 6720
    lr_policy: "step"
    gamma: 0.1
    momentum: 0.9          # momentum: weight given to the previous parameter update
    weight_decay: 0.0001
    stepsize: 2218         # lower the learning rate every stepsize iterations
    snapshot: 224          # save the training result (a caffemodel) every this many iterations
    snapshot_prefix: "food/food_net/food_alex_snapshot"   # snapshot path and prefix
    solver_mode: GPU
    net: "train_val.prototxt"   # path to the network definition file
    solver_type: SGD

-----
Changes to train_val.prototxt

The main difference is in the data layer.

Data layer reading the original images (raw images as input):

    layer {
      name: "data"
      type: "ImageData"    # note: ImageData, trains directly from image files
      top: "data"
      top: "label"
      include { phase: TRAIN }
      image_data_param {
        source: "examples/finetune_myself/train.txt"
        batch_size: 50
        new_height: 256
        new_width: 256
      }
    }

Data layer reading LMDB (the LMDB created above as input):

    layer {
      name: "data"
      type: "Data"         # Data here: trains from the images converted to lmdb
      top: "data"
      top: "label"
      include { phase: TRAIN }
      data_param {
        source: "examples/imagenet/car_train_lmdb"
        batch_size: 256
        backend: LMDB
      }
    }

The complete network is:

    name: "AlexNet"
    layer {
      name: "data"
      type: "Data"
      top: "data"
      top: "label"
      include { phase: TRAIN }
      transform_param {
        mirror: true
        crop_size: 227
        mean_file: "mimg_mean.binaryproto"   # mean file
      }
      data_param {
        source: "mtrainldb"                  # training data
        batch_size: 256
        backend: LMDB
      }
    }
    layer {
      name: "data"
      type: "Data"
      top: "data"
      top: "label"
      include { phase: TEST }
      transform_param {
        mirror: false
        crop_size: 227
        mean_file: "mimg_mean.binaryproto"   # mean file
      }
      data_param {
        source: "mvaldb"                     # validation data
        batch_size: 50
        backend: LMDB
      }
    }
    layer {
      name: "conv1"
      type: "Convolution"
      bottom: "data"
      top: "conv1"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 96
        kernel_size: 11
        stride: 4
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer {
      name: "relu1"
      type: "ReLU"
      bottom: "conv1"
      top: "conv1"
    }
    layer {
      name: "norm1"
      type: "LRN"
      bottom: "conv1"
      top: "norm1"
      lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
    }
    layer {
      name: "pool1"
      type: "Pooling"
      bottom: "norm1"
      top: "pool1"
      pooling_param { pool: MAX kernel_size: 3 stride: 2 }
    }
    layer {
      name: "conv2"
      type: "Convolution"
      bottom: "pool1"
      top: "conv2"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 256
        pad: 2
        kernel_size: 5
        group: 2
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0.1 }
      }
    }
    layer {
      name: "relu2"
      type: "ReLU"
      bottom: "conv2"
      top: "conv2"
    }
    layer {
      name: "norm2"
      type: "LRN"
      bottom: "conv2"
      top: "norm2"
      lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
    }
    layer {
      name: "pool2"
      type: "Pooling"
      bottom: "norm2"
      top: "pool2"
      pooling_param { pool: MAX kernel_size: 3 stride: 2 }
    }
    layer {
      name: "conv3"
      type: "Convolution"
      bottom: "pool2"
      top: "conv3"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 384
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer {
      name: "relu3"
      type: "ReLU"
      bottom: "conv3"
      top: "conv3"
    }
    layer {
      name: "conv4"
      type: "Convolution"
      bottom: "conv3"
      top: "conv4"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 384
        pad: 1
        kernel_size: 3
        group: 2
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0.1 }
      }
    }
    layer {
      name: "relu4"
      type: "ReLU"
      bottom: "conv4"
      top: "conv4"
    }
    layer {
      name: "conv5"
      type: "Convolution"
      bottom: "conv4"
      top: "conv5"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 256
        pad: 1
        kernel_size: 3
        group: 2
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0.1 }
      }
    }
    layer {
      name: "relu5"
      type: "ReLU"
      bottom: "conv5"
      top: "conv5"
    }
    layer {
      name: "pool5"
      type: "Pooling"
      bottom: "conv5"
      top: "pool5"
      pooling_param { pool: MAX kernel_size: 3 stride: 2 }
    }
    layer {
      name: "fc6"
      type: "InnerProduct"
      bottom: "pool5"
      top: "fc6"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      inner_product_param {
        num_output: 4096
        weight_filler { type: "gaussian" std: 0.005 }
        bias_filler { type: "constant" value: 0.1 }
      }
    }
    layer {
      name: "relu6"
      type: "ReLU"
      bottom: "fc6"
      top: "fc6"
    }
    layer {
      name: "drop6"
      type: "Dropout"
      bottom: "fc6"
      top: "fc6"
      dropout_param { dropout_ratio: 0.5 }
    }
    layer {
      name: "fc7"
      type: "InnerProduct"
      bottom: "fc6"
      top: "fc7"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      inner_product_param {
        num_output: 4096
        weight_filler { type: "gaussian" std: 0.005 }
        bias_filler { type: "constant" value: 0.1 }
      }
    }
    layer {
      name: "relu7"
      type: "ReLU"
      bottom: "fc7"
      top: "fc7"
    }
    layer {
      name: "drop7"
      type: "Dropout"
      bottom: "fc7"
      top: "fc7"
      dropout_param { dropout_ratio: 0.5 }
    }
    layer {
      name: "fc8"
      type: "InnerProduct"
      bottom: "fc7"
      top: "fc8"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      inner_product_param {
        num_output: 2    # note: change this to the number of classes you have
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer {
      name: "accuracy"
      type: "Accuracy"
      bottom: "fc8"
      bottom: "label"
      top: "accuracy"
      include { phase: TEST }
    }
    layer {
      name: "loss"
      type: "SoftmaxWithLoss"
      bottom: "fc8"
      bottom: "label"
      top: "loss"
    }

Run the following script to train:

    #!/usr/bin/env sh
    set -e

    ./build/tools/caffe train \
        --solver=food/food_alexnet/solver.prototxt

5. Test

Testing also needs a class label file, category.txt, with the same content as above. Modify deploy.prototxt, then start the test, passing the label file before the image (use the .caffemodel snapshot written by your own training run and the mean file generated in step 3):

    ./bin/classification "food/foodnet/deploy.prototxt" "food/foodnet/food_iter_100000.caffemodel" "ming_mean.binaryproto" "category.txt" "test001.jpg"

----------------------------------------------------

FineTune:

http://www.cnblogs.com/denny402/p/5074212.html
http://www.cnblogs.com/alexcai/p/5469478.html

1. When fine-tuning, rename the last fully connected layer and change its number of outputs to your number of classes. Its learning rate should also be relatively large, because only this layer's weights are trained from scratch, while all the other layers are already trained.

2. When starting training, specify the model to fine-tune from at the end of the command:

    ./build/tools/caffe train -solver examples/money_test/fine_tune/solver.prototxt -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel

Here -weights points to the already trained CaffeNet model.
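The label files from the data preparation step (train.txt, val.txt, category.txt) are tedious to write by hand. The following is a minimal Python sketch of one way to generate them from the per-class folder layout described there. The helper names `scan_classes` and `make_label_files` and the list of image extensions are assumptions made for this illustration; they are not part of Caffe or of the original post.

```python
import os

# Assumed set of image extensions; adjust to your own data.
IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".bmp")


def scan_classes(split_dir):
    """Map each class subfolder of split_dir to its sorted image file names."""
    classes = {}
    for name in sorted(os.listdir(split_dir)):
        folder = os.path.join(split_dir, name)
        if os.path.isdir(folder):
            classes[name] = sorted(
                f for f in os.listdir(folder) if f.lower().endswith(IMAGE_EXTS)
            )
    return classes


def make_label_files(classes, class_order):
    """Build the lines of an image list and of category.txt.

    classes:     dict mapping class name -> list of image file names
    class_order: list of class names; the position in the list is the label id
    Returns (list_lines, category_lines), each a list of strings.
    """
    list_lines = []
    category_lines = []
    for label, name in enumerate(class_order):
        category_lines.append(f"{label} {name}")
        for fname in classes.get(name, []):
            # Paths are relative to the data root passed to convert_imageset.
            list_lines.append(f"{name}/{fname} {label}")
    return list_lines, category_lines


# Example with the file names used in the post.
example = {"toast": ["62.jpg", "107.jpg"], "pizza": ["62.jpg"]}
lines, categories = make_label_files(example, ["toast", "pizza"])
print("\n".join(lines))
print("\n".join(categories))
```

For real data, one would call `scan_classes("data/food/train")` and `scan_classes("data/food/val")` and pass the same `class_order` to both calls of `make_label_files`, so that a class gets the same label id in train.txt and val.txt.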
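The solver comments give two relations, test_iter * batch_size = val_num and test_interval * batch_size = train_num. As a rough sketch, the values can be derived from the dataset size as below. The function name `solver_iterations` is hypothetical, and the image counts (57344 training, 14950 validation) and the 30 epochs are assumptions back-computed from the solver values and batch sizes shown in the post; the post does not state the real counts.

```python
def ceil_div(a, b):
    """Integer division rounded up, so a partial last batch still counts."""
    return (a + b - 1) // b


def solver_iterations(train_num, val_num, train_batch, test_batch, epochs):
    """Derive solver.prototxt iteration settings from dataset and batch sizes.

    test_iter:     batches needed to cover the validation set once
    test_interval: batches needed to cover the training set once (one epoch)
    max_iter:      total training iterations for the requested number of epochs
    """
    test_iter = ceil_div(val_num, test_batch)
    test_interval = ceil_div(train_num, train_batch)
    return {
        "test_iter": test_iter,
        "test_interval": test_interval,
        "max_iter": test_interval * epochs,
    }


# Assumed dataset sizes; batch sizes 256 (TRAIN) and 50 (TEST) as in the network.
settings = solver_iterations(
    train_num=57344, val_num=14950, train_batch=256, test_batch=50, epochs=30
)
for key, value in settings.items():
    print(f"{key}: {value}")
```

Rounding up means the last, partially filled batch is still evaluated. Caffe wraps around to the start of the data in that case, so a few validation images may be counted twice.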