Learning Caffe: CIFAR-10 on Ubuntu 16.04 with a GTX 650 Ti Boost (1 GB) -- Part 01

Posted by leoking01



This post draws on the material below; many thanks to the original authors!

http://www.cnblogs.com/alexcai/p/5468164.html

http://blog.csdn.net/garfielder007/article/details/51480844

 

 

Part 1: The CIFAR datasets

 

The CIFAR-10 dataset

The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. 

The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly-selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5000 images from each class. 

Here are the classes in the dataset (the original page also shows 10 random images from each):

airplane
automobile
bird
cat
deer
dog
frog
horse
ship
truck


The classes are completely mutually exclusive. There is no overlap between automobiles and trucks. "Automobile" includes sedans, SUVs, things of that sort. "Truck" includes only big trucks. Neither includes pickup trucks.

Download

If you're going to use this dataset, please cite the tech report at the bottom of this page.

Version                                             Size      md5sum
CIFAR-10 python version                             163 MB    c58f30108f718f92721af3b95e74349a
CIFAR-10 Matlab version                             175 MB    70270af85842c9e89bb428ec9976c926
CIFAR-10 binary version (suitable for C programs)   162 MB    c32a1d4ab5d03f1284b67883e8d87530

 

Baseline results

You can find some baseline replicable results on this dataset on the project page for cuda-convnet. These results were obtained with a convolutional neural network. Briefly, they are 18% test error without data augmentation and 11% with. Additionally, Jasper Snoek has a new paper in which he used Bayesian hyperparameter optimization to find nice settings of the weight decay and other hyperparameters, which allowed him to obtain a test error rate of 15% (without data augmentation) using the architecture of the net that got 18%.

Other results

Rodrigo Benenson has been kind enough to collect results on CIFAR-10/100 and other datasets on his website.

Dataset layout

 

Python / Matlab versions

I will describe the layout of the Python version of the dataset. The layout of the Matlab version is identical. 

The archive contains the files data_batch_1, data_batch_2, ..., data_batch_5, as well as test_batch. Each of these files is a Python "pickled" object produced with cPickle. Here is a Python routine which will open such a file and return a dictionary:

def unpickle(file):
    import cPickle
    fo = open(file, 'rb')
    dict = cPickle.load(fo)
    fo.close()
    return dict
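
The routine above is Python 2 (cPickle). For completeness, a minimal Python 3 equivalent would look roughly like this, assuming the batch files sit in the current directory; note the encoding='bytes' argument, which makes the dictionary keys come back as byte strings such as b'data':

import pickle

def unpickle(filename):
    # Load one CIFAR batch file under Python 3; keys are byte strings (b'data', b'labels').
    with open(filename, 'rb') as fo:
        return pickle.load(fo, encoding='bytes')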

Loaded in this way, each of the batch files contains a dictionary with the following elements:

  • data -- a 10000x3072 numpy array of uint8s. Each row of the array stores a 32x32 colour image. The first 1024 entries contain the red channel values, the next 1024 the green, and the final 1024 the blue. The image is stored in row-major order, so that the first 32 entries of the array are the red channel values of the first row of the image.
  • labels -- a list of 10000 numbers in the range 0-9. The number at index i indicates the label of the ith image in the array data.
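
To make the row-major layout concrete, here is a short sketch (reusing the Python 3 unpickle helper above, with its byte-string keys) that turns the first row of data into a 32x32x3 image:

import numpy as np

batch = unpickle('data_batch_1')
row = batch[b'data'][0]                            # 3072 uint8 values: 1024 red, 1024 green, 1024 blue
img = row.reshape(3, 32, 32).transpose(1, 2, 0)    # -> (height, width, channel), convenient for display
label = batch[b'labels'][0]                        # integer label in the range 0-9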


The dataset contains another file, called batches.meta. It too contains a Python dictionary object. It has the following entries:

  • label_names -- a 10-element list which gives meaningful names to the numeric labels in the labels array described above. For example, label_names[0] == "airplane", label_names[1] == "automobile", etc.

 

Binary version

The binary version contains the files data_batch_1.bin, data_batch_2.bin, ..., data_batch_5.bin, as well as test_batch.bin. Each of these files is formatted as follows:

<1 x label><3072 x pixel>
...
<1 x label><3072 x pixel>

In other words, the first byte is the label of the first image, which is a number in the range 0-9. The next 3072 bytes are the values of the pixels of the image. The first 1024 bytes are the red channel values, the next 1024 the green, and the final 1024 the blue. The values are stored in row-major order, so the first 32 bytes are the red channel values of the first row of the image. 

Each file contains 10000 such 3073-byte "rows" of images, although there is nothing delimiting the rows. Therefore each file should be exactly 30730000 bytes long. 
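
The 3073-byte record layout can be checked directly with numpy; a minimal sketch, assuming the binary archive has been extracted into cifar-10-batches-bin/:

import numpy as np

raw = np.fromfile('cifar-10-batches-bin/data_batch_1.bin', dtype=np.uint8)
records = raw.reshape(-1, 3073)                    # 10000 records: 1 label byte + 3072 pixel bytes each
labels = records[:, 0]                             # class indices in the range 0-9
images = records[:, 1:].reshape(-1, 3, 32, 32)     # (image, channel, row, column), row-major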

There is another file, called batches.meta.txt. This is an ASCII file that maps numeric labels in the range 0-9 to meaningful class names. It is merely a list of the 10 class names, one per row. The class name on row i corresponds to numeric label i.

The CIFAR-100 dataset

This dataset is just like the CIFAR-10, except it has 100 classes containing 600 images each. There are 500 training images and 100 testing images per class. The 100 classes in the CIFAR-100 are grouped into 20 superclasses. Each image comes with a "fine" label (the class to which it belongs) and a "coarse" label (the superclass to which it belongs).
Here is the list of classes in the CIFAR-100:

Superclass: Classes
aquatic mammals: beaver, dolphin, otter, seal, whale
fish: aquarium fish, flatfish, ray, shark, trout
flowers: orchids, poppies, roses, sunflowers, tulips
food containers: bottles, bowls, cans, cups, plates
fruit and vegetables: apples, mushrooms, oranges, pears, sweet peppers
household electrical devices: clock, computer keyboard, lamp, telephone, television
household furniture: bed, chair, couch, table, wardrobe
insects: bee, beetle, butterfly, caterpillar, cockroach
large carnivores: bear, leopard, lion, tiger, wolf
large man-made outdoor things: bridge, castle, house, road, skyscraper
large natural outdoor scenes: cloud, forest, mountain, plain, sea
large omnivores and herbivores: camel, cattle, chimpanzee, elephant, kangaroo
medium-sized mammals: fox, porcupine, possum, raccoon, skunk
non-insect invertebrates: crab, lobster, snail, spider, worm
people: baby, boy, girl, man, woman
reptiles: crocodile, dinosaur, lizard, snake, turtle
small mammals: hamster, mouse, rabbit, shrew, squirrel
trees: maple, oak, palm, pine, willow
vehicles 1: bicycle, bus, motorcycle, pickup truck, train
vehicles 2: lawn-mower, rocket, streetcar, tank, tractor


Yes, I know mushrooms aren't really fruit or vegetables and bears aren't really carnivores.

Download

 

Version                                              Size      md5sum
CIFAR-100 python version                             161 MB    eb9058c3a382ffc7106e4002c42a8d85
CIFAR-100 Matlab version                             175 MB    6a4bfa1dcd5c9453dda6bb54194911f4
CIFAR-100 binary version (suitable for C programs)   161 MB    03b5dce01913d631647c71ecec9e9cb8

 

Dataset layout

 

Python / Matlab versions

The python and Matlab versions are identical in layout to the CIFAR-10, so I won't waste space describing them here.

Binary version

The binary version of the CIFAR-100 is just like the binary version of the CIFAR-10, except that each image has two label bytes (coarse and fine) and 3072 pixel bytes, so the binary files look like this:

<1 x coarse label><1 x fine label><3072 x pixel>
...
<1 x coarse label><1 x fine label><3072 x pixel>
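
These records can be split the same way as the CIFAR-10 ones, just with one extra label byte; a minimal numpy sketch, assuming the archive has been extracted into cifar-100-binary/:

import numpy as np

raw = np.fromfile('cifar-100-binary/train.bin', dtype=np.uint8)
records = raw.reshape(-1, 3074)                    # 2 label bytes + 3072 pixel bytes per image
coarse_labels = records[:, 0]                      # superclass indices, 0-19
fine_labels = records[:, 1]                        # class indices, 0-99
images = records[:, 2:].reshape(-1, 3, 32, 32)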

 

Indices into the original 80 million tiny images dataset

Sivan Sabato was kind enough to provide this file, which maps CIFAR-100 images to images in the 80 million tiny images dataset. Sivan writes:

The file has 60000 rows, each row contains a single index into the tiny db,
where the first image in the tiny db is indexed "1". "0" stands for an image that is not from the tiny db.
The first 50000 lines correspond to the training set, and the last 10000 lines correspond
to the test set.

 

Reference

This tech report (Alex Krizhevsky, "Learning Multiple Layers of Features from Tiny Images", 2009; Chapter 3 in particular) describes the dataset and the methodology followed when collecting it in much greater detail. Please cite it if you intend to use this dataset.

 

Source: http://www.cs.toronto.edu/~kriz/cifar.html

 

 

Part 2: Caffe basics

 

Introduction

Caffe is a friendly, easy-to-learn open-source deep learning framework. It is mainly used for image-related tasks and supports CNNs and several other kinds of deep networks.

With Caffe, developers can quickly build simple networks for tasks such as classification and localization; it is also suitable for research, since you can modify its source code to implement your own algorithms.

The main goal of this post is to introduce the basics of using Caffe, so that an ordinary engineer can train a simple model of their own.

It covers the following topics: running the bundled Caffe example to train on the CIFAR dataset, training your own data with a network someone else has defined, and fine-tuning a pre-trained model on your own data.

Background

Deep learning is a branch of machine learning. Its goal is to solve, by learning from data, problems that ordinary programming cannot, such as image recognition and text recognition.

In machine learning, "learning" means feeding the program experience data and iterating repeatedly to improve the algorithm's parameters until a "model" is obtained; feeding new data into the model then produces the desired result.

For example, in image classification the experience data consists of images and their corresponding labels; once the model is trained, running a new image through it tells you which class the image belongs to.

This is only a brief overview; it is still worth studying the basics of machine learning and convolutional neural networks first.

Installation

There are plenty of tutorials online for this step, so I will skip it here. I installed Caffe from a Docker image (images with Caffe preinstalled are easy to find online). The advantage is that it saves the time of setting up the environment; the drawback is that editing configuration files later becomes more awkward, so in the long run I recommend installing Caffe directly on your machine.

 

After building, run make runtest to verify the installation; the tail of its output looks like this:

make runtest


[       OK ] AdaDeltaSolverTest/2.TestLeastSquaresUpdateWithEverythingAccumShare (6 ms)
[ RUN      ] AdaDeltaSolverTest/2.TestAdaDeltaLeastSquaresUpdateWithEverythingShare
[       OK ] AdaDeltaSolverTest/2.TestAdaDeltaLeastSquaresUpdateWithEverythingShare (61 ms)
[ RUN      ] AdaDeltaSolverTest/2.TestAdaDeltaLeastSquaresUpdateWithHalfMomentum
[       OK ] AdaDeltaSolverTest/2.TestAdaDeltaLeastSquaresUpdateWithHalfMomentum (21 ms)
[ RUN      ] AdaDeltaSolverTest/2.TestAdaDeltaLeastSquaresUpdateWithWeightDecay
[       OK ] AdaDeltaSolverTest/2.TestAdaDeltaLeastSquaresUpdateWithWeightDecay (10 ms)
[----------] 11 tests from AdaDeltaSolverTest/2 (324 ms total)

[----------] Global test environment tear-down
[==========] 2101 tests from 277 test cases ran. (435790 ms total)
[  PASSED  ] 2101 tests.
[100%] Built target runtest

 

 

Training on the CIFAR dataset

CIFAR-10 is a common image classification dataset containing 60,000 images in 10 classes (CIFAR-100, described above, has 100 classes grouped into 20 superclasses), and Caffe provides a network for classifying it.

The CIFAR network definition lives in the examples/cifar10 directory, and the training procedure is very simple.

(All commands below are run from the Caffe root directory; the same applies throughout.)

1. Get the training data

cd $CAFFE_ROOT
./data/cifar10/get_cifar10.sh
./examples/cifar10/create_cifar10.sh

 

2. Start training

cd $CAFFE_ROOT
./examples/cifar10/train_quick.sh

 

3. When training finishes we get:

  cifar10_quick_iter_4000.caffemodel.h5

  cifar10_quick_iter_4000.solverstate.h5

  At this point we have a trained model to use for classification below. (train_quick.sh actually continues for another 1000 iterations at a lower learning rate, so a cifar10_quick_iter_5000 snapshot is also produced; that is the file used in the classification commands that follow.)

 

4. Use the model to classify new data

First, try classification with an existing model (by default the ImageNet reference model is used):

python python/classify.py examples/images/cat.jpg foo

This produces an output file named foo.npy that records a "similarity" score for every class. It can be read back like this:

import numpy as np
scores = np.load("foo.npy")  # np.load also accepts a filename directly
print scores

 

Next, specify our own model for the classification:

python python/classify.py --model_def examples/cifar10/cifar10_quick.prototxt --pretrained_model examples/cifar10/cifar10_quick_iter_5000.caffemodel.h5 --center_only  examples/images/cat.jpg foo

This command classifies the image examples/images/cat.jpg using the cifar10_quick.prototxt network definition together with the cifar10_quick_iter_5000.caffemodel.h5 weights.
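
For reference, classify.py is a thin wrapper around the caffe.Classifier class, so the same classification can be done in a few lines of pycaffe. A minimal sketch, assuming the Python 2 pycaffe build used in this post and skipping mean subtraction for brevity:

import caffe

# Build a classifier from the deploy network and the trained weights.
net = caffe.Classifier('examples/cifar10/cifar10_quick.prototxt',
                       'examples/cifar10/cifar10_quick_iter_5000.caffemodel.h5',
                       image_dims=(32, 32),     # CIFAR-10 images are 32x32
                       raw_scale=255,           # load_image returns [0,1]; the net was trained on [0,255]
                       channel_swap=(2, 1, 0))  # RGB -> BGR, matching Caffe's LMDB convention

img = caffe.io.load_image('examples/images/cat.jpg')
scores = net.predict([img], oversample=False)[0]  # one score per CIFAR-10 class
print scores.argmax()                             # index of the most probable class (0-9)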

 

By default classify.py does not print the result; it only writes it into the foo output file, which is not very intuitive. I found a modified version online that adds a couple of arguments and prints the highest-scoring classes directly.

It replaces python/classify.py; download link: http://download.csdn.net/detail/caisenchuan/9513196 (this requires CSDN points, so it is not recommended).

In practice the modification simply changes the corresponding section of the script to:

    # Classify.
    start = time.time()
    predictions = classifier.predict(inputs, not args.center_only)
    print("Done in %.2f s." % (time.time() - start))
    print("Predictions : %s" % predictions)

    # print result, added by caisenchuan
    if args.print_results:
        scores = predictions.flatten()
        with open(args.labels_file) as f:
            labels_df = pd.DataFrame([
                    {
                        'synset_id': l.strip().split(' ')[0],
                        'name': ' '.join(l.strip().split(' ')[1:]).split(',')[0]
                    }
                    for l in f.readlines()
                ])
            labels = labels_df.sort('synset_id')['name'].values
            # Note: on newer pandas this call is sort_values('synset_id'),
            # hence the FutureWarning in the output below.

            indices = (-scores).argsort()[:5]
            ps = labels[indices]

            meta = [
                (p, '%.5f' % scores[i])
                for i, p in zip(indices, ps)
            ]

            print meta

    # Save
    print("Saving results into %s" % args.output_file)
    np.save(args.output_file, predictions)

 

The modified script adds two arguments: you can pass a labels_file and have the classification result printed directly:

python python/classify.py --print_results --model_def examples/cifar10/cifar10_quick.prototxt --pretrained_model examples/cifar10/cifar10_quick_iter_5000.caffemodel.h5 --labels_file data/cifar10/cifar10_words.txt  --center_only  examples/images/cat.jpg foo

Output:

Loading file: examples/images/cat.jpg
Classifying 1 inputs.
predict 3 inputs.
Done in 0.02 s.
Predictions : [[ 0.03903743  0.00722749  0.04582177  0.44352672  0.01203315  0.11832549
   0.02335102  0.25013766  0.03541689  0.02512246]]
python/classify.py:176: FutureWarning: sort(columns=....) is deprecated, use sort_values(by=.....)
  labels = labels_df.sort('synset_id')['name'].values
[('cat', '0.44353'), ('horse', '0.25014'), ('dog', '0.11833'), ('bird', '0.04582'), ('airplane', '0.03904')]
Saving results into foo

The bracketed list above shows the top five predicted classes together with their confidence scores.

 

Tips

Finally, a summary of the files involved in training a network:

cifar10_quick_solver.prototxt: the solver configuration, which sets the number of iterations and other training options; passing this file to caffe train starts training.

cifar10_quick_train_test.prototxt: the network definition used for training and testing; its filename is referenced from the solver prototxt.

cifar10_quick_iter_4000.caffemodel.h5: the trained model weights, used later for classification.

cifar10_quick_iter_4000.solverstate.h5: also produced by training; it stores the solver state so that training can be resumed after an interruption (see the sketch after this list).

cifar10_quick.prototxt: the deploy network used for classification.
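
By the way, resuming from a solver state can be done from the command line (caffe train with the --solver and --snapshot options) or through pycaffe. A minimal pycaffe sketch, assuming the Python 2 build used here and that it is run from the Caffe root:

import caffe

caffe.set_mode_gpu()  # or caffe.set_mode_cpu() on a machine without a usable GPU
solver = caffe.get_solver('examples/cifar10/cifar10_quick_solver.prototxt')
solver.restore('examples/cifar10/cifar10_quick_iter_4000.solverstate.h5')  # reload weights and solver state
solver.solve()        # continue training up to the solver's max_iter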

 

//--------------------------------------------------------------------------------------------------------------------------------------------------------

One more note: along the way I hit the error below and finally fixed it with the following method. It does, of course, require recompiling Caffe.

On macOS Sierra (10.12.4), running Python code against Caffe reports the error "Mean shape incompatible with input shape".

When loading a binaryproto-format mean file through Caffe's Python interface on macOS Sierra (10.12.4), the following error is reported at the end:

Traceback (most recent call last):
  File "analysis_memnet.py", line 29, in <module>
    detector = caffe.Detector(model_def, pretrained_model, mean=means)
  File "/Users/Source/caffe/distribute/python/caffe/detector.py", line 46, in __init__
    self.transformer.set_mean(in_, mean)
  File "/Users/Source/caffe/distribute/python/caffe/io.py", line 259, in set_mean
    raise ValueError('Mean shape incompatible with input shape.')
ValueError: Mean shape incompatible with input shape.

The error occurs because the mean file provided with memnet is 256 x 256, while the supplied network definition expects 227 x 227 input, so the shape check in io.py raises an exception. Change the set_mean method in python/caffe/io.py from:

    def set_mean(self, in_, mean):
        """
        Set the mean to subtract for centering the data.
 
        Parameters
        ----------
        in_ : which input to assign this mean.
        mean : mean ndarray (input dimensional or broadcastable)
        """
        self.__check_input(in_)
        ms = mean.shape
        if mean.ndim == 1:
            # broadcast channels
            if ms[0] != self.inputs[in_][1]:
                raise ValueError('Mean channels incompatible with input.')
            mean = mean[:, np.newaxis, np.newaxis]
        else:
            # elementwise mean
            if len(ms) == 2:
                ms = (1,) + ms
            if len(ms) != 3:
                raise ValueError('Mean shape invalid')
            if ms != self.inputs[in_][1:]:
                raise ValueError('Mean shape incompatible with input shape.')
        self.mean[in_] = mean

to:

    def set_mean(self, in_, mean):
        """
        Set the mean to subtract for centering the data.
 
        Parameters
        ----------
        in_ : which input to assign this mean.
        mean : mean ndarray (input dimensional or broadcastable)
        """
        self.__check_input(in_)
        ms = mean.shape
        if mean.ndim == 1:
            # broadcast channels
            if ms[0] != self.inputs[in_][1]:
                raise ValueError('Mean channels incompatible with input.')
            mean = mean[:, np.newaxis, np.newaxis]
        else:
            # elementwise mean
            if len(ms) == 2:
                ms = (1,) + ms
            if len(ms) != 3:
                raise ValueError('Mean shape invalid')
            if ms != self.inputs[in_][1:]:
                in_shape = self.inputs[in_][1:]
                m_min, m_max = mean.min(), mean.max()
                normal_mean = (mean - m_min) / (m_max - m_min)
                mean = resize_image(normal_mean.transpose((1,2,0)),in_shape[1:]).transpose((2,0,1)) * (m_max - m_min) + m_min
                #raise ValueError('Mean shape incompatible with input shape.')
        self.mean[in_] = mean

After making this change, rebuild Caffe:

$ make clean
$ make
$ make pycaffe
$ make distribute
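
For context, the mean passed to caffe.Detector in that traceback comes from a binaryproto file; a rough sketch of the usual conversion through pycaffe (the filenames below are placeholders, not files from this post):

import numpy as np
import caffe

# Parse the binaryproto mean file and convert it into a (channels, height, width) numpy array.
blob = caffe.proto.caffe_pb2.BlobProto()
with open('mean.binaryproto', 'rb') as f:
    blob.ParseFromString(f.read())
mean = caffe.io.blobproto_to_array(blob)[0]
np.save('mean.npy', mean)  # can then be fed to Transformer.set_mean or --mean_file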


Appendix:

python/classify.py:

#!/usr/bin/env python
"""
classify.py is an out-of-the-box image classifier callable from the command line.

By default it configures and runs the Caffe reference ImageNet model.
"""
import numpy as np
import os
import sys
import argparse
import glob
import time

import pandas as pd
from skimage.color import rgb2gray

import caffe


def main(argv):
    pycaffe_dir = os.path.dirname(__file__)

    parser = argparse.ArgumentParser()
    # Required arguments: input and output files.
    parser.add_argument(
        "input_file",
        help="Input image, directory, or npy."
    )
    parser.add_argument(
        "output_file",
        help="Output npy filename."
    )
    # Optional arguments.
    parser.add_argument(
        "--model_def",
        default=os.path.join(pycaffe_dir,
                "../models/bvlc_reference_caffenet/deploy.prototxt"),
        help="Model definition file."
    )
    parser.add_argument(
        "--pretrained_model",
        default=os.path.join(pycaffe_dir,
                "../models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel"),
        help="Trained model weights file."
    )
    parser.add_argument(
        "--gpu",
        action='store_true',
        help="Switch for gpu computation."
    )
    parser.add_argument(
        "--center_only",
        action='store_true',
        help="Switch for prediction from center crop alone instead of " +
             "averaging predictions across crops (default)."
    )
    parser.add_argument(
        "--images_dim",
        default='256,256',
        help="Canonical 'height,width' dimensions of input images."
    )
    parser.add_argument(
        "--mean_file",
        default=os.path.join(pycaffe_dir,
                             'caffe/imagenet/ilsvrc_2012_mean.npy'),
        help="Data set image mean of [Channels x Height x Width] dimensions " +
             "(numpy array). Set to '' for no mean subtraction."
    )
    parser.add_argument(
        "--input_scale",
        type=float,
        help="Multiply input features by this scale to finish preprocessing."
    )
    parser.add_argument(
        "--raw_scale",
        type=float,
        default=255.0,
        help="Multiply raw input by this scale before preprocessing."
    )
    parser.add_argument(
        "--channel_swap",
        default='2,1,0',
        help="Order to permute input channels. The default converts " +
             "RGB -> BGR since BGR is the Caffe default by way of OpenCV."
    )
    parser.add_argument(
        "--ext",
        default='jpg',
        help="Image file extension to take as input when a directory " +
             "is given as the input file."
    )
    
    # add by 
