Running Caffe training jobs in batch with a shell script
Posted by 小丫头い
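I wrote this post for the following reason: my teacher asked me to train LeNet, modify its network architecture in several ways, repeat each variant N times and average the results, and finally compare them against networks with random weights. With that many networks to train and that much repetition, I decided to write a shell script to automate the runs.
The main shell script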
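#!/usr/bin/env bash
# bash is needed here: [[ ]] pattern matching and &> are not plain sh
folder="/path/"
solver="lenet_solver.prototxt"  # the solver file itself stays the same
N=10  # train each network N times
for file in "$folder"*
do
    filename=$(basename "$file")
    # glob-match every network architecture file
    if [[ "$filename" == lenet_train_test*.prototxt ]]
    then
        for i in $(seq "$N")
        do
            # this Python script rewrites the net and snapshot_prefix fields in the solver
            python "${folder}modifySolver.py" "$folder" "$solver" "$filename" "$i"
            # save the log file so it can be parsed later
            ./build/tools/caffe train --solver="$folder$solver" &> "$folder${filename%.*}_log_$i.md"
        done
    fi
done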
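Before starting hours of training, it can help to preview which runs the loop will launch and where each log will be written. The following is a minimal Python sketch of such a dry run, not part of the original workflow. `plan_runs` is a hypothetical helper, the directory listing is made up, and the sketch assumes the naming scheme used by the script above (net file name without its last extension, then `_log_<i>.md`).

```python
import fnmatch
import os


def plan_runs(filenames, folder, n):
    """Return (net_file, run_index, log_path) for every planned training run.

    Mirrors the shell loop: keep only lenet_train_test*.prototxt files and
    derive the log name by stripping the last extension from the net file.
    """
    runs = []
    for name in sorted(filenames):
        if not fnmatch.fnmatch(name, "lenet_train_test*.prototxt"):
            continue
        stem = os.path.splitext(name)[0]  # same effect as ${filename%.*}
        for i in range(1, n + 1):  # seq N counts from 1 to N
            runs.append((name, i, "%s%s_log_%d.md" % (folder, stem, i)))
    return runs


if __name__ == "__main__":
    # Hypothetical directory listing, for illustration only.
    listing = ["lenet_solver.prototxt", "lenet_train_test.prototxt",
               "lenet_train_test_wide.prototxt", "modifySolver.py"]
    for net, i, log in plan_runs(listing, "/path/", 2):
        print(net, i, log)
```

Passing `os.listdir(folder)` as the first argument would preview the real directory instead of the made-up listing.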
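A Python script that modifies the solver file
- The network architecture files could be generated the same way, but that would produce a lot of redundant code, so I wrote the architectures by hand, which is also less error-prone.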
#!/usr/bin/python
import sys
from caffe.proto import caffe_pb2
from google.protobuf.text_format import Merge

if __name__ == "__main__":
    if len(sys.argv) <= 4:
        print("you should input four argv: folder, solver, net, run index")
        sys.exit(1)
    folder, solver_name, net_name, run = sys.argv[1:5]
    solver = caffe_pb2.SolverParameter()
    # load the existing solver definition (protobuf text format)
    with open(folder + solver_name, 'r') as f:
        Merge(f.read(), solver)
    solver.net = folder + net_name  # change net file name
    # change model prefix: net name up to the first "." plus the run index
    solver.snapshot_prefix = folder + net_name[:net_name.find(".")] + run
    with open(folder + solver_name, 'w') as f:
        f.write(str(solver))
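To see what the two rewritten solver fields end up containing, the string handling can be tried without Caffe installed. This is a small sketch under the same argument order as the script above. `solver_fields` is a hypothetical helper name and the file name in the example is made up.

```python
def solver_fields(folder, net_name, run):
    """Compute the net path and snapshot prefix the way the script above does."""
    net = folder + net_name
    # Slice up to the first "." (str.find), then append the run index as-is.
    prefix = folder + net_name[:net_name.find(".")] + str(run)
    return net, prefix


if __name__ == "__main__":
    net, prefix = solver_fields("/path/", "lenet_train_test_wide.prototxt", 3)
    print(net)     # /path/lenet_train_test_wide.prototxt
    print(prefix)  # /path/lenet_train_test_wide3
```

The run index is appended with no separator, so snapshots from different runs are told apart only by that trailing number in the prefix.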
Extracting the log data and averaging the experimental results
import re
import os

# input log file and output accuracy array
def parse(filepath):
    pattern = re.compile(r".*Test net output #0: accuracy = (.*)")
    lst = []
    with open(filepath) as f:
        for line in f:
            match = pattern.match(line)
            if match:
                lst.append(float(match.group(1)))
    return lst

folder = "/home/shipan/Work/MnistRandom/"
# get all the net names (without extension) and log names (with extension)
netNames = []
logNames = []
for root, dirs, files in os.walk(folder):
    for file in files:
        if file.startswith('lenet_train_test') and file.endswith('.prototxt'):
            netNames.append(file[:file.index('.')])
        if file.startswith('lenet_train_test') and file.endswith('.md'):
            logNames.append(file)

allResults = []  # every accuracy curve, grouped by net
average = []  # each net's average final accuracy
with open(folder + 'averageAccuracy.md', 'w') as output:
    for netName in netNames:
        allResult = []
        finalResult = []
        for logName in logNames:
            if logName.startswith(netName + '_log'):
                accuracy_log = parse(folder + logName)
                allResult.append(accuracy_log)
                finalResult.append(accuracy_log[-1])  # just save the final result
        allResults.append(allResult)
        output.write("(Accuracy)ten times training of " + str(netName) + " is:" + str(finalResult))
        output.write('\n')
        # average over the runs actually found instead of a hard-coded 10
        average.append(sum(finalResult) / len(finalResult) if finalResult else float('nan'))
    output.write("average accuracies are: " + str(average))
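An average of the final accuracies says nothing about how much the repeated runs differ from each other. As a hedged sketch, the same regular expression can be applied to log text held in memory, and the final accuracies summarized with the mean and the sample standard deviation from the standard library. The log excerpts below are invented for illustration and only imitate the `Test net output #0: accuracy = ...` line the parser relies on. `final_accuracy` and `summarize` are hypothetical names.

```python
import re
import statistics

PATTERN = re.compile(r".*Test net output #0: accuracy = (.*)")


def final_accuracy(log_text):
    """Return the last test accuracy reported in a log, or None if absent."""
    values = []
    for line in log_text.splitlines():
        match = PATTERN.match(line)
        if match:
            values.append(float(match.group(1)))
    return values[-1] if values else None


def summarize(logs):
    """Mean and sample standard deviation of the final accuracy across runs."""
    finals = [a for a in (final_accuracy(t) for t in logs) if a is not None]
    spread = statistics.stdev(finals) if len(finals) > 1 else 0.0
    return statistics.mean(finals), spread


if __name__ == "__main__":
    # Made-up log excerpts, one string per training run.
    logs = [
        "solver.cpp] Test net output #0: accuracy = 0.5\n"
        "solver.cpp] Test net output #0: accuracy = 0.98\n",
        "solver.cpp] Test net output #0: accuracy = 0.99\n",
    ]
    print(summarize(logs))
```

Logs without any accuracy line are skipped, which keeps a crashed run from aborting the summary. If no log contains an accuracy line at all, `statistics.mean` raises an error.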