MATLAB code for support vector machines

Posted 2023-05-13


Foreword: this article was compiled by the editors of 小常识网 (cha138.com). It mainly covers MATLAB code for support vector machines and is offered as a reference.

Reference answer A: If you are on version 7.0 or later:
>>edit svmtrain
>>edit svmclassify
>>edit svmpredict

function [svm_struct, svIndex] = svmtrain(training, groupnames, varargin)
%SVMTRAIN trains a support vector machine classifier
%
% SVMStruct = SVMTRAIN(TRAINING,GROUP) trains a support vector machine
% classifier using data TRAINING taken from two groups given by GROUP.
% SVMStruct contains information about the trained classifier that is
% used by SVMCLASSIFY for classification. GROUP is a column vector of
% values of the same length as TRAINING that defines two groups. Each
% element of GROUP specifies the group the corresponding row of TRAINING
% belongs to. GROUP can be a numeric vector, a string array, or a cell
% array of strings. SVMTRAIN treats NaNs or empty strings in GROUP as
% missing values and ignores the corresponding rows of TRAINING.
%
% SVMTRAIN(...,'KERNEL_FUNCTION',KFUN) allows you to specify the kernel
% function KFUN used to map the training data into kernel space. The
% default kernel function is the dot product. KFUN can be one of the
% following strings or a function handle:
%
% 'linear' Linear kernel or dot product
% 'quadratic' Quadratic kernel
% 'polynomial' Polynomial kernel (default order 3)
% 'rbf' Gaussian Radial Basis Function kernel
% 'mlp' Multilayer Perceptron kernel (default scale 1)
% function A kernel function specified using @,
% for example @KFUN, or an anonymous function
%
% A kernel function must be of the form
%
% function K = KFUN(U, V)
%
% The returned value, K, is a matrix of size M-by-N, where U and V have M
% and N rows respectively. If KFUN is parameterized, you can use
% anonymous functions to capture the problem-dependent parameters. For
% example, suppose that your kernel function is
%
% function k = kfun(u,v,p1,p2)
% k = tanh(p1*(u*v')+p2);
%
% You can set values for p1 and p2 and then use an anonymous function:
% @(u,v) kfun(u,v,p1,p2).
%
% SVMTRAIN(...,'POLYORDER',ORDER) allows you to specify the order of a
% polynomial kernel. The default order is 3.
%
% SVMTRAIN(...,'MLP_PARAMS',[P1 P2]) allows you to specify the
% parameters of the Multilayer Perceptron (mlp) kernel. The mlp kernel
% requires two parameters, P1 and P2, where K = tanh(P1*U*V' + P2) and P1
% > 0 and P2 < 0. Default values are P1 = 1 and P2 = -1.
%
% SVMTRAIN(...,'METHOD',METHOD) allows you to specify the method used
% to find the separating hyperplane. Options are
%
% 'QP' Use quadratic programming (requires the Optimization Toolbox)
% 'LS' Use least-squares method
%
% If you have the Optimization Toolbox, then the QP method is the default
% method. If not, the only available method is LS.
%
% SVMTRAIN(...,'QUADPROG_OPTS',OPTIONS) allows you to pass an OPTIONS
% structure created using OPTIMSET to the QUADPROG function when using
% the 'QP' method. See help optimset for more details.
%
% SVMTRAIN(...,'SHOWPLOT',true), when used with two-dimensional data,
% creates a plot of the grouped data and plots the separating line for
% the classifier.
%
% Example:
% % Load the data and select features for classification
% load fisheriris
% data = [meas(:,1), meas(:,2)];
% % Extract the Setosa class
% groups = ismember(species,'setosa');
% % Randomly select training and test sets
% [train, test] = crossvalind('holdOut',groups);
% cp = classperf(groups);
% % Use a linear support vector machine classifier
% svmStruct = svmtrain(data(train,:),groups(train),'showplot',true);
% classes = svmclassify(svmStruct,data(test,:),'showplot',true);
% % See how well the classifier performed
% classperf(cp,classes,test);
% cp.CorrectRate
%
% See also CLASSIFY, KNNCLASSIFY, QUADPROG, SVMCLASSIFY.

% Copyright 2004 The MathWorks, Inc.
% $Revision: 1.1.12.1 $ $Date: 2004/12/24 20:43:35 $

% References:
% [1] Kecman, V, Learning and Soft Computing,
% MIT Press, Cambridge, MA. 2001.
% [2] Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B.,
% Vandewalle, J., Least Squares Support Vector Machines,
% World Scientific, Singapore, 2002.
% [3] Scholkopf, B., Smola, A.J., Learning with Kernels,
% MIT Press, Cambridge, MA. 2002.

%
% SVMTRAIN(...,'KFUNARGS',ARGS) allows you to pass additional
% arguments to kernel functions.

% set defaults

plotflag = false;
qp_opts = [];
kfunargs = {};
setPoly = false; usePoly = false;
setMLP = false; useMLP = false;
if ~isempty(which('quadprog'))
useQuadprog = true;
else
useQuadprog = false;
end
% set default kernel function
kfun = @linear_kernel;

% check inputs
if nargin < 2
error(nargchk(2,Inf,nargin))
end

numoptargs = nargin -2;
optargs = varargin;

% grp2idx sorts a numeric grouping var ascending, and a string grouping
% var by order of first occurrence

[g,groupString] = grp2idx(groupnames);

% check group is a vector -- though char input is special...
if ~isvector(groupnames) && ~ischar(groupnames)
error('Bioinfo:svmtrain:GroupNotVector',...
'Group must be a vector.');
end

% make sure that the data is correctly oriented.
if size(groupnames,1) == 1
groupnames = groupnames';
end
% make sure data is the right size
n = length(groupnames);
if size(training,1) ~= n
if size(training,2) == n
training = training';
else
error('Bioinfo:svmtrain:DataGroupSizeMismatch',...
'GROUP and TRAINING must have the same number of rows.')
end
end

% NaNs are treated as unknown classes and are removed from the training
% data
nans = find(isnan(g));
if length(nans) > 0
training(nans,:) = [];
g(nans) = [];
end
ngroups = length(groupString);

if ngroups > 2
error('Bioinfo:svmtrain:TooManyGroups',...
'SVMTRAIN only supports classification into two groups.\nGROUP contains %d different groups.',ngroups)
end
% convert to 1, -1.
g = 1 - (2* (g-1));

% handle optional arguments

if numoptargs >= 1
if rem(numoptargs,2)== 1
error('Bioinfo:svmtrain:IncorrectNumberOfArguments',...
'Incorrect number of arguments to %s.',mfilename);
end
okargs = {'kernel_function','method','showplot','kfunargs','quadprog_opts','polyorder','mlp_params'};
for j=1:2:numoptargs
pname = optargs{j};
pval = optargs{j+1};
k = strmatch(lower(pname), okargs);%#ok
if isempty(k)
error('Bioinfo:svmtrain:UnknownParameterName',...
'Unknown parameter name: %s.',pname);
elseif length(k)>1
error('Bioinfo:svmtrain:AmbiguousParameterName',...
'Ambiguous parameter name: %s.',pname);
else
switch(k)
case 1 % kernel_function
if ischar(pval)
okfuns = {'linear','quadratic',...
'radial','rbf','polynomial','mlp'};
funNum = strmatch(lower(pval), okfuns);%#ok
if isempty(funNum)
funNum = 0;
end
switch funNum %maybe make this less strict in the future
case 1
kfun = @linear_kernel;
case 2
kfun = @quadratic_kernel;
case {3,4}
kfun = @rbf_kernel;
case 5
kfun = @poly_kernel;
usePoly = true;
case 6
kfun = @mlp_kernel;
useMLP = true;
otherwise
error('Bioinfo:svmtrain:UnknownKernelFunction',...
'Unknown Kernel Function %s.',pval);
end
elseif isa (pval, 'function_handle')
kfun = pval;
else
error('Bioinfo:svmtrain:BadKernelFunction',...
'The kernel function input does not appear to be a function handle\nor valid function name.')
end
case 2 % method
if strncmpi(pval,'qp',2)
useQuadprog = true;
if isempty(which('quadprog'))
warning('Bioinfo:svmtrain:NoOptim',...
'The Optimization Toolbox is required to use the quadratic programming method.')
useQuadprog = false;
end
elseif strncmpi(pval,'ls',2)
useQuadprog = false;
else
error('Bioinfo:svmtrain:UnknownMethod',...
'Unknown method option %s. Valid methods are ''QP'' and ''LS''',pval);

end
case 3 % display
if pval ~= 0
if size(training,2) == 2
plotflag = true;
else
warning('Bioinfo:svmtrain:OnlyPlot2D',...
'The display option can only plot 2D training data.')
end

end
case 4 % kfunargs
if iscell(pval)
kfunargs = pval;
else
kfunargs = {pval};
end
case 5 % quadprog_opts
if isstruct(pval)
qp_opts = pval;
elseif iscell(pval)
qp_opts = optimset(pval{:});
else
error('Bioinfo:svmtrain:BadQuadprogOpts',...
'QUADPROG_OPTS must be an opts structure.');
end
case 6 % polyorder
if ~isscalar(pval) || ~isnumeric(pval)
error('Bioinfo:svmtrain:BadPolyOrder',...
'POLYORDER must be a scalar value.');
end
if pval ~=floor(pval) || pval < 1
error('Bioinfo:svmtrain:PolyOrderNotInt',...
'The order of the polynomial kernel must be a positive integer.')
end
kfunargs = {pval};
setPoly = true;

case 7 % mlpparams
if numel(pval)~=2
error('Bioinfo:svmtrain:BadMLPParams',...
'MLP_PARAMS must be a two element array.');
end
if ~isscalar(pval(1)) || ~isscalar(pval(2))
error('Bioinfo:svmtrain:MLPParamsNotScalar',...
'The parameters of the multi-layer perceptron kernel must be scalar.');
end
kfunargs = {pval(1),pval(2)};
setMLP = true;
end
end
end
end
if setPoly && ~usePoly
warning('Bioinfo:svmtrain:PolyOrderNotPolyKernel',...
'You specified a polynomial order but not a polynomial kernel');
end
if setMLP && ~useMLP
warning('Bioinfo:svmtrain:MLPParamNotMLPKernel',...
'You specified MLP parameters but not an MLP kernel');
end
% plot the data if requested
if plotflag
[hAxis,hLines] = svmplotdata(training,g);
legend(hLines,cellstr(groupString));
end

% calculate kernel function
try
kx = feval(kfun,training,training,kfunargs{:});
% ensure function is symmetric
kx = (kx+kx')/2;
catch
error('Bioinfo:svmtrain:UnknownKernelFunction',...
'Error calculating the kernel function:\n%s\n', lasterr);
end
% create Hessian
% add small constant eye to force stability
H =((g*g').*kx) + sqrt(eps(class(training)))*eye(n);

if useQuadprog
% The large scale solver cannot handle this type of problem, so turn it
% off.
qp_opts = optimset(qp_opts,'LargeScale','Off');
% X=QUADPROG(H,f,A,b,Aeq,beq,LB,UB,X0,opts)
alpha = quadprog(H,-ones(n,1),[],[],...
g',0,zeros(n,1),inf *ones(n,1),zeros(n,1),qp_opts);

% The support vectors are the non-zeros of alpha
svIndex = find(alpha > sqrt(eps));
sv = training(svIndex,:);

% calculate the parameters of the separating line from the support
% vectors.
alphaHat = g(svIndex).*alpha(svIndex);

% Calculate the bias by applying the indicator function to the support
% vector with largest alpha.
[maxAlpha,maxPos] = max(alpha); %#ok
bias = g(maxPos) - sum(alphaHat.*kx(svIndex,maxPos));
% an alternative method is to average the values over all support vectors
% bias = mean(g(sv)' - sum(alphaHat(:,ones(1,numSVs)).*kx(sv,sv)));

% An alternative way to calculate support vectors is to look for zeros of
% the Lagrangians (fifth output from QUADPROG).
%
% [alpha,fval,output,exitflag,t] = quadprog(H,-ones(n,1),[],[],...
% g',0,zeros(n,1),inf *ones(n,1),zeros(n,1),opts);
%
% sv = t.lower < sqrt(eps) & t.upper < sqrt(eps);
else % Least-Squares
% now build up compound matrix for solver

A = [0 g';g,H];
b = [0;ones(size(g))];
x = A\b;

% calculate the parameters of the separating line from the support
% vectors.
sv = training;
bias = x(1);
alphaHat = g.*x(2:end);
end

svm_struct.SupportVectors = sv;
svm_struct.Alpha = alphaHat;
svm_struct.Bias = bias;
svm_struct.KernelFunction = kfun;
svm_struct.KernelFunctionArgs = kfunargs;
svm_struct.GroupNames = groupnames;
svm_struct.FigureHandles = [];
if plotflag
hSV = svmplotsvs(hAxis,svm_struct);
svm_struct.FigureHandles = {hAxis,hLines,hSV};
end

This answer was accepted by the asker.
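The least-squares branch of the listing can be tried in isolation. The following is only a minimal sketch, not the toolbox code: the four data points, the labels and the kernel width `sigma` are made-up assumptions, and the Gaussian kernel is written as an anonymous function that captures `sigma`, in the spirit of the parameterized-kernel note in the help text. It uses base MATLAB functions only.

```matlab
% Toy 2-D data (made-up values): two points per class
training = [0 0; 0 1; 3 3; 4 3];
g = [1; 1; -1; -1];                 % labels already coded as +1 / -1
n = size(training, 1);

% Parameterized Gaussian kernel; sigma is a hypothetical width.
% For U (M-by-d) and V (N-by-d) the result K is M-by-N.
sigma = 1;
kfun = @(u, v) exp(-(bsxfun(@plus, sum(u.^2, 2), sum(v.^2, 2)') ...
                     - 2 * (u * v')) / (2 * sigma^2));

kx = kfun(training, training);
kx = (kx + kx') / 2;                         % enforce symmetry
H = ((g * g') .* kx) + sqrt(eps) * eye(n);   % small ridge for stability

% Compound linear system of the least-squares method
A = [0 g'; g, H];
b = [0; ones(n, 1)];
x = A \ b;
bias = x(1);
alphaHat = g .* x(2:end);

% Decision values on the training points: sum_j alphaHat_j K(x_j, x_i) + bias
f = kfun(training, training) * alphaHat + bias;
```

With the least-squares method every training point acts as a support vector, so `f` reproduces the labels `g` almost exactly on the training set; the tiny difference comes from the `sqrt(eps)` ridge term.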

Support vector machines

Reference answer A

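These notes are mainly based on Li Hang's "Statistical Learning Methods" (《统计学习方法》) and record what I learned while studying support vector machines.
They start with a brief introduction to support vector machines and then cover the following three models:
(1) Linearly separable support vector machine: also called the hard-margin support vector machine. It learns a linear classifier by hard-margin maximization and suits linearly separable data.
(2) Linear support vector machine: also called the soft-margin support vector machine. It learns a linear classifier by soft-margin maximization and suits approximately linearly separable data.
(3) Nonlinear support vector machine: it learns a nonlinear classifier with the kernel trick and soft-margin maximization and suits data that are not linearly separable.
For each of the three models, the notes derive the dual problem from the primal problem. Once the dual problem is available, the model parameters are solved with the SMO algorithm. If there is a chance later, a follow-up will describe how the parameters of a support vector machine are learned and trained with SMO.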

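What it means for two sets of data to be linearly separable is not repeated here; look up "linearly separable" if the term is unfamiliar. The goal of support vector machine learning is to find a hyperplane that separates the two classes of data. This hyperplane can be written as:

$$w \cdot x + b = 0$$

In practice, however, the model that we can fit from a given linearly separable data set is:

$$w^* \cdot x + b^* = 0$$

Here the starred $w^*$ and $b^*$ are the parameters of the hyperplane model. The star marks them as empirical values, or estimates, learned from the data set. These two parameters are the only difference from the theoretical model. With enough data, the empirical values are approximately equal to the theoretical ones.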

Why introduce a margin at all? And why is there a geometric margin in addition to the functional margin?

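A margin is the distance between a sample point and the separating hyperplane. The goal of support vector machine learning is to maximize this margin.
The final aim of learning is a hyperplane that separates the data. Separating the data is not enough, though: the separating hyperplane should also classify correctly and with sufficient confidence.
Suppose we have obtained a hyperplane $w \cdot x + b = 0$. For a point $(x_i, y_i)$ we can use $y_i(w \cdot x_i + b)$ to express both the correctness and the confidence of the classification. The sign of $y_i(w \cdot x_i + b)$ describes correctness; the magnitude $|w \cdot x_i + b|$ describes confidence.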

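We use $\hat{\gamma}_i$ to denote the functional margin between the $i$-th sample and the hyperplane:

$$\hat{\gamma}_i = y_i (w \cdot x_i + b)$$

When defining and searching for the hyperplane, we look for the smallest functional margin over the training set, that is:

$$\hat{\gamma} = \min_{i=1,\dots,N} \hat{\gamma}_i$$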

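Without further ado, here is the expression for the geometric margin. The reason for introducing it comes afterwards, so that the explanation does not turn into a confusing wall of text.

$$\gamma_i = y_i \left( \frac{w}{\|w\|} \cdot x_i + \frac{b}{\|w\|} \right)$$

Compared with the functional margin, the geometric margin divides the parameters $w$ and $b$ by $\|w\|$. Why? Because $w$ and $b$ need to be constrained. Without a constraint, multiplying $w$ and $b$ by the same factor leaves the hyperplane unchanged (the common factor cancels), but the functional margin grows by that factor, so it becomes an unreliable measure of confidence. We therefore normalize the functional margin, which gives the geometric margin.
When defining and searching for the hyperplane, we look for the smallest geometric margin over the training set, that is:

$$\gamma = \min_{i=1,\dots,N} \gamma_i$$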

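Relation between the functional margin and the geometric margin:

$$\gamma_i = \frac{\hat{\gamma}_i}{\|w\|}, \qquad \gamma = \frac{\hat{\gamma}}{\|w\|}$$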
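The difference between the two margins is easy to check numerically. The sketch below assumes a made-up three-point data set and a hypothetical hyperplane $(w, b)$; it computes the functional margin $y_i(w \cdot x_i + b)$ and the geometric margin (the same quantity divided by $\|w\|$), then doubles $w$ and $b$ to show that only the functional margin changes.

```matlab
% Made-up toy data: rows of X are samples, y holds the labels
X = [3 3; 4 3; 1 1];
y = [1; 1; -1];

% Hypothetical separating hyperplane w . x + b = 0
w = [1; 1];
b = -4;

gammaHat_i = y .* (X * w + b);      % functional margin of each sample
gamma_i    = gammaHat_i / norm(w);  % geometric margin of each sample
gammaHat   = min(gammaHat_i);       % functional margin of the data set
gamma      = min(gamma_i);          % geometric margin of the data set

% Scale w and b by 2: same hyperplane, different functional margin
w2 = 2 * w;
b2 = 2 * b;
gammaHat2 = min(y .* (X * w2 + b2));
gamma2    = gammaHat2 / norm(w2);
```

Here `gammaHat2` is twice `gammaHat`, while `gamma2` equals `gamma` up to rounding, which is the reason the optimization is stated in terms of the geometric margin.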

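The goal of support vector machine learning is to find the separating hyperplane that divides the data set correctly and has the largest geometric margin. With an objective and constraints, this can be written as a constrained optimization problem. In terms of the geometric margin:

$$\max_{w,b} \ \gamma \quad \text{s.t.} \quad y_i \left( \frac{w}{\|w\|} \cdot x_i + \frac{b}{\|w\|} \right) \ge \gamma, \quad i = 1, \dots, N$$

In terms of the functional margin:

$$\max_{w,b} \ \frac{\hat{\gamma}}{\|w\|} \quad \text{s.t.} \quad y_i (w \cdot x_i + b) \ge \hat{\gamma}, \quad i = 1, \dots, N$$

To put the problem in a form that is easier to optimize, we keep the constraints and apply two mathematical tricks. First, set $\hat{\gamma} = 1$, which is allowed because rescaling $w$ and $b$ changes the functional margin but not the hyperplane. Second, note that maximizing $1/\|w\|$ is equivalent to minimizing $\frac{1}{2}\|w\|^2$, which turns the maximization into a minimization. This gives the final optimization problem of the linearly separable support vector machine:

$$\min_{w,b} \ \frac{1}{2}\|w\|^2 \quad \text{s.t.} \quad y_i (w \cdot x_i + b) - 1 \ge 0, \quad i = 1, \dots, N$$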

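After solving for the optimum $w^*, b^*$, we obtain the separating hyperplane:

$$w^* \cdot x + b^* = 0$$

The decision function for classifying a new sample is:

$$f(x) = \operatorname{sign}(w^* \cdot x + b^*)$$

In other words, substitute the feature values of the new sample into the expression and classify it according to the sign of the result.
Here the function $\operatorname{sign}$ is:

$$\operatorname{sign}(z) = \begin{cases} +1, & z \ge 0 \\ -1, & z < 0 \end{cases}$$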
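A minimal sketch of such a decision function follows. The values of `wStar` and `bStar` are assumptions chosen for illustration, not results from the text. Note that MATLAB's built-in `sign` returns 0 for an input of 0, so the sketch defines its own two-valued helper `sgn`, which assigns points on the hyperplane to the positive class.

```matlab
% Hypothetical learned parameters of the separating hyperplane
wStar = [0.5; 0.5];
bStar = -2;

% Two-valued sign: +1 for z >= 0, -1 for z < 0 (built-in sign(0) is 0)
sgn = @(z) 2 * (z >= 0) - 1;

% Decision function; each row of x is one sample
f = @(x) sgn(x * wStar + bStar);

labels = f([3 3; 1 1; 2 2]);   % the third point lies exactly on the hyperplane
```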

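Primal problem: the optimization problem of the linearly separable support vector machine

$$\min_{w,b} \ \frac{1}{2}\|w\|^2 \quad \text{s.t.} \quad y_i (w \cdot x_i + b) - 1 \ge 0, \quad i = 1, \dots, N$$

To derive its dual problem, we construct a Lagrangian with multipliers $\alpha_i \ge 0$:

$$L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{N} \alpha_i y_i (w \cdot x_i + b) + \sum_{i=1}^{N} \alpha_i$$

By Lagrangian duality, the dual of the primal problem is a max-min problem:

$$\max_{\alpha} \ \min_{w,b} \ L(w, b, \alpha)$$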

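First solve the minimization, then the maximization.
(1) Minimization:
Take the partial derivatives of $L(w, b, \alpha)$ with respect to $w$ and $b$ and set them to zero:

$$\nabla_w L = w - \sum_{i=1}^{N} \alpha_i y_i x_i = 0, \qquad \nabla_b L = -\sum_{i=1}^{N} \alpha_i y_i = 0$$

that is, $w = \sum_{i=1}^{N} \alpha_i y_i x_i$ and $\sum_{i=1}^{N} \alpha_i y_i = 0$.

Substitute these two results back into $L(w, b, \alpha)$:

$$L(w, b, \alpha) = \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) - \sum_{i=1}^{N} \alpha_i y_i \left( \Big( \sum_{j=1}^{N} \alpha_j y_j x_j \Big) \cdot x_i + b \right) + \sum_{i=1}^{N} \alpha_i$$

This gives:

$$\min_{w,b} L(w, b, \alpha) = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^{N} \alpha_i$$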

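(2) Maximization:
Substituting the result of the previous step and adding the constraints gives:

$$\max_{\alpha} \ -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^{N} \alpha_i \quad \text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \quad \alpha_i \ge 0, \quad i = 1, \dots, N$$

Flipping the sign turns the maximization into an equivalent minimization:

$$\min_{\alpha} \ \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) - \sum_{i=1}^{N} \alpha_i \quad \text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \quad \alpha_i \ge 0, \quad i = 1, \dots, N$$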
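The dual objective in minimization form is a quadratic in $\alpha$ and can be evaluated with one matrix product. The sketch below assumes a made-up three-point data set and a candidate vector `alpha` chosen for illustration; it builds the matrix with entries $y_i y_j (x_i \cdot x_j)$, which has the same structure as `(g*g').*kx` in the `svmtrain` listing with a linear kernel, then evaluates the objective and checks the two constraints.

```matlab
% Made-up toy data: rows of X are samples, y holds the labels
X = [3 3; 4 3; 1 1];
y = [1; 1; -1];

% Candidate multipliers (assumed for illustration)
alpha = [0.25; 0; 0.25];

% Q(i,j) = y_i * y_j * (x_i . x_j)
Q = (y * y') .* (X * X');

% Dual objective in minimization form: 1/2 * alpha' Q alpha - sum(alpha)
dualObj = 0.5 * (alpha' * Q * alpha) - sum(alpha);

% Constraints: sum_i alpha_i y_i = 0 and alpha_i >= 0
feasible = abs(alpha' * y) < 1e-12 && all(alpha >= 0);
```

For this `alpha`, `dualObj` evaluates to -0.25 and both constraints hold.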

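This is the dual optimization problem. A parameter estimation method such as SMO is then used to solve for the parameters.
Solution of the primal problem
Suppose we have found the solution $\alpha^* = (\alpha_1^*, \dots, \alpha_N^*)^T$ of the dual problem. Then there exists an index $j$ with $\alpha_j^* > 0$, and the solution of the primal problem follows from the relations below (this is a theorem; for the proof see Li Hang's "Statistical Learning Methods"):

$$w^* = \sum_{i=1}^{N} \alpha_i^* y_i x_i, \qquad b^* = y_j - \sum_{i=1}^{N} \alpha_i^* y_i (x_i \cdot x_j)$$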
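Recovering the primal solution from a dual solution takes only a few lines: $w^*$ is the $\alpha$-weighted sum of $y_i x_i$, and $b^*$ is $y_j$ minus $\sum_i \alpha_i^* y_i (x_i \cdot x_j)$ for any index $j$ with $\alpha_j^* > 0$. In the sketch below the data set and the vector `alphaStar` are assumptions: `alphaStar` is taken as a given dual solution for this three-point toy set rather than computed by SMO.

```matlab
% Made-up toy data: rows of X are samples, y holds the labels
X = [3 3; 4 3; 1 1];
y = [1; 1; -1];

% Dual solution, assumed to be given (not computed here)
alphaStar = [0.25; 0; 0.25];

% w* = sum_i alpha_i* y_i x_i
wStar = X' * (alphaStar .* y);

% Pick any index j with alpha_j* > 0, then
% b* = y_j - sum_i alpha_i* y_i (x_i . x_j)
j = find(alphaStar > 0, 1);
bStar = y(j) - sum(alphaStar .* y .* (X * X(j, :)'));

% Margins y_i (w* . x_i + b*); support vectors sit at exactly 1
margins = y .* (X * wStar + bStar);
```

The samples with `alphaStar > 0` end up with a margin of exactly 1, and the remaining sample has a larger margin, which is consistent with the hard-margin constraints.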

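As stated at the beginning, the linear support vector machine handles classification of approximately linearly separable data. Building on the linearly separable support vector machine, we introduce a slack variable $\xi_i \ge 0$ for every sample in the data set and add a penalty term with parameter $C > 0$ to the objective. This changes the objective and the constraints, so the primal problem of the linear support vector machine becomes:

$$\min_{w,b,\xi} \ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \xi_i \quad \text{s.t.} \quad y_i (w \cdot x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, N$$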
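For a fixed hyperplane, the smallest slack that satisfies the soft-margin constraints is $\xi_i = \max(0, 1 - y_i(w \cdot x_i + b))$, so the penalized objective can be evaluated directly. The sketch below is only an illustration: the data (including a fourth point placed inside the margin), the hyperplane `w`, `b` and the penalty `C` are all made-up assumptions.

```matlab
% Made-up toy data; the last positive sample lies inside the margin
X = [3 3; 4 3; 1 1; 2.5 2.5];
y = [1; 1; -1; 1];

% Hypothetical hyperplane and penalty parameter
w = [0.5; 0.5];
b = -2;
C = 1;

% Smallest feasible slack for each sample
xi = max(0, 1 - y .* (X * w + b));

% Soft-margin primal objective: 1/2 ||w||^2 + C * sum(xi)
objective = 0.5 * (w' * w) + C * sum(xi);
```

Only the fourth sample needs a nonzero slack (0.5); a larger `C` would make that violation more expensive relative to the margin term.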

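Construct the Lagrangian from the primal problem, with multipliers $\alpha_i \ge 0$ and $\mu_i \ge 0$:

$$L(w, b, \xi, \alpha, \mu) = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \xi_i - \sum_{i=1}^{N} \alpha_i \big( y_i (w \cdot x_i + b) - 1 + \xi_i \big) - \sum_{i=1}^{N} \mu_i \xi_i$$

By Lagrangian duality, the dual of the primal problem is a max-min problem:

$$\max_{\alpha, \mu} \ \min_{w, b, \xi} \ L(w, b, \xi, \alpha, \mu)$$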

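(1) Minimization
Take the partial derivatives of $L(w, b, \xi, \alpha, \mu)$ with respect to $w$, $b$ and $\xi_i$ and set them to zero:

$$w = \sum_{i=1}^{N} \alpha_i y_i x_i, \qquad \sum_{i=1}^{N} \alpha_i y_i = 0, \qquad C - \alpha_i - \mu_i = 0$$

Substituting these results back into the Lagrangian gives:

$$\min_{w, b, \xi} L(w, b, \xi, \alpha, \mu) = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^{N} \alpha_i$$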

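(2) Maximization
The previous step gave the expression for the minimization; next we solve the maximization:

$$\max_{\alpha} \ -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^{N} \alpha_i \quad \text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \quad C - \alpha_i - \mu_i = 0, \quad \alpha_i \ge 0, \quad \mu_i \ge 0$$

In fact, the nonnegativity conditions simplify the constraints: since $\mu_i = C - \alpha_i \ge 0$ and $\alpha_i \ge 0$, we get $0 \le \alpha_i \le C$. This gives the final dual optimization problem of the linear support vector machine:

$$\min_{\alpha} \ \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) - \sum_{i=1}^{N} \alpha_i \quad \text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C, \quad i = 1, \dots, N$$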

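Solution of the primal problem
The solution of the primal problem is obtained as for the linearly separable support vector machine. Suppose we have found the solution $\alpha^* = (\alpha_1^*, \dots, \alpha_N^*)^T$ of the dual problem. Then there exists an index $j$ with $0 < \alpha_j^* < C$, and the solution of the primal problem follows from the relations below (this is also a theorem; for the proof see Li Hang's "Statistical Learning Methods"):

$$w^* = \sum_{i=1}^{N} \alpha_i^* y_i x_i, \qquad b^* = y_j - \sum_{i=1}^{N} \alpha_i^* y_i (x_i \cdot x_j)$$

The dual form of the decision function for classifying a new sample is:

$$f(x) = \operatorname{sign}\left( \sum_{i=1}^{N} \alpha_i^* y_i (x \cdot x_i) + b^* \right)$$

In other words, substitute the feature values of the new sample into the expression and classify it according to the sign of the result.
Here the function $\operatorname{sign}$ is:

$$\operatorname{sign}(z) = \begin{cases} +1, & z \ge 0 \\ -1, & z < 0 \end{cases}$$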

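The nonlinear support vector machine replaces the inner product between input vectors with a kernel function. This maps low-dimensional data that are not linearly separable to high-dimensional data that are, and a hyperplane then classifies the data in the high-dimensional space.

The optimization problem and the decision function above involve the input instances only through inner products, so the inner products can be replaced by a kernel function $K(x_i, x_j)$. This is how the kernel function maps the data to a high-dimensional space.
Replacing the inner products with a kernel function gives the dual optimization problem and the decision function of the nonlinear support vector machine.
Optimization problem:

$$\min_{\alpha} \ \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j) - \sum_{i=1}^{N} \alpha_i \quad \text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C, \quad i = 1, \dots, N$$

Decision function:

$$f(x) = \operatorname{sign}\left( \sum_{i=1}^{N} \alpha_i^* y_i K(x, x_i) + b^* \right)$$

When the kernel function is a positive definite kernel, the optimization problem is a convex quadratic program and a solution exists.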
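For a positive definite kernel, the Gram matrix built from any finite set of points is symmetric positive semidefinite, which is what makes the quadratic program convex. A quick numerical check is sketched below for a Gaussian kernel; the four sample points and the width `sigma` are made-up assumptions.

```matlab
% Made-up sample points (rows) and a hypothetical kernel width
X = [0 0; 1 0; 0 1; 1 1];
sigma = 1;

% Pairwise squared Euclidean distances: |xi|^2 + |xj|^2 - 2 xi . xj
sq = sum(X.^2, 2);
D2 = bsxfun(@plus, sq, sq') - 2 * (X * X');

% Gaussian (RBF) Gram matrix
K = exp(-D2 / (2 * sigma^2));

% Eigenvalues of the symmetrized Gram matrix; all should be nonnegative
ev = eig((K + K') / 2);
minEig = min(ev);
```

A clearly negative `minEig` would indicate that the chosen function is not a valid positive definite kernel for these points; the tanh (mlp) kernel, for example, is not guaranteed to pass this check.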

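To see why positive definiteness matters, first consider the motivation behind kernel functions. They were introduced to map low-dimensional data to a high-dimensional space so that a separating hyperplane can classify the data. However, we do not know what the high-dimensional space looks like after the mapping. At present it seems we can only establish which functions may be used as kernels; we cannot design a tailored kernel for each distribution of input data. In practice people simply try a variety of kernels, such as the Gaussian, polynomial, linear, sigmoid, Laplacian and string kernels.
Since we cannot design a suitable kernel for every input data set, we can at least discuss which functions qualify as kernels. So we settle for second best and, when time allows, look into why a kernel function must be positive definite, even though in practice we just try a few common kernels.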

Reference: https://blog.csdn.net/jiangjieqazwsx/article/details/51418681
