Putting permuted data into LibSVM precomputed kernel
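Posted: 2014-06-21 14:21:55

Question:

I am currently doing very simple SVM classification. I use a precomputed kernel in LibSVM, built from RBF and DTW.
When I compute the similarity (kernel) matrix, everything seems to work fine... until I permute my data and then compute the kernel matrix.
An SVM is, of course, invariant to permutations of its input data. In the Matlab code below, the relevant line is marked with '<- !!!!!!!!!!!'.
My csv files have the format LABEL, val1, val2, ..., valN, and all csv files are stored in the folder dirName. The string array therefore contains the entries '10_0.csv 10_1.csv .... 11_7.csv, 11_8.csv' (unpermuted), or some other order when permuted.
I also tried permuting the vector of sample serial numbers, but that makes no difference.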
function [SimilarityMatrixTrain, SimilarityMatrixTest, trainLabels, testLabels, PermSimilarityMatrixTrain, PermSimilarityMatrixTest, permTrainLabels, permTestLabels] = computeDistanceMatrix(dirName, verificationClass, trainFrac)
    fileList = getAllFiles(dirName);
    fileList = fileList(1:36);
    trainLabels = [];
    testLabels = [];
    trainFiles = {};
    testFiles = {};
    permTrainLabels = [];
    permTestLabels = [];
    permTrainFiles = {};
    permTestFiles = {};
    n = 0;
    sigma = 0.01;
    % Odd-numbered files are used for training, even-numbered ones for testing
    trainFiles = fileList(1:2:end);
    testFiles = fileList(2:2:end);
    rng(3);
    permTrain = randperm(length(trainFiles))
    %rng(3); <- !!!!!!!!!!!
    permTest = randperm(length(testFiles));
    permTrainFiles = trainFiles(permTrain)
    permTestFiles = testFiles(permTest);
    noTrain = numel(trainFiles);
    noTest = numel(testFiles);
    SimilarityMatrixTrain = eye(noTrain);
    PermSimilarityMatrixTrain = eye(noTrain);
    SimilarityMatrixTest = eye(noTest);
    PermSimilarityMatrixTest = eye(noTest);

    % UNPERM
    % Train: kernel between all pairs of training samples
    for i = 1 : noTrain
        x = csvread(trainFiles{i});
        label = x(1);                 % first csv entry is the label
        trainLabels = [trainLabels, label];
        for j = 1 : noTrain
            y = csvread(trainFiles{j});
            dtwDistance = dtwWrapper(x(2:end), y(2:end));
            rbfValue = exp((dtwDistance.^2)./(-2*sigma));
            SimilarityMatrixTrain(i, j) = rbfValue;
            n=n+1
        end
    end
    % LibSVM expects the sample serial number in the first column
    SimilarityMatrixTrain = [(1:size(SimilarityMatrixTrain, 1))', SimilarityMatrixTrain];

    % Test: kernel between pairs of test samples (as posted in the question)
    for i = 1 : noTest
        x = csvread(testFiles{i});
        label = x(1);
        testLabels = [testLabels, label];
        for j = 1 : noTest
            y = csvread(testFiles{j});
            dtwDistance = dtwWrapper(x(2:end), y(2:end));
            rbfValue = exp((dtwDistance.^2)./(-2*sigma));
            SimilarityMatrixTest(i, j) = rbfValue;
            n=n+1
        end
    end
    SimilarityMatrixTest = [(1:size(SimilarityMatrixTest, 1))', SimilarityMatrixTest];

    % PERM
    % Train: same computation on the permuted training files
    for i = 1 : noTrain
        x = csvread(permTrainFiles{i});
        label = x(1);
        permTrainLabels = [permTrainLabels, label];
        for j = 1 : noTrain
            y = csvread(permTrainFiles{j});
            dtwDistance = dtwWrapper(x(2:end), y(2:end));
            rbfValue = exp((dtwDistance.^2)./(-2*sigma));
            PermSimilarityMatrixTrain(i, j) = rbfValue;
            n=n+1
        end
    end
    PermSimilarityMatrixTrain = [(1:size(PermSimilarityMatrixTrain, 1))', PermSimilarityMatrixTrain];

    % Test: same computation on the permuted test files
    for i = 1 : noTest
        x = csvread(permTestFiles{i});
        label = x(1);
        permTestLabels = [permTestLabels, label];
        for j = 1 : noTest
            y = csvread(permTestFiles{j});
            dtwDistance = dtwWrapper(x(2:end), y(2:end));
            rbfValue = exp((dtwDistance.^2)./(-2*sigma));
            PermSimilarityMatrixTest(i, j) = rbfValue;
            n=n+1
        end
    end
    PermSimilarityMatrixTest = [(1:size(PermSimilarityMatrixTest, 1))', PermSimilarityMatrixTest];

    % '-t 4' selects the precomputed kernel
    mdlU = svmtrain(trainLabels', SimilarityMatrixTrain, '-t 4 -c 0.5');
    mdlP = svmtrain(permTrainLabels', PermSimilarityMatrixTrain, '-t 4 -c 0.5');
    [pclassU, xU, yU] = svmpredict(testLabels', SimilarityMatrixTest, mdlU);
    [pclassP, xP, yP] = svmpredict(permTestLabels', PermSimilarityMatrixTest, mdlP);
    % Echo the accuracy vectors of the unpermuted and permuted runs
    xU
    xP
end
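I would greatly appreciate any answers!
Regards, Benjamin

Comments:

Well, I was not sure whether this site is the right place for my question, so I decided to post it on stats.stackexchange.com as well (stats.stackexchange.com/questions/96452/…). Feel free to answer my question here or there. Dear moderators: if this is not appropriate, please feel free to delete my post. Thank you very much!

Answer 1:

After cleaning up the code and having one of my colleagues look it over, we (or rather he) finally found the bug. Of course, the test matrix has to be computed from the training and the test samples: the SVM predicts the test data using the sum over the products of the alpha values of the training vectors with the corresponding kernel values (the alphas are zero for non-support vectors). Every row of the test matrix therefore belongs to a test sample, and every column to a training sample. I hope this clarifies the problem for anyone else who runs into it. For more clarity, see the revised code below. In "using precomputed kernels with libsvm", for example, a sharp eye can also spot that the test matrix is computed with both the training and the test vectors. If you have any further comments, questions, or hints, feel free to add comments and/or answers to this post!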
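The reason can be read off the standard SVM decision function. The block below is a sketch in the usual textbook notation; the symbols $\alpha_i$, $y_i$, $b$, $N_{\text{train}}$ and $N_{\text{test}}$ are not taken from the post, and the kernel is written the way the revised code computes it (RBF over the DTW distance).

```latex
f(x) = \operatorname{sign}\!\Big( \sum_{i=1}^{N_{\text{train}}} \alpha_i \, y_i \, K(x_i^{\text{train}}, x) + b \Big),
\qquad
K(x, y) = \exp\!\Big( -\frac{\mathrm{DTW}(x, y)^2}{2\sigma^2} \Big)

% Entry (m, i) of the test matrix passed to svmpredict:
K^{\text{test}}_{m,i} = K\big(x_m^{\text{test}}, x_i^{\text{train}}\big),
\qquad K^{\text{test}} \in \mathbb{R}^{N_{\text{test}} \times N_{\text{train}}}
```

Since each $\alpha_i$ is tied to training sample $i$, column $i$ of the test matrix has to refer to that same training sample. A test-versus-test matrix pairs the alphas with unrelated samples, which would explain why the result changed with the order of the files.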
function [tacc, testacc, mdl, SimilarityMatrixTrain, SimilarityMatrixTest, trainLabels, testLabels] = computeSimilarityMatrix(dirName)
    fileList = getAllFiles(dirName);
    fileList = fileList(1:72);
    trainLabels = [];
    testLabels = [];
    sigma = 0.01;
    trainFiles = fileList(1:2:end);
    testFiles = fileList(2:5:end);
    noTrain = numel(trainFiles);
    noTest = numel(testFiles);
    % Permute both sets; the result no longer depends on the order
    permTrain = randperm(noTrain);
    permTest = randperm(noTest);
    trainFiles = trainFiles(permTrain);
    testFiles = testFiles(permTest);
    SimilarityMatrixTrain = zeros(noTrain, noTrain);
    SimilarityMatrixTest = zeros(noTest, noTrain);

    % Train: kernel between all pairs of training samples
    for i = 1 : noTrain
        x = csvread(trainFiles{i});
        label = x(1);                 % first csv entry is the label
        trainLabels = [trainLabels, label];
        for j = 1 : noTrain
            y = csvread(trainFiles{j});
            dtwDistance = dtwWrapper(x(2:end), y(2:end));
            rbfValue = exp((dtwDistance.^2)./(-2*sigma.^2));
            SimilarityMatrixTrain(i, j) = rbfValue;
        end
    end
    % LibSVM expects the sample serial number in the first column
    SimilarityMatrixTrain = [(1:size(SimilarityMatrixTrain, 1))', SimilarityMatrixTrain];

    % Test: rows are test samples, columns are TRAINING samples
    for i = 1 : noTest
        x = csvread(testFiles{i});
        label = x(1);
        testLabels = [testLabels, label];
        for j = 1 : noTrain
            y = csvread(trainFiles{j});
            dtwDistance = dtwWrapper(x(2:end), y(2:end));
            rbfValue = exp((dtwDistance.^2)./(-2*sigma.^2));
            SimilarityMatrixTest(i, j) = rbfValue;
        end
    end
    SimilarityMatrixTest = [(1:size(SimilarityMatrixTest, 1))', SimilarityMatrixTest];

    % '-t 4' selects the precomputed kernel
    mdlU = svmtrain(trainLabels', SimilarityMatrixTrain, '-t 4 -c 1000 -q');
    fprintf('TEST: ');  [pclassU, xU, yU] = svmpredict(testLabels', SimilarityMatrixTest, mdlU);
    fprintf('TRAIN: '); [pclassT, xT, yT] = svmpredict(trainLabels', SimilarityMatrixTrain, mdlU);
    tacc = xT(1);      % accuracy on the training set
    testacc = xU(1);   % accuracy on the test set
    mdl = mdlU;
end
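
The permutation argument can also be checked on toy data without LibSVM or the csv files. The script below is a minimal sketch under stated assumptions: the matrices Xtrain and Xtest, the fixed permutations, sigma = 1.0, and the helpers sqDist and rbf are all made up for illustration, and plain Euclidean distance stands in for dtwWrapper. It shows that permuting the samples only reorders the rows and columns of a test-versus-train kernel matrix.

```matlab
% Toy data: each row is one fixed-length series (stand-in for one csv file).
Xtrain = reshape(1:24, 6, 4) / 10;        % 6 hypothetical training samples
Xtest  = reshape(24:-1:13, 3, 4) / 10;    % 3 hypothetical test samples
sigma  = 1.0;                             % assumed kernel width

% RBF kernel between the rows of A and the rows of B.
% Euclidean distance is only a placeholder for the DTW distance.
sqDist = @(A, B) bsxfun(@plus, sum(A.^2, 2), sum(B.^2, 2)') - 2 * (A * B');
rbf    = @(A, B) exp(-sqDist(A, B) ./ (2 * sigma^2));

Ktrain = rbf(Xtrain, Xtrain);             % nTrain x nTrain
Ktest  = rbf(Xtest,  Xtrain);             % nTest x nTrain: test rows, TRAIN columns

% Fixed permutations of the training and test samples.
pTrain = [3 1 6 2 5 4];
pTest  = [2 3 1];

% Kernel matrices recomputed from the permuted data.
KtrainPerm = rbf(Xtrain(pTrain, :), Xtrain(pTrain, :));
KtestPerm  = rbf(Xtest(pTest, :),   Xtrain(pTrain, :));

% They match the original matrices with rows and columns reordered.
errTrain = max(max(abs(KtrainPerm - Ktrain(pTrain, pTrain))));
errTest  = max(max(abs(KtestPerm  - Ktest(pTest, pTrain))));
fprintf('max deviation, train: %g, test: %g\n', errTrain, errTest);
```

With LibSVM on the path, these matrices would be passed to svmtrain and svmpredict in the same way as in the function above, that is, with the serial-number column prepended and the option '-t 4'.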
Regards, Benjamin