How to use a custom SVM kernel?

Posted 2023-02-23


【Title】: How to use a custom SVM kernel? 【Posted】: 2015-01-13 17:50:19 【Question】:

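I want to implement my own Gaussian kernel in Python, just as an exercise. I am using sklearn.svm.SVC(kernel=my_kernel), but I really don't understand what is going on.

I expected my_kernel to be called with individual samples of the X matrix as arguments; instead it is called with X, X as arguments. The examples I looked at did not make this any clearer.

What am I missing?

Here is my code: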

'''
Created on 15 Nov 2014

@author: Luigi
'''
import scipy.io
import numpy as np
from sklearn import svm
import matplotlib.pyplot as plt

def svm_class(fileName):

    data = scipy.io.loadmat(fileName)
    X = data['X']
    y = data['y']

    f = svm.SVC(kernel = 'rbf', gamma=50, C=1.0)
    f.fit(X,y.flatten())
    plotData(np.hstack((X,y)), X, f)

    return

def plotData(arr, X, f):

    ax = plt.subplot(111)

    ax.scatter(arr[arr[:,2]==0][:,0], arr[arr[:,2]==0][:,1], c='r', marker='o', label='Zero')
    ax.scatter(arr[arr[:,2]==1][:,0], arr[arr[:,2]==1][:,1], c='g', marker='+', label='One')

    h = .02  # step size in the mesh
    # create a mesh to plot in
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))


    # Plot the decision boundary. For that, we will assign a color to each
    # point in the mesh [x_min, x_max]x[y_min, y_max].
    Z = f.predict(np.c_[xx.ravel(), yy.ravel()])

    # Put the result into a color plot
    Z = Z.reshape(xx.shape)
    plt.contour(xx, yy, Z)



    plt.xlim(np.min(arr[:,0]), np.max(arr[:,0]))
    plt.ylim(np.min(arr[:,1]), np.max(arr[:,1]))
    plt.show()
    return


def gaussian_kernel(x1,x2):
    sigma = 0.5
    return np.exp(-np.sum((x1-x2)**2)/(2*sigma**2))

if __name__ == '__main__':

    fileName = 'ex6data2.mat'
    svm_class(fileName)

【Comments】:

【Answer 1】:

After reading the other answer, as well as some other questions and sites (1, 2, 3, 4, 5), I put this together for a Gaussian kernel in svm.SVC().

Call svm.SVC() with kernel="precomputed".

Then compute a Gram matrix, a.k.a. kernel matrix (often abbreviated as K).

Then pass this Gram matrix as the first argument (i.e. X) to svm.SVC().fit():

I started with the following code:

C=0.1
model = svmTrain(X, y, C, "gaussian")

svmTrain() calls sklearn.svm.SVC() and then sklearn.svm.SVC().fit():

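from sklearn import svm

# Signature inferred from the svmTrain(X, y, C, "gaussian") call above
def svmTrain(X, y, C, kernelFunction):
    # Only the Gaussian branch is shown here
    if kernelFunction == "gaussian":
        clf = svm.SVC(C=C, kernel="precomputed")
        return clf.fit(gaussianKernelGramMatrix(X, X), y)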

The Gram matrix, which is passed as the argument to sklearn.svm.SVC().fit(), is computed in gaussianKernelGramMatrix():

import numpy as np

def gaussianKernelGramMatrix(X1, X2, K_function=gaussianKernel):
    """(Pre)calculates Gram Matrix K"""

    gram_matrix = np.zeros((X1.shape[0], X2.shape[0]))
    for i, x1 in enumerate(X1):
        for j, x2 in enumerate(X2):
            gram_matrix[i, j] = K_function(x1, x2)
    return gram_matrix
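
The nested loops above make one Python-level call per pair of samples, which gets slow for larger data sets. As a rough sketch only, the same Gaussian Gram matrix can be computed in one shot with NumPy broadcasting. The function name gaussian_gram_matrix_vectorized is made up for illustration; it is not part of the original answer or of scikit-learn.

```python
import numpy as np

def gaussian_gram_matrix_vectorized(X1, X2, sigma=0.1):
    """Gaussian Gram matrix: K[i, j] = exp(-||X1[i] - X2[j]||^2 / (2 * sigma^2))."""
    X1 = np.asarray(X1, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b, evaluated for all pairs at once
    sq_dists = (
        np.sum(X1 ** 2, axis=1)[:, None]
        + np.sum(X2 ** 2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    # Round-off can make tiny distances slightly negative; clip them to zero
    np.maximum(sq_dists, 0.0, out=sq_dists)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))
```

The result has shape (X1.shape[0], X2.shape[0]), the same as the loop-based version, so it should be usable as a drop-in replacement when the kernel is fixed to the Gaussian.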

gaussianKernelGramMatrix() uses gaussianKernel() to get the radial basis function kernel between x1 and x2 (a measure of similarity based on a Gaussian distribution centered on x1 with sigma=0.1):

def gaussianKernel(x1, x2, sigma=0.1):

    # Flatten x1 and x2 to 1-D vectors
    x1 = x1.flatten()
    x2 = x2.flatten()

    sim = np.exp(- np.sum( np.power((x1 - x2),2) ) / float( 2*(sigma**2) ) )

    return sim

Then, once the model has been trained with this custom kernel, we predict using "the [custom] kernel between the test data and the training data":

predictions = model.predict( gaussianKernelGramMatrix(Xval, X) )
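
The shapes are the part that is easy to get wrong with kernel="precomputed": fit() expects a square (n_train, n_train) matrix, while predict() expects an (n_test, n_train) matrix whose columns line up with the training samples. The sketch below illustrates this on made-up random data; rbf_gram is a hypothetical helper written for this example, and sigma=1.0 is an arbitrary choice.

```python
import numpy as np
from sklearn import svm

def rbf_gram(A, B, sigma=1.0):
    # Hypothetical helper: pairwise squared distances via broadcasting,
    # giving a matrix of shape (len(A), len(B))
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 9))                      # 40 training samples, 9 features
y = (X[:, 0] > np.median(X[:, 0])).astype(int)    # made-up labels, both classes present
Xval = rng.normal(size=(15, 9))                   # 15 samples to predict

K_train = rbf_gram(X, X)      # shape (40, 40): train vs. train
K_val = rbf_gram(Xval, X)     # shape (15, 40): test vs. train

clf = svm.SVC(C=1.0, kernel="precomputed").fit(K_train, y)
pred = clf.predict(K_val)

# Passing raw features instead of a kernel matrix is rejected,
# because 9 columns do not match the 40 training samples
try:
    clf.predict(Xval)
    raised = False
except ValueError:
    raised = True
```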

In short, to use a custom Gaussian kernel with an SVM, you can use this snippet:

import numpy as np
from sklearn import svm

def gaussianKernelGramMatrixFull(X1, X2, sigma=0.1):
    """(Pre)calculates Gram Matrix K"""

    gram_matrix = np.zeros((X1.shape[0], X2.shape[0]))
    for i, x1 in enumerate(X1):
        for j, x2 in enumerate(X2):
            x1 = x1.flatten()
            x2 = x2.flatten()
            gram_matrix[i, j] = np.exp(- np.sum( np.power((x1 - x2),2) ) / float( 2*(sigma**2) ) )
    return gram_matrix

X=...
y=...
Xval=...

C=0.1
clf = svm.SVC(C = C, kernel="precomputed")
model = clf.fit( gaussianKernelGramMatrixFull(X,X), y )

p = model.predict( gaussianKernelGramMatrixFull(Xval, X) )
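
One way to sanity-check a hand-written Gaussian kernel is to compare it against scikit-learn's built-in kernel="rbf", which computes exp(-gamma * ||x - y||^2); the two should coincide when gamma = 1 / (2 * sigma**2). The following is a minimal sketch on made-up random data, using sklearn.metrics.pairwise.rbf_kernel to build the Gram matrices. The data, sigma=0.5, and C=1.0 are arbitrary assumptions.

```python
import numpy as np
from sklearn import svm
from sklearn.metrics.pairwise import rbf_kernel

sigma = 0.5
gamma = 1.0 / (2.0 * sigma ** 2)   # maps sigma to scikit-learn's gamma

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)   # XOR-like made-up labels
Xval = rng.normal(size=(20, 2))

# Model trained on a precomputed Gaussian Gram matrix
pre = svm.SVC(C=1.0, kernel="precomputed").fit(rbf_kernel(X, X, gamma=gamma), y)
# Model trained with the built-in RBF kernel on the raw features
builtin = svm.SVC(C=1.0, kernel="rbf", gamma=gamma).fit(X, y)

dec_pre = pre.decision_function(rbf_kernel(Xval, X, gamma=gamma))
dec_builtin = builtin.decision_function(Xval)

# Largest disagreement between the two decision functions
max_diff = float(np.max(np.abs(dec_pre - dec_builtin)))
print(max_diff)
```

If the printed difference is not close to zero, the custom kernel (or the sigma-to-gamma conversion) is probably not the Gaussian kernel you intended.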

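【Comments】:

What is your "Xval" in this case? Is that the training set used to run predictions? Also, since the inputs to "gaussianKernelGramMatrixFull" must have the same dimensions, do you have to resize X manually to generate the Gram matrix correctly?

@VinitNayak Not quite. Xval holds the values we want predictions for. We trained on the training set X with known labels y, but we don't know the labels for Xval, so we predict them.

@VinitNayak X and Xval do not need to be the same size. They only need the same number of columns (the n in an m by n matrix), just as you would expect of training data and prediction data (i.e. they need not have the same number of samples, but they must have the same number of features).

My X is a 622x9 matrix, y is a 622x1 matrix, and Xval (my cross-validation set) is 266x9. When I run the code above, I get the following error at prediction time: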

svm/base.py", line 455, in _validate_for_predict     (X.shape[1], self.shape_fit_[0]))  ValueError: X.shape[1] = 9 should be equal to 622, the number of samples at training time

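You have 622 samples and 9 features in your X? In that case you have 266 test data points, also with 9 features. You probably just need to transpose the array. In numpy you can do that with my_array.T.

【Answer 2】: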

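For efficiency reasons, SVC assumes that your kernel is a function that accepts two matrices of samples, X and Y (the two are identical only during training), and that returns a matrix G where: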

G_ij = K(X_i, Y_j)

and K is your "point-level" kernel function.

So either implement a Gaussian kernel that works in this generic way, or add a "proxy" function such as:

def proxy_kernel(X,Y,K):
    gram_matrix = np.zeros((X.shape[0], Y.shape[0]))
    for i, x in enumerate(X):
        for j, y in enumerate(Y):
            gram_matrix[i, j] = K(x, y)
    return gram_matrix

and use it like this:

from functools import partial
correct_gaussian_kernel = partial(proxy_kernel, K=gaussian_kernel)
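
To show how the pieces fit together, here is a self-contained sketch that passes the wrapped kernel to SVC as a callable. gaussian_kernel and proxy_kernel are repeated from the question and the answer so that the block runs on its own; the random data, the labels, and C=1.0 are assumptions made purely for illustration.

```python
from functools import partial

import numpy as np
from sklearn import svm

def gaussian_kernel(x1, x2, sigma=0.5):
    # Point-level kernel between two single samples
    return np.exp(-np.sum((x1 - x2) ** 2) / (2 * sigma ** 2))

def proxy_kernel(X, Y, K):
    # Matrix-level wrapper: gram[i, j] = K(X[i], Y[j])
    gram = np.zeros((X.shape[0], Y.shape[0]))
    for i, x in enumerate(X):
        for j, y in enumerate(Y):
            gram[i, j] = K(x, y)
    return gram

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))                      # made-up training data
y = (X[:, 0] > np.median(X[:, 0])).astype(int)    # made-up labels

clf = svm.SVC(kernel=partial(proxy_kernel, K=gaussian_kernel), C=1.0)
clf.fit(X, y)               # the kernel is evaluated on (X, X)
pred = clf.predict(X[:5])   # the kernel is evaluated on (X[:5], X)
```

With a callable kernel, SVC builds the Gram matrices itself, so fit() and predict() take the raw feature matrices rather than precomputed kernels.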

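【Comments】:

Sorry, I don't understand, probably because I am a beginner in ML. Isn't it enough to return a square matrix with n=n_rows(X), whose elements are all the combinations between X[:,0] and X[:,1]?

As I said, it will not always be called with two X's as arguments, so no. Besides, your data is multidimensional, so X[:,0] is just the first dimension of each vector, which is irrelevant here.

What should I return?

A Gram matrix, exactly as written in the answer: a matrix whose i-th row and j-th column holds the kernel value between the i-th vector of the first array and the j-th vector of the second array, so kernel(X,Y)_ij = K(X_i, Y_j).

@lejlot My question may be directly related, since I am using a self-computed kernel. Could you take a look at ***.com/questions/47564504/…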
