How are the probabilities normalized in the one-vs-rest scheme of sklearn Logistic Regression?

Posted 2023-03-12


[Asked]: 2021-12-15 04:42:52

[Question]:

In the sklearn LogisticRegression classifier, we can set the multi_class option to 'ovr', which stands for one-vs-rest, as in the following code snippet:

# logistic regression for multi-class classification using built-in one-vs-rest
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
# define dataset
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, n_redundant=5, n_classes=3, random_state=1)
# define model
model = LogisticRegression(multi_class='ovr')
# fit model
model.fit(X, y)

Now this classifier can assign a probability to each class for a given instance:

# make predictions
yhat = model.predict_proba(X)

The probabilities for each instance sum to 1:

array([[0.16973178, 0.46755188, 0.36271634],
       [0.58228627, 0.0928127 , 0.32490103],
       [0.28241256, 0.51175978, 0.20582766],
       ...,
       [0.17922774, 0.71300755, 0.10776471],
       [0.05888508, 0.24924809, 0.69186683],
       [0.25808835, 0.68599321, 0.05591844]])

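My question: in the one-vs-rest approach, one classifier is trained per class, so we would expect the probability for each class to be independent of the others. How are the probabilities normalized so that they sum to 1?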


[Answer 1]:

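The probabilities are normalized by dividing by the row sum (that is, the sum of the class probabilities for each sample). This is the relevant line in the source code: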

prob /= prob.sum(axis=1).reshape((prob.shape[0], -1))
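To make the row-sum normalization concrete, here is a minimal sketch on a toy array. The values in `prob` are hypothetical per-class probabilities made up for illustration (they are not produced by any model), and `keepdims=True` is used as an equivalent shorthand for the `reshape` in the line above.

```python
import numpy as np

# Hypothetical outputs of three independent binary classifiers
# (one column per class, one row per sample); rows do NOT sum to 1.
prob = np.array([[0.2, 0.6, 0.4],
                 [0.9, 0.1, 0.5]])

# Divide every row by its own sum so that each row sums to 1.
normalized = prob / prob.sum(axis=1, keepdims=True)

print(normalized.sum(axis=1))  # each entry is 1 up to floating-point error
```

Because every row is divided by a single positive number, the ranking of the classes within a row is unchanged, so the predicted class (the argmax) is the same before and after normalization.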

The code below shows how to use this formula to replicate the model's output:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# generate some data
X, y = make_classification(n_classes=3, n_features=10, n_informative=5, n_redundant=5, n_samples=1000, random_state=1)

# fit the model
model = LogisticRegression(multi_class='ovr')
model.fit(X, y)

prob_pred = model.predict_proba(X)
print(prob_pred)
# [[0.16973178 0.46755188 0.36271634]
#  [0.58228627 0.0928127  0.32490103]
#  [0.28241256 0.51175978 0.20582766]
#  ...

class_pred = model.predict(X)
print(class_pred)
# [1 0 1 2 0 2 1 2 0 1 1 0 2 1 0 1 2 0 1 0 ...

# replicate the model's outputs
classes = np.unique(y)
n_classes = len(classes)
n_samples = len(y)

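prob_pred = np.zeros((n_samples, n_classes))

for i, c in enumerate(classes):
    # binary target: 1 for class c, 0 for every other class
    y_ = np.where(y == c, 1, 0)

    # fit one binary classifier for this class
    binary_model = LogisticRegression()
    binary_model.fit(X, y_)

    # keep the probability of the positive class (class c)
    prob_pred[:, i] = binary_model.predict_proba(X)[:, 1]

# normalize each row so that the class probabilities sum to 1
prob_pred /= prob_pred.sum(axis=1).reshape((prob_pred.shape[0], -1))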
print(prob_pred)
# [[0.16973178 0.46755188 0.36271634]
#  [0.58228627 0.0928127  0.32490103]
#  [0.28241256 0.51175978 0.20582766]
#  ...

class_pred = classes[np.argmax(prob_pred, axis=1)]
print(class_pred)
# [1 0 1 2 0 2 1 2 0 1 1 0 2 1 0 1 2 0 1 0 ...
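The same check can be sketched starting from the decision function instead of refitting by hand. This is only an illustration under stated assumptions: it uses `OneVsRestClassifier` around a default `LogisticRegression` as a stand-in for `multi_class='ovr'`, and it assumes each binary model's positive-class probability is the logistic sigmoid (`scipy.special.expit`) of its decision value.

```python
import numpy as np
from scipy.special import expit  # logistic sigmoid
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           n_redundant=5, n_classes=3, random_state=1)

# One binary LogisticRegression is fitted per class.
ovr = OneVsRestClassifier(LogisticRegression()).fit(X, y)

# One raw decision value per class, shape (n_samples, n_classes).
scores = ovr.decision_function(X)

# Un-normalized per-class probabilities: the rows do not sum to 1 in general.
raw = expit(scores)
raw_row_sums = raw.sum(axis=1)

# Row-sum normalization, as in the formula above.
manual = raw / raw.sum(axis=1, keepdims=True)

print(np.allclose(manual, ovr.predict_proba(X)))
```

Inspecting `raw_row_sums` shows how far the independent per-class probabilities are from summing to 1 before the division.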


[Answer 2]:

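As you can see here, the multi-class case is handled by normalizing, for an instance x, each class's score over all classes: the probability that the instance belongs to class k is the score for class k divided by the sum of the scores over all K classes. Each class's score is computed from f, the decision function, and K is the number of classes.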
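Written out as a formula, this can be sketched as follows. It assumes that each class's score is the logistic sigmoid of that class's decision function; the symbols $\sigma$ and $f_k$ are notation introduced here, chosen to agree with the row-sum normalization in Answer 1.

```latex
\[
P(y = k \mid x) = \frac{\sigma\big(f_k(x)\big)}{\sum_{j=1}^{K} \sigma\big(f_j(x)\big)},
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}
\]
```

Here $f_k$ is the decision function of the binary classifier for class $k$, so the numerator is that classifier's un-normalized probability and the denominator is the row sum over all $K$ classes.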

