How to make predictions using a K-Nearest Neighbors (KNN) model when the data has been normalized (Python)

Posted 2023-03-12

【Published】: 2020-07-16 05:08:28
【Question】:

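I built a KNN model in Python (module: scikit-learn) using three variables (age, distance, travel allowance) as predictors, with the aim of predicting the target variable (mode of travel).

When building the model, I had to standardize the data for the three predictors (age, distance, travel allowance). This improved the model's accuracy compared with leaving the data unnormalized.

Now that the model is built, I want to make predictions. However, since the model was trained on standardized data, how do I feed in predictor values to make a prediction?

I would like to call KNN.predict([[30,2000,40]]) to predict for age = 30, distance = 2000, allowance = 40. But because the data has been standardized, I can't work out how to do this. I used the following code to normalize the data: X = preprocessing.StandardScaler().fit(X).transform(X.astype(float))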

【Answer 1】:

The answer is actually in the code you provided!

Once an instance of preprocessing.StandardScaler() has been fitted, it remembers how to scale the data. Try this:

from sklearn import preprocessing

# Fit once and keep the scaler: it stores the per-feature mean and
# standard deviation of X, so it knows how to normalize data points.
scaler = preprocessing.StandardScaler().fit(X)
# Use the scaler to normalize the data points in X.
X_normalized = scaler.transform(X.astype(float))
# Note: this is what you already did, just split into two steps
# so that the fitted scaler object is captured.
#
# ... train your model on X_normalized
#
# Now predict: scale the new point with the same fitted scaler first.
other_data = [[30, 2000, 40]]
other_data_normalized = scaler.transform(other_data)
KNN.predict(other_data_normalized)

Note that I call scaler.transform the same way twice: once on the training data and once on the new data point.
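To make "remembers how to scale" concrete, here is a minimal sketch. The four training rows are made up for illustration and are not the asker's data. After fitting, the scaler exposes the per-column mean and standard deviation it learned, and transform applies (x - mean) / std with those stored values to whatever rows it is given.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical training rows: [age, distance, allowance]
X_train = np.array([
    [25, 1500, 30],
    [40, 3000, 55],
    [31,  800, 20],
    [52, 2500, 45],
], dtype=float)

scaler = StandardScaler().fit(X_train)

# Statistics learned during fit, one value per column
print("means:", scaler.mean_)
print("stds: ", scaler.scale_)

# A new point is scaled with the *training* statistics, not its own
new_point = np.array([[30, 2000, 40]], dtype=float)
scaled_by_sklearn = scaler.transform(new_point)
scaled_by_hand = (new_point - scaler.mean_) / scaler.scale_

print(scaled_by_sklearn)
print(scaled_by_hand)
```

The two printed arrays should agree, which is why the raw values 30, 2000 and 40 can be passed to the fitted scaler as they are.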

See StandardScaler.transform.
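For an end-to-end picture, the sketch below runs the same steps on synthetic data. The random features, the "car"/"walk" labels and n_neighbors=5 are all assumptions made for the example. It also shows an alternative that is not part of the answer above: wrapping the scaler and the classifier in a scikit-learn pipeline with make_pipeline, so raw values can be passed to predict directly and the scaling step cannot be forgotten.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the asker's data: [age, distance, allowance]
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.integers(18, 70, size=n),
    rng.uniform(100, 5000, size=n),
    rng.uniform(0, 100, size=n),
]).astype(float)
# Made-up target: travel mode depends only on distance
y = np.where(X[:, 1] > 2000, "car", "walk")

new_point = [[30, 2000, 40]]

# Approach from the answer: keep the fitted scaler and reuse it
scaler = StandardScaler().fit(X)
knn = KNeighborsClassifier(n_neighbors=5).fit(scaler.transform(X), y)
manual_pred = knn.predict(scaler.transform(new_point))

# Alternative: a pipeline scales inside fit() and predict()
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
model.fit(X, y)
pipeline_pred = model.predict(new_point)

print(manual_pred, pipeline_pred)
```

Both approaches fit the scaler on the same training data, so they should give the same prediction. The pipeline simply bundles the two steps into one object.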

【Comments】:

Hi Michael - this works perfectly, thank you! I didn't know Python remembered how the data was normalized.
