K-fold Train

# config.py
TRAINING_FILE = "../input/mnist_train_folds.csv"
MODEL_OUTPUT = "../models/"
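
The file named by TRAINING_FILE is expected to already carry a `kfold` column, which train.py uses to split rows into training and validation sets. How that file was produced is not shown here. The following is a minimal sketch of one common way to do it, assuming a `label` column and scikit-learn's StratifiedKFold. The helper name `create_folds`, the seed, and the tiny synthetic DataFrame are illustrative assumptions, not part of the original code.

```python
import pandas as pd
from sklearn import model_selection


def create_folds(df, n_splits=5, seed=42):
    """Return a shuffled copy of df with an integer `kfold` column (hypothetical helper)."""
    # shuffle the rows and reset the index so positions match index labels
    df = df.sample(frac=1, random_state=seed).reset_index(drop=True)
    df["kfold"] = -1
    kf = model_selection.StratifiedKFold(
        n_splits=n_splits, shuffle=True, random_state=seed
    )
    # each row is a validation row in exactly one fold; store that fold number
    for fold, (_, valid_idx) in enumerate(kf.split(X=df, y=df["label"].values)):
        df.loc[valid_idx, "kfold"] = fold
    return df


# tiny synthetic stand-in for the real data: 20 rows, 2 balanced classes
demo = pd.DataFrame({"pixel0": range(20), "label": [0, 1] * 10})
demo_folds = create_folds(demo)
print(demo_folds.kfold.value_counts().sort_index())
```

For the real data, one would read the raw CSV with pd.read_csv, pass it through `create_folds`, and write the result to the path in TRAINING_FILE with to_csv(..., index=False).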
===========================================================
# train.py
import os
import config
import joblib
import pandas as pd
from sklearn import metrics
from sklearn import tree


def run(fold):
    # read the training data with folds
    df = pd.read_csv(config.TRAINING_FILE)

    # training data is where kfold is not equal to provided fold
    # also, note that we reset the index
    df_train = df[df.kfold != fold].reset_index(drop=True)

    # validation data is where kfold is equal to provided fold
    df_valid = df[df.kfold == fold].reset_index(drop=True)

    # drop the label column (and the kfold column, which is a fold
    # index and not a feature) and convert the rest to a numpy
    # array by using .values.
    # target is label column in the dataframe
    x_train = df_train.drop(["label", "kfold"], axis=1).values
    y_train = df_train.label.values

    # similarly, for validation, we have
    x_valid = df_valid.drop(["label", "kfold"], axis=1).values
    y_valid = df_valid.label.values

    # initialize simple decision tree classifier from sklearn
    clf = tree.DecisionTreeClassifier()

    # fit the model on training data
    clf.fit(x_train, y_train)

    # create predictions for validation samples
    preds = clf.predict(x_valid)

    # calculate & print accuracy
    accuracy = metrics.accuracy_score(y_valid, preds)
    print(f"Fold={fold}, Accuracy={accuracy}")

    # save the model
    joblib.dump(
        clf,
        os.path.join(config.MODEL_OUTPUT, f"dt_{fold}.bin")
    )


if __name__ == "__main__":
    run(fold=0)
    run(fold=1)
    run(fold=2)
    run(fold=3)
    run(fold=4)
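
Each call to run reports the accuracy of a single fold; the usual cross-validation estimate is the mean of those scores, with the spread across folds as a rough stability check. A minimal self-contained sketch follows. It assumes scikit-learn's bundled digits dataset as a stand-in for the MNIST CSV, and a hypothetical `fold_accuracy` helper that returns the score instead of printing it and saving the model.

```python
import numpy as np
import pandas as pd
from sklearn import datasets, metrics, model_selection, tree

# stand-in data (assumption): scikit-learn's 8x8 digits, not the MNIST CSV
digits = datasets.load_digits()
df = pd.DataFrame(
    digits.data, columns=[f"pixel{i}" for i in range(digits.data.shape[1])]
)
df["label"] = digits.target

# assign a fold index to every row, stratified by label
df["kfold"] = -1
kf = model_selection.StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (_, valid_idx) in enumerate(kf.split(X=df, y=df["label"].values)):
    df.loc[valid_idx, "kfold"] = fold


def fold_accuracy(df, fold):
    """Train on every fold except `fold`; return validation accuracy (hypothetical helper)."""
    df_train = df[df.kfold != fold]
    df_valid = df[df.kfold == fold]
    # features are all columns except the target and the fold index
    features = [c for c in df.columns if c not in ("label", "kfold")]
    clf = tree.DecisionTreeClassifier(random_state=0)
    clf.fit(df_train[features].values, df_train.label.values)
    preds = clf.predict(df_valid[features].values)
    return metrics.accuracy_score(df_valid.label.values, preds)


scores = [fold_accuracy(df, fold) for fold in range(5)]
print(f"Mean accuracy={np.mean(scores):.4f}, std={np.std(scores):.4f}")
```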
