Saving and loading a KMeans clustering model with transfer learning

Posted 2023-03-12

【Title】: Saving and loading a KMeans clustering model with transfer learning 【Posted】: 2021-08-20 15:02:14 【Question】:

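I have the following KMeans clustering model in Python, fitted on roughly 5,000 unlabeled images. I want to save the model and load it in a separate Python script, then feed it a single image to see which cluster that image belongs to. The problem arises when saving and loading the model: I am not sure I am doing it correctly, since I simply saved kmodel to a pickle file.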

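import glob

import cv2
import numpy as np
import tensorflow as tf
from sklearn.cluster import KMeans

height = 299
width = 299
input_dir = "D:\\Glenn\\CNN\\Data\\Images"
glob_dir = input_dir + '/*.png'
# List the files once so that paths and images stay in the same order
paths = glob.glob(glob_dir)
# cv2.resize expects (width, height); cv2.imread returns BGR pixel data
images = [cv2.resize(cv2.imread(file), (width, height)) for file in paths]
images = np.array(images, dtype=np.float64)

# Xception without its classification head, used as a fixed feature extractor
model = tf.keras.applications.Xception(include_top=False, weights="imagenet", input_shape=(height, width, 3))
predictions = model.predict(images.reshape(-1, height, width, 3))
# Flatten each feature map into one vector per image
pred_images = predictions.reshape(images.shape[0], -1)

k = 30
# n_jobs was removed from KMeans in scikit-learn 1.0, so it is not passed here
kmodel = KMeans(n_clusters=k, random_state=728)
kmodel.fit(pred_images)
kpredictions = kmodel.predict(pred_images)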

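Overall, I am unsure whether what I am doing is even optimal. I found the best number of clusters with the silhouette method, but I am not sure how to save this model and load it in a separate Python file to predict which cluster a single image falls into.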
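One possible way to persist the fitted clusterer is sketched below. It is only an illustration under stated assumptions: random vectors stand in for the flattened Xception features (pred_images in the question), the cluster count is reduced to 5 to keep the example small, and the file name kmeans_xception.joblib is hypothetical. It uses joblib, which is installed alongside scikit-learn; the standard pickle module would work in much the same way.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.cluster import KMeans

# Assumption: random vectors stand in for the flattened Xception features.
rng = np.random.default_rng(728)
features = rng.normal(size=(200, 64)).astype(np.float32)

# Training script: fit once, then persist the fitted estimator.
kmodel = KMeans(n_clusters=5, n_init=10, random_state=728).fit(features)
model_path = os.path.join(tempfile.gettempdir(), "kmeans_xception.joblib")  # hypothetical file name
joblib.dump(kmodel, model_path)

# Inference script: load the estimator and assign a single new sample.
loaded = joblib.load(model_path)
new_feature = features[0].reshape(1, -1)  # one image -> shape (1, n_features)
cluster_id = int(loaded.predict(new_feature)[0])
print(cluster_id)
```

In a real inference script the single image would first have to go through the same resize step and the same Xception feature extractor used during training, and then be flattened to shape (1, n_features) before calling predict. The Xception network itself does not need to be pickled, since it can be re-created from the ImageNet weights. A saved estimator is best loaded with the same scikit-learn version that wrote it.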

【Comments】:

【Answer 1】:

You can extract topics from a trained tomotopy HDP model and then initialize the KMeans clusterer with the number of topics that HDP computed automatically.
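For comparison, the silhouette method mentioned in the question picks the cluster count directly from the feature vectors, without a topic model. The sketch below is an illustration only: it assumes scikit-learn and uses four synthetic, well-separated blobs in place of real image features, and the candidate range of 2 to 8 clusters is arbitrary.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Assumption: synthetic 2-D blobs stand in for real image feature vectors.
centers = [[0, 0], [10, 0], [0, 10], [10, 10]]
features, _ = make_blobs(n_samples=300, centers=centers, cluster_std=0.5, random_state=0)

# Fit KMeans for each candidate k and record the mean silhouette score.
scores = {}
for k in range(2, 9):
    labels = KMeans(n_clusters=k, n_init=10, random_state=728).fit_predict(features)
    scores[k] = silhouette_score(features, labels)

# The candidate with the highest mean silhouette score is selected.
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

Whichever method supplies the cluster count, the resulting number is simply passed to KMeans as n_clusters.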

【Comments】:
