cluster analysis in data mining
Posted rsapaper
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了cluster analysis in data mining相关的知识,希望对你有一定的参考价值。
https://en.wikipedia.org/wiki/K-means_clustering
k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells.
The problem is computationally difficult (NP-hard); however, there are efficient heuristic algorithms that are commonly employed and converge quickly to a local optimum. These are usually similar to the expectation-maximization algorithm for mixtures of Gaussian distributions via an iterative refinement approach employed by both algorithms. Additionally, they both use cluster centers to model the data; however, k-means clustering tends to find clusters of comparable spatial extent, while the expectation-maximization mechanism allows clusters to have different shapes.
The algorithm has a loose relationship to the k-nearest neighbor classifier, a popular machine learning technique for classification that is often confused with k-means because of the k in the name. One can apply the 1-nearest neighbor classifier on the cluster centers obtained by k-means to classify new data into the existing clusters. This is known as nearest centroid classifier or Rocchio algorithm[citation needed].
以上是关于cluster analysis in data mining的主要内容,如果未能解决你的问题,请参考以下文章
Python for Data Analysis | NumPy
Data Structures and Algorithm Analysis in C++书后的习题答案
《Data Structure and Algorithm Analysis in Java》第 3 章 - 表
数据结构与算法分析Data Structures and Algorithm Analysis in C++
Data Structures and Algorithm Analysis in C++书后的习题答案
Cytoscape 安装教程 | Network Data Integration, Analysis, and Visualization in a Box