Machine Learning Notes (Washington University) - Clustering Specialization - Week Six


1. Hierarchical clustering

  • Avoid choosing number of clusters beforehand
  • Dendrograms help visualize different clustering granularities (no need to rerun algorithm)
  • Most algorithms allow the user to choose any distance metric (k-means restricted us to Euclidean distance)
  • Can often find more complex cluster shapes than k-means or Gaussian mixture models

Divisive (top-down):

Start with all data in one big cluster and recursively split it (e.g. recursive k-means). Choices to make:

  • which algorithm to recurse with
  • how many clusters to create per split
  • when to split vs. stop: max cluster size, max cluster radius, or a specified number of clusters
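
The three choices above can be made concrete in a small sketch. This is only an illustration under stated assumptions, not the course's implementation: the recursed algorithm is a tiny 2-means, each split produces two clusters, and the stopping rule is a maximum cluster radius. The helper names (`two_means`, `radius`, `divisive`) and the deterministic seeding are hypothetical choices made so the example is reproducible.

```python
from math import dist

def two_means(points, iters=20):
    """Split points into two groups with a small, deterministic 2-means."""
    # Assumption: seed with the first point and the point farthest from it.
    centers = [points[0], max(points, key=lambda p: dist(p, points[0]))]
    left, right = [], []
    for _ in range(iters):
        left, right = [], []
        for p in points:
            # Assign each point to the nearer of the two centers.
            if dist(p, centers[1]) < dist(p, centers[0]):
                right.append(p)
            else:
                left.append(p)
        if not left or not right:
            break
        # Move each center to the mean of its group.
        centers = [tuple(sum(c) / len(g) for c in zip(*g)) for g in (left, right)]
    return left, right

def radius(points):
    """Largest distance from the cluster mean to any member."""
    center = tuple(sum(c) / len(points) for c in zip(*points))
    return max(dist(p, center) for p in points)

def divisive(points, max_radius):
    """Recursively split until every cluster has radius <= max_radius."""
    if len(points) < 2 or radius(points) <= max_radius:
        return [points]
    left, right = two_means(points)
    if not left or not right:  # could not split any further
        return [points]
    return divisive(left, max_radius) + divisive(right, max_radius)

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (20.0, 0.0), (21.0, 0.0)]
clusters = divisive(points, max_radius=2.0)
for c in clusters:
    print(c)
```

Swapping the stopping rule for a maximum cluster size or a target number of clusters only changes the first `if` in `divisive`.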

 

Agglomerative (bottom-up):

Start with each data point as its own cluster, then merge clusters until all points are in one big cluster (e.g. single linkage).

Single linkage:

  • initialize each point to be its own cluster
  • define the distance between two clusters as the minimum distance between any point in cluster C1 and any point in cluster C2
  • merge the two closest clusters
  • repeat the merge step until all points are in one cluster
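
A naive sketch of these steps in plain Python, assuming Euclidean distance and small inputs (it rescans all cluster pairs on every merge, so it is far slower than what a library would do). The function name `single_linkage` and the sample points are made up for illustration.

```python
from itertools import combinations
from math import dist

def single_linkage(points):
    """Naive agglomerative clustering with single linkage.

    Returns the merge history as (members_a, members_b, distance) tuples,
    in the order the merges happened. Members are point indices.
    """
    # Every point starts as its own cluster.
    clusters = [[i] for i in range(len(points))]

    def linkage(pair):
        # Single linkage: distance of the closest pair across two clusters.
        a, b = pair
        return min(dist(points[i], points[j]) for i in a for j in b)

    merges = []
    while len(clusters) > 1:
        # Find the two closest clusters and merge them.
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((a, b, linkage((a, b))))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return merges

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.5), (10.0, 10.0)]
history = single_linkage(points)
for a, b, d in history:
    print(a, "+", b, "at distance", round(d, 3))
```

The merge distances come out in non-decreasing order, which is what lets the merge history be drawn as a dendrogram.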

 

Dendrogram

The x-axis shows the data points (carefully ordered).

The y-axis shows the distance between the pair of clusters joined at each merge.

The path upward from a point shows every cluster to which that point belongs and the order in which clusters merge.
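
To make this concrete, a dendrogram can be treated as nothing more than an ordered merge history. The sketch below hard-codes a hypothetical history for five points (the heights are illustrative, not computed here) and reads two things off it: the clusters obtained by cutting at a chosen height, and the path of a single point. The names `cut` and `path` are assumptions for this example only.

```python
# Hypothetical merge history: (members_a, members_b, merge height),
# listed in increasing height order as they would appear bottom to top.
merges = [
    ({0}, {1}, 1.0),
    ({2}, {3}, 1.5),
    ({0, 1}, {2, 3}, 5.0),
    ({4}, {0, 1, 2, 3}, 9.86),
]

def cut(n_points, merges, threshold):
    """Clusters obtained by cutting the dendrogram at `threshold`."""
    clusters = [{i} for i in range(n_points)]
    for a, b, height in merges:
        if height > threshold:
            break  # every later merge is at least this high
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a | b)
    return sorted(sorted(c) for c in clusters)

def path(point, merges):
    """Successive clusters containing `point`, from leaf to root."""
    steps = [[point]]
    for a, b, _ in merges:
        if point in a or point in b:
            steps.append(sorted(a | b))
    return steps

print(cut(5, merges, 2.0))  # finer granularity
print(cut(5, merges, 6.0))  # coarser granularity, same merge history
print(path(3, merges))
```

Changing the threshold gives a different granularity without rerunning the clustering, since only the stored merge history is consulted.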

 
