R: principal component analysis and clustering
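# Principal component analysis
# ++++++++++++++++++++++++++++++
# factoextra provides the fviz_pca_*() functions;
# ggplot2 provides the themes and color scales used below
library(factoextra)
library(ggplot2)
data(iris)
res.pca <- prcomp(iris[, -5], scale. = TRUE)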
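# Optional check, base R only: how much variance does each component explain?
# A minimal sketch; pca_chk, eig and var_pct are illustrative names,
# not part of the original script.
pca_chk <- prcomp(iris[, -5], scale. = TRUE)
eig     <- pca_chk$sdev^2            # eigenvalues of the correlation matrix
var_pct <- 100 * eig / sum(eig)      # percentage of variance per component
print(round(cbind(eigenvalue = eig,
                  percent    = var_pct,
                  cumulative = cumsum(var_pct)), 2))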
# Graph of individuals
# +++++++++++++++++++++
# Default plot
fviz_pca_ind(res.pca, col.ind = "#00AFBB")
# 1. Color individuals automatically by their "cos2" values
# or by their contributions ("contrib").
# cos2 = quality of representation of the individuals on the factor map
# 2. To keep only points or only labels, use geom = "point" or geom = "text".
# 3. Change themes: http://www.sthda.com/english/wiki/ggplot2-themes
fviz_pca_ind(res.pca, col.ind = "cos2", geom = "point") +
  theme_minimal()
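# What "cos2" measures, sketched in base R. This is an illustration under the
# usual definition, not factoextra's internal code; pca_cos, sc and cos2_12
# are hypothetical names.
# For each individual: the squared distance to the origin captured by PC1-PC2,
# divided by its squared distance over all components.
pca_cos <- prcomp(iris[, -5], scale. = TRUE)
sc      <- pca_cos$x                               # scores on all 4 components
cos2_12 <- rowSums(sc[, 1:2]^2) / rowSums(sc^2)    # share captured by the PC1-PC2 plane
print(summary(cos2_12))                            # values lie between 0 and 1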
# Change gradient color
# Use repel = TRUE to avoid overplotting (slow if many points)
fviz_pca_ind(res.pca, col.ind = "cos2", repel = TRUE) +
  scale_color_gradient2(low = "white", mid = "#2E9FDF",
                        high = "#FC4E07", midpoint = 0.6, space = "Lab") +
  theme_minimal()
# Color individuals by groups, add concentration ellipses
# Remove labels: label = "none".
p <- fviz_pca_ind(res.pca, label = "none", habillage = iris$Species,
                  addEllipses = TRUE, ellipse.level = 0.95)
print(p)
# Change group colors using RColorBrewer color palettes
# Read more: http://www.sthda.com/english/wiki/ggplot2-colors
p + scale_color_brewer(palette = "Dark2") +
  scale_fill_brewer(palette = "Dark2") +
  theme_minimal()
# Change group colors manually
# Read more: http://www.sthda.com/english/wiki/ggplot2-colors
p + scale_color_manual(values = c("#999999", "#E69F00", "#56B4E9")) +
  scale_fill_manual(values = c("#999999", "#E69F00", "#56B4E9")) +
  theme_minimal()
# Select and visualize some individuals (ind) with select.ind argument.
# - ind with cos2 >= 0.96: select.ind = list(cos2 = 0.96)
# - Top 20 ind according to the cos2: select.ind = list(cos2 = 20)
# - Top 20 contributing individuals: select.ind = list(contrib = 20)
# - Select ind by names: select.ind = list(name = c("23", "42", "119") )
# Example: Select the top 40 according to the cos2
fviz_pca_ind(res.pca, select.ind = list(cos2 = 40))
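# Roughly what the cos2-based selections above amount to, in base R.
# A sketch with hypothetical names (pca_sel, cos2_sel, above_096, top40);
# factoextra's own selection may handle ties differently.
pca_sel   <- prcomp(iris[, -5], scale. = TRUE)
cos2_sel  <- rowSums(pca_sel$x[, 1:2]^2) / rowSums(pca_sel$x^2)
above_096 <- which(cos2_sel >= 0.96)                   # threshold-style selection
top40     <- order(cos2_sel, decreasing = TRUE)[1:40]  # top-N-style selection
print(head(rownames(iris)[top40]))                     # names of the best-represented individuals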
# Graph of variables
# ++++++++++++++++++++++++++++
# Default plot
fviz_pca_var(res.pca, col.var = "steelblue") +
  theme_minimal()
# Control variable colors using their contributions
fviz_pca_var(res.pca, col.var = "contrib") +
  scale_color_gradient2(low = "white", mid = "blue",
                        high = "red", midpoint = 96, space = "Lab") +
  theme_minimal()
# Select variables with select.var argument
# You can select by contrib, cos2 and name
# as previously described for ind
# Select the top 3 contributing variables
fviz_pca_var(res.pca, select.var = list(contrib = 3))
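# Where variable coordinates and per-component contributions come from,
# sketched in base R. Illustrative only; pca_var, var_coord, var_cos2 and
# var_contrib are hypothetical names. When several axes are combined,
# factoextra may weight and scale the values differently.
pca_var     <- prcomp(iris[, -5], scale. = TRUE)
# loadings scaled by the component standard deviations;
# with scaled data these are the variable-component correlations
var_coord   <- sweep(pca_var$rotation, 2, pca_var$sdev, "*")
var_cos2    <- var_coord^2
# percentage contribution of each variable to each component
var_contrib <- 100 * sweep(var_cos2, 2, colSums(var_cos2), "/")
print(round(var_contrib[, 1:2], 1))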
# Biplot of individuals and variables
# ++++++++++++++++++++++++++
fviz_pca_biplot(res.pca)
# Keep only the labels for variables
# Change the color by groups, add ellipses
fviz_pca_biplot(res.pca, label = "var", habillage = iris$Species,
                addEllipses = TRUE, ellipse.level = 0.95) +
  theme_minimal()