High-Dimensional Data | Data Visualization in R with t-SNE
Posted by 菜鸟学数据分析之R语言
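Visualizing high-dimensional data with the t-SNE algorithm
t-SNE (t-distributed stochastic neighbor embedding) is a relatively recent nonlinear dimensionality-reduction algorithm from machine learning. Like PCA, it is well suited to reducing high-dimensional data to two or three dimensions. The difference is that PCA is a linear method and cannot capture complex nonlinear relationships between variables, whereas t-SNE builds its embedding from t-distributed stochastic neighbor probabilities and so reveals the local structure of the data.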
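For comparison, the linear counterpart can be sketched in base R. This is only a minimal illustration, assuming the built-in iris data used later in this post and the standard stats::prcomp() function; the names iris_num, pca_out, pca_scores and var_explained are illustrative and do not come from the original post.

```r
# Linear baseline: PCA on the four numeric iris columns (base R only).
iris_num <- as.matrix(iris[, 1:4])

# Center and scale so that every measurement contributes comparably.
pca_out <- prcomp(iris_num, center = TRUE, scale. = TRUE)

# First two principal components: one 2-D coordinate pair per flower.
pca_scores <- pca_out$x[, 1:2]

# Proportion of variance captured by each component.
var_explained <- pca_out$sdev^2 / sum(pca_out$sdev^2)
print(round(var_explained, 3))
```

Plotting pca_scores next to the t-SNE coordinates is one way to see how the linear and nonlinear projections differ on the same data.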
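01
Raw data
#The raw data is the iris data frame. It gives the sepal length and width and the petal length and width for 50 flowers from each of three iris species (Iris setosa, Iris versicolor and Iris virginica). The data frame has 150 rows and 5 variables.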
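The data frame can be inspected directly in R. A minimal sketch using only base R functions; n_dup is an illustrative variable name, and counting duplicates is included because the next step removes them.

```r
# Inspect the built-in iris data frame (no extra packages needed).
str(iris)            # column types and a preview of the values
head(iris)           # first six rows
table(iris$Species)  # number of flowers per species

# Count rows that are exact duplicates of an earlier row.
n_dup <- sum(duplicated(iris))
n_dup
```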
02
Dimensionality reduction
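library(Rtsne)
#Load the Rtsne package, which provides Rtsne().
iris_unique<-unique(iris)
#Remove duplicate rows (by default Rtsne() stops with an error if duplicates are present).
set.seed(42)
#Fix the random seed so that the embedding is reproducible.
iris1<-as.matrix(iris_unique[,1:4])
#Take columns 1 to 4 and convert them to a matrix.
tsne_out<-Rtsne(iris1)
#Rtsne() is a wrapper around a C++ implementation of Barnes-Hut t-distributed stochastic neighbor embedding;
#setting theta = 0.0 computes exact t-SNE.
#All of the dimensionality reduction is done by Rtsne().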
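The call to Rtsne() relies entirely on default arguments. As a hedged sketch, the same call can be written with the main arguments spelled out, using the values that the printed result reports (perplexity 30, theta 0.5, max_iter 1000) plus dims = 2 for a two-dimensional map. The name tsne_explicit is illustrative, and the sketch assumes the Rtsne package is installed.

```r
library(Rtsne)

# Same preparation as in the post: drop duplicate rows, keep the 4 numeric columns.
iris_unique <- unique(iris)
iris1 <- as.matrix(iris_unique[, 1:4])

set.seed(42)  # t-SNE starts from a random layout, so fix the seed
tsne_explicit <- Rtsne(
  iris1,
  dims = 2,                # dimensionality of the output map
  perplexity = 30,         # roughly the effective number of neighbours per point
  theta = 0.5,             # Barnes-Hut speed/accuracy trade-off; 0.0 means exact t-SNE
  max_iter = 1000,         # number of optimisation iterations
  check_duplicates = TRUE  # stop with an error if duplicate rows remain
)

dim(tsne_explicit$Y)  # one row per flower, one column per map dimension
```

Perplexity is the argument most worth varying: different values can produce visibly different maps from the same data.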
tsne_out
$N
[1] 149
$Y
[,1] [,2]
-15.794362 -6.776711
-18.120432 -6.231470
-18.261085 -7.311696
-18.520943 -7.087130
-15.778549 -7.221474
-14.008673 -7.269801
-17.893455 -7.813756
-16.467086 -6.810327
-19.221721 -6.978521
-17.803392 -6.466310
-14.445038 -6.594416
-17.100410 -7.344027
-18.414497 -6.464928
-19.408668 -7.464577
-13.275994 -6.701498
-13.149006 -7.122149
-13.890901 -6.929687
-15.771478 -6.789918
-13.700649 -6.432417
-14.852568 -7.405885
-15.093413 -5.868684
-15.141230 -7.327606
-17.943436 -8.557303
-16.259062 -5.898315
-16.997972 -7.907640
-17.825508 -5.984987
-16.383265 -6.823896
-15.438452 -6.609476
-15.772932 -6.342601
-17.844418 -7.233536
-17.907978 -6.804983
-15.149819 -5.939007
-13.898201 -7.644515
-13.421065 -7.250096
-17.747953 -6.510444
-17.257790 -6.137830
-14.564020 -6.003143
-16.193628 -7.512172
-19.165844 -7.239552
-16.147220 -6.566834
-16.104934 -7.127514
-19.608322 -6.371928
-18.891659 -7.666698
-15.795457 -7.824185
-14.644010 -8.070843
-18.380163 -6.474333
-14.788982 -7.480182
-18.361689 -7.377014
-14.677585 -6.777023
-16.850026 -6.588789
8.536314 1.331853
7.160654 2.222958
8.845708 1.642313
2.600714 2.678237
7.608379 2.089299
4.581609 3.175618
7.227995 2.925564
1.259600 2.278412
7.658976 1.699396
2.519998 3.124515
1.356245 2.444995
4.855057 2.662532
3.194511 1.282723
6.573616 2.933744
2.654406 1.830627
7.646466 1.447916
4.740423 3.566360
3.545194 2.042330
5.763106 1.039076
2.780636 2.251854
7.123596 4.273298
4.397361 1.856958
8.586749 3.297060
6.268760 2.630621
6.569283 1.767931
7.376409 1.575341
8.456851 1.726707
9.372167 2.305879
5.871143 2.931851
2.242621 1.746196
2.373290 2.304550
2.174285 2.161862
3.296379 2.057194
8.559505 4.180703
4.351523 3.790885
6.254539 3.630555
8.222644 1.732763
5.693089 1.089601
3.861217 2.855112
2.883691 2.648004
3.519009 3.300865
6.352661 2.839296
3.341238 2.093332
1.310473 2.266729
3.544698 2.882386
4.147667 2.748773
4.098212 2.762276
5.786155 2.165470
1.204406 2.147812
3.726652 2.550643
12.868885 6.006482
8.275202 4.992602
13.818242 4.538085
11.395916 3.733291
12.611549 4.518063
15.167288 4.580621
3.256931 4.343911
14.657778 4.221338
12.606503 3.429156
14.293539 5.431041
10.913678 4.848214
10.620091 3.985814
12.455521 4.586778
8.165304 5.279075
8.537106 5.643321
11.603757 5.419617
11.522072 3.952109
15.250025 5.271586
15.472077 4.396878
8.917063 4.108094
13.199716 5.045357
7.799535 5.271964
15.305558 4.422065
8.684753 3.689291
12.898437 4.948861
14.177381 4.339674
8.090627 3.742130
7.812402 4.082107
11.927399 4.059812
14.012217 3.959194
14.560938 4.206181
15.236393 5.268181
12.010499 4.206740
8.927838 3.385948
10.179020 3.447383
14.868648 4.698355
12.240744 5.918564
11.418754 3.982833
7.492590 4.141120
12.415297 4.801756
12.650368 5.203968
11.761565 5.179988
13.361707 5.127742
12.955958 5.557022
11.713405 5.048448
9.150078 4.010551
11.016824 4.559839
11.759284 5.917160
[149,] 8.022603 4.638540
$costs
-6.569291e-05 -1.184407e-04 6.668903e-05 -4.284391e-04 1.331527e-05 -1.659447e-04
4.178618e-04 4.000721e-04 -1.471709e-04 -4.866961e-04 1.583424e-04 3.891458e-04
-2.748767e-04 1.038017e-05 -1.223165e-04 -2.175199e-04 -3.052060e-04 -1.153663e-04
-5.807774e-06 -1.318655e-05 1.086358e-04 -1.056506e-04 1.042230e-03 1.020565e-03
8.145448e-05 8.433778e-05 4.999846e-05 8.485467e-05 1.840803e-04 -3.276016e-04
-6.551815e-04 1.557659e-04 -4.736972e-06 -4.735442e-05 -2.520546e-04 3.108438e-04
1.260465e-04 1.972264e-04 -9.611292e-05 2.441147e-04 -1.997326e-04 -7.527818e-05
1.671165e-04 5.584101e-04 4.285945e-04 -2.968739e-04 -1.330250e-05 -1.149360e-04
8.084874e-05 6.710403e-04 6.919283e-04 9.766942e-04 1.960186e-03 -1.673100e-04
1.498811e-03 1.518804e-03 1.191833e-03 3.496391e-04 8.389463e-04 7.130234e-04
-5.238504e-05 1.329515e-03 1.051728e-03 2.541084e-03 -1.976349e-04 5.621641e-04
1.889274e-03 1.434286e-05 4.027971e-03 -1.166053e-04 1.719625e-03 1.638894e-03
2.473756e-03 1.286735e-03 1.833584e-03 8.951562e-04 6.807228e-04 4.984784e-03
2.823550e-03 1.089342e-04 -6.601256e-06 6.403744e-05 1.991636e-04 1.613454e-03
1.260980e-03 9.531492e-04 1.349265e-03 2.888901e-03 -1.345512e-04 2.987155e-06
5.545848e-04 2.071556e-03 3.427912e-04 3.104291e-04 5.055473e-04 2.853883e-05
5.868188e-04 2.521142e-03 5.766867e-04 3.513576e-04 3.311204e-04 1.479315e-03
1.033390e-03 1.737451e-03 5.759593e-04 2.587490e-04 1.289787e-03 4.355705e-04
1.261498e-03 4.912577e-04 3.050430e-03 3.313013e-03 9.174032e-04 1.021039e-03
1.433732e-03 1.278868e-03 1.815971e-03 2.059448e-04 5.057746e-05 1.561998e-03
5.342262e-04 1.744473e-03 2.214039e-04 2.083725e-03 3.547608e-04 5.196140e-04
1.990998e-03 2.346313e-03 1.098786e-03 8.136133e-04 5.001043e-04 1.644533e-04
6.233415e-04 2.139292e-03 8.210858e-04 -8.386480e-05 7.858205e-04 1.427453e-03
2.148709e-03 6.094865e-04 1.929748e-04 -8.357979e-05 6.223272e-04 3.127318e-04
[145] 2.927624e-04 1.391081e-03 3.127062e-03 1.176773e-03 1.645225e-03
$itercosts
43.7514985 44.7873147 44.8116650 44.3887944 45.7282669 0.3704256 0.1252816 0.1237133
0.1217102 0.1200852 0.1187576 0.1161445 0.1173155 0.1144428 0.1127897 0.1122483
[17] 0.1129056 0.1116092 0.1111795 0.1105687
$origD
[1] 4
$perplexity
[1] 30
$theta
[1] 0.5
$max_iter
[1] 1000
$stop_lying_iter
[1] 250
$mom_switch_iter
[1] 250
$momentum
[1] 0.5
$final_momentum
[1] 0.8
$eta
[1] 200
$exaggeration_factor
[1] 12
data<-data.frame(tsne_out$Y,iris_unique$Species)
data
X1 X2 iris_unique.Species
1 -15.794362 -6.776711 setosa
2 -18.120432 -6.231470 setosa
3 -18.261085 -7.311696 setosa
4 -18.520943 -7.087130 setosa
5 -15.778549 -7.221474 setosa
6 -14.008673 -7.269801 setosa
7 -17.893455 -7.813756 setosa
8 -16.467086 -6.810327 setosa
9 -19.221721 -6.978521 setosa
10 -17.803392 -6.466310 setosa
11 -14.445038 -6.594416 setosa
12 -17.100410 -7.344027 setosa
13 -18.414497 -6.464928 setosa
14 -19.408668 -7.464577 setosa
15 -13.275994 -6.701498 setosa
16 -13.149006 -7.122149 setosa
17 -13.890901 -6.929687 setosa
18 -15.771478 -6.789918 setosa
19 -13.700649 -6.432417 setosa
20 -14.852568 -7.405885 setosa
21 -15.093413 -5.868684 setosa
22 -15.141230 -7.327606 setosa
23 -17.943436 -8.557303 setosa
24 -16.259062 -5.898315 setosa
25 -16.997972 -7.907640 setosa
26 -17.825508 -5.984987 setosa
27 -16.383265 -6.823896 setosa
28 -15.438452 -6.609476 setosa
29 -15.772932 -6.342601 setosa
30 -17.844418 -7.233536 setosa
31 -17.907978 -6.804983 setosa
32 -15.149819 -5.939007 setosa
33 -13.898201 -7.644515 setosa
34 -13.421065 -7.250096 setosa
35 -17.747953 -6.510444 setosa
36 -17.257790 -6.137830 setosa
37 -14.564020 -6.003143 setosa
38 -16.193628 -7.512172 setosa
39 -19.165844 -7.239552 setosa
40 -16.147220 -6.566834 setosa
41 -16.104934 -7.127514 setosa
42 -19.608322 -6.371928 setosa
43 -18.891659 -7.666698 setosa
44 -15.795457 -7.824185 setosa
45 -14.644010 -8.070843 setosa
46 -18.380163 -6.474333 setosa
47 -14.788982 -7.480182 setosa
48 -18.361689 -7.377014 setosa
49 -14.677585 -6.777023 setosa
50 -16.850026 -6.588789 setosa
51 8.536314 1.331853 versicolor
52 7.160654 2.222958 versicolor
53 8.845708 1.642313 versicolor
54 2.600714 2.678237 versicolor
55 7.608379 2.089299 versicolor
56 4.581609 3.175618 versicolor
57 7.227995 2.925564 versicolor
58 1.259600 2.278412 versicolor
59 7.658976 1.699396 versicolor
60 2.519998 3.124515 versicolor
61 1.356245 2.444995 versicolor
62 4.855057 2.662532 versicolor
63 3.194511 1.282723 versicolor
64 6.573616 2.933744 versicolor
65 2.654406 1.830627 versicolor
66 7.646466 1.447916 versicolor
67 4.740423 3.566360 versicolor
68 3.545194 2.042330 versicolor
69 5.763106 1.039076 versicolor
70 2.780636 2.251854 versicolor
71 7.123596 4.273298 versicolor
72 4.397361 1.856958 versicolor
73 8.586749 3.297060 versicolor
74 6.268760 2.630621 versicolor
75 6.569283 1.767931 versicolor
76 7.376409 1.575341 versicolor
77 8.456851 1.726707 versicolor
78 9.372167 2.305879 versicolor
79 5.871143 2.931851 versicolor
80 2.242621 1.746196 versicolor
81 2.373290 2.304550 versicolor
82 2.174285 2.161862 versicolor
83 3.296379 2.057194 versicolor
84 8.559505 4.180703 versicolor
85 4.351523 3.790885 versicolor
86 6.254539 3.630555 versicolor
87 8.222644 1.732763 versicolor
88 5.693089 1.089601 versicolor
89 3.861217 2.855112 versicolor
90 2.883691 2.648004 versicolor
91 3.519009 3.300865 versicolor
92 6.352661 2.839296 versicolor
93 3.341238 2.093332 versicolor
94 1.310473 2.266729 versicolor
95 3.544698 2.882386 versicolor
96 4.147667 2.748773 versicolor
97 4.098212 2.762276 versicolor
98 5.786155 2.165470 versicolor
99 1.204406 2.147812 versicolor
100 3.726652 2.550643 versicolor
101 12.868885 6.006482 virginica
102 8.275202 4.992602 virginica
103 13.818242 4.538085 virginica
104 11.395916 3.733291 virginica
105 12.611549 4.518063 virginica
106 15.167288 4.580621 virginica
107 3.256931 4.343911 virginica
108 14.657778 4.221338 virginica
109 12.606503 3.429156 virginica
110 14.293539 5.431041 virginica
111 10.913678 4.848214 virginica
112 10.620091 3.985814 virginica
113 12.455521 4.586778 virginica
114 8.165304 5.279075 virginica
115 8.537106 5.643321 virginica
116 11.603757 5.419617 virginica
117 11.522072 3.952109 virginica
118 15.250025 5.271586 virginica
119 15.472077 4.396878 virginica
120 8.917063 4.108094 virginica
121 13.199716 5.045357 virginica
122 7.799535 5.271964 virginica
123 15.305558 4.422065 virginica
124 8.684753 3.689291 virginica
125 12.898437 4.948861 virginica
126 14.177381 4.339674 virginica
127 8.090627 3.742130 virginica
128 7.812402 4.082107 virginica
129 11.927399 4.059812 virginica
130 14.012217 3.959194 virginica
131 14.560938 4.206181 virginica
132 15.236393 5.268181 virginica
133 12.010499 4.206740 virginica
134 8.927838 3.385948 virginica
135 10.179020 3.447383 virginica
136 14.868648 4.698355 virginica
137 12.240744 5.918564 virginica
138 11.418754 3.982833 virginica
139 7.492590 4.141120 virginica
140 12.415297 4.801756 virginica
141 12.650368 5.203968 virginica
142 11.761565 5.179988 virginica
143 13.361707 5.127742 virginica
144 12.955958 5.557022 virginica
145 11.713405 5.048448 virginica
146 9.150078 4.010551 virginica
147 11.016824 4.559839 virginica
148 11.759284 5.917160 virginica
149 8.022603 4.638540 virginica
colnames(data)<-c("Y1","Y2","Species")
#Only the column names change; the values are the same as in the listing above, so the first rows are enough to check the result.
head(data)
Y1 Y2 Species
1 -15.794362 -6.776711 setosa
2 -18.120432 -6.231470 setosa
3 -18.261085 -7.311696 setosa
4 -18.520943 -7.087130 setosa
5 -15.778549 -7.221474 setosa
6 -14.008673 -7.269801 setosa
03
Plotting with ggplot2
library(ggplot2)
ggplot(data,aes(Y1,Y2,fill=Species))+geom_point(size=5.5,colour="black",alpha=0.6,shape=21)+scale_fill_manual(values=c("#00AFBB","#E7B800","blue"))
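Summary
Rtsne(): given a distance matrix D between the input objects (by default the Euclidean distance between each pair of objects), it computes the similarity scores p_ij in the original space. Pass the input as a numeric matrix, as done above with as.matrix().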
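The step from distances to similarity scores can be sketched in base R. This is a simplified illustration, not the code Rtsne() runs internally: it assumes a single fixed bandwidth sigma for all points, whereas t-SNE tunes a separate bandwidth per point to match the chosen perplexity. The names X, D, W, P_cond and P are illustrative.

```r
# Simplified illustration of turning distances into similarities p_ij.
# ASSUMPTION: one fixed bandwidth sigma for all points; real t-SNE searches
# for a separate sigma_i per point so that each row matches the perplexity.
X <- as.matrix(unique(iris)[, 1:4])
D <- as.matrix(dist(X))            # Euclidean distance matrix

sigma <- 1                         # hypothetical fixed bandwidth
W <- exp(-D^2 / (2 * sigma^2))     # Gaussian kernel on squared distances
diag(W) <- 0                       # a point is not its own neighbour

P_cond <- W / rowSums(W)           # conditional p_{j|i}: each row sums to 1
P <- (P_cond + t(P_cond)) / (2 * nrow(X))  # symmetrised joint p_ij

sum(P)  # the joint probabilities sum to 1, up to floating-point error
```

Nearby flowers get a large p_ij and distant ones a value close to zero, which is why t-SNE mainly preserves local neighbourhoods.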
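Limitations of t-SNE: if the original data is very high-dimensional, it cannot be mapped into two or three dimensions without losing information. In addition, distances in a t-SNE plot have no direct meaning, because the algorithm matches neighbor probability distributions rather than preserving the distances themselves.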
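One reason map distances are hard to interpret is that t-SNE measures similarity in the low-dimensional map with a heavy-tailed Student-t kernel (one degree of freedom) instead of a Gaussian. The sketch below compares the two unnormalised kernels at a few distances; the vector d and the unit Gaussian bandwidth are assumptions chosen purely for illustration.

```r
# Compare the Gaussian kernel (high-dimensional side) with the Student-t
# kernel (low-dimensional side) at a few illustrative distances.
d <- c(0.5, 1, 2, 4, 8)            # hypothetical distances

gauss   <- exp(-d^2 / 2)           # Gaussian kernel with sigma = 1 (assumed)
student <- 1 / (1 + d^2)           # Student-t kernel with 1 degree of freedom

kernel_table <- round(cbind(d, gauss, student), 4)
print(kernel_table)
```

At larger distances the Student-t value decays far more slowly than the Gaussian one, so points that are only moderately dissimilar can end up far apart in the map. This is part of why gaps between clusters should not be read as proportional to the original distances.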