/**
 * @author xubo
 * ref: Spark MLlib机器学习实战
 * more code: https://github.com/xubo245/SparkLearning
 * more blog: http://blog.csdn.net/xubo245
 */
package org.apache.spark.mllib.learning.basic

import org.apache.spark.mllib.linalg.{Matrix, Matrices, Vectors}
import org.apache.spark.mllib.stat.Statistics
import org.apache.spark.{SparkConf, SparkContext}

/**
 * Created by xubo on 2016/5/23.
 */
object ChiSqLearning {

  def main(args: Array[String]): Unit = {
    // Goodness-of-fit test on a vector of observed counts
    val vd = Vectors.dense(1, 2, 3, 4, 5)
    val vdResult = Statistics.chiSqTest(vd)
    println(vd)
    println(vdResult)
    println("-------------------------------")

    // Independence test on a 3 x 2 contingency matrix (values are column-major)
    val mtx = Matrices.dense(3, 2, Array(1, 3, 5, 2, 4, 6))
    val mtxResult = Statistics.chiSqTest(mtx)
    println(mtx)
    println(mtxResult)
    // prints: method, degrees of freedom, test statistic, p-value
    println("-------------------------------")

    val mtx2 = Matrices.dense(2, 2, Array(19.0, 34, 24, 10.0))
    printChiSqTest(mtx2)
    printChiSqTest(Matrices.dense(2, 2, Array(26.0, 36, 7, 2.0)))
    // val mtxResult2 = Statistics.chiSqTest(mtx2)
    // println(mtx2)
    // println(mtxResult2)
  }

  // Runs the independence test on a contingency matrix and prints the matrix and the result
  def printChiSqTest(matrix: Matrix): Unit = {
    println("-------------------------------")
    val mtxResult2 = Statistics.chiSqTest(matrix)
    println(matrix)
    println(mtxResult2)
  }
}
3. Results:
[1.0,2.0,3.0,4.0,5.0]
Chi squared test summary:
method: pearson
degrees of freedom = 4
statistic = 3.333333333333333
pValue = 0.5036682742334986
No presumption against null hypothesis: observed follows the same distribution as expected..
-------------------------------
1.0  2.0
3.0  4.0
5.0  6.0
Chi squared test summary:
method: pearson
degrees of freedom = 2
statistic = 0.14141414141414144
pValue = 0.931734784568187
No presumption against null hypothesis: the occurrence of the outcomes is statistically independent..
-------------------------------
-------------------------------
19.0  24.0
34.0  10.0
Chi squared test summary:
method: pearson
degrees of freedom = 1
statistic = 9.999815802502738
pValue = 0.0015655588405594223
Very strong presumption against null hypothesis: the occurrence of the outcomes is statistically independent..
-------------------------------
26.0  7.0
36.0  2.0
Chi squared test summary:
method: pearson
degrees of freedom = 1
statistic = 4.05869675818742
pValue = 0.043944401832082036
Strong presumption against null hypothesis: the occurrence of the outcomes is statistically independent..
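
To see where the `statistic` values above come from, the Pearson statistic can be recomputed by hand as the sum of (observed - expected)^2 / expected. The sketch below is plain Scala with no Spark dependency. `PearsonByHand`, `goodnessOfFit`, and `independence` are hypothetical names introduced only for this illustration and are not part of MLlib. It assumes a uniform expected distribution for the vector case, which is what `Statistics.chiSqTest` assumes when no expected vector is passed, and it takes each matrix in row order as printed above. It does not compute the p-value, which would require the chi-squared distribution function.

```scala
// Hypothetical helper for illustration only; not part of Spark MLlib.
object PearsonByHand {

  // Goodness of fit against a uniform expected distribution:
  // every category is expected to hold sum / n observations.
  def goodnessOfFit(observed: Array[Double]): Double = {
    val expected = observed.sum / observed.length
    observed.map(o => (o - expected) * (o - expected) / expected).sum
  }

  // Independence test on a contingency table given as an array of rows:
  // expected(i)(j) = rowSum(i) * colSum(j) / total.
  def independence(table: Array[Array[Double]]): Double = {
    val rowSums = table.map(_.sum)
    val colSums = table.transpose.map(_.sum)
    val total = rowSums.sum
    val terms = for {
      i <- table.indices
      j <- table(i).indices
    } yield {
      val expected = rowSums(i) * colSums(j) / total
      val diff = table(i)(j) - expected
      diff * diff / expected
    }
    terms.sum
  }

  def main(args: Array[String]): Unit = {
    // Same inputs as ChiSqLearning, with matrices written row by row
    println(goodnessOfFit(Array(1.0, 2.0, 3.0, 4.0, 5.0)))
    println(independence(Array(Array(1.0, 2.0), Array(3.0, 4.0), Array(5.0, 6.0))))
    println(independence(Array(Array(19.0, 24.0), Array(34.0, 10.0))))
    println(independence(Array(Array(26.0, 7.0), Array(36.0, 2.0))))
  }
}
```

The four printed values should agree with the `statistic` lines in the results above, up to floating-point rounding. For the vector case the arithmetic is short enough to check directly: the expected count is 15 / 5 = 3 per category, so the statistic is (4 + 1 + 0 + 1 + 4) / 3 = 10 / 3, and the degrees of freedom are 5 - 1 = 4.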