Quantile Normalization

Posted by mengjieli

How quantile normalization works:

A quick illustration of such normalizing on a very small dataset:

Arrays 1 to 3, genes A to D

A    5    4    3
B    2    1    4
C    3    4    6
D    4    2    8

For each column, rank the values from lowest to highest and assign the numbers i-iv. Tied values share a rank here (the two 4s in the second column both get iii):

A    iv    iii   i
B    i     i     ii
C    ii    iii   iii
D    iii   ii    iv
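
As a minimal sketch, the ranking step can be reproduced in base R. The variable names `x` and `ranks` are illustrative, and `ties.method = "min"` is an assumption chosen because it matches the table above, where both 4s in the second array receive rank iii.

```r
# Example data: rows are genes A-D, columns are arrays 1-3.
# matrix() fills column by column, so each line below is one array.
x <- matrix(c(5, 2, 3, 4,
              4, 1, 4, 2,
              3, 4, 6, 8),
            nrow = 4,
            dimnames = list(c("A", "B", "C", "D"), NULL))

# Rank every column separately. ties.method = "min" gives tied values
# the lowest rank of the tied group.
ranks <- apply(x, 2, rank, ties.method = "min")
print(ranks)
```

Each column of `ranks` corresponds to one column of the i-iv table, with i = 1 through iv = 4.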

These rank values are set aside to use later. Go back to the first set of data and rearrange the values in each column so that they run from lowest to highest. (The first column consists of 5,2,3,4 and is rearranged to 2,3,4,5. The second column, 4,1,4,2, is rearranged to 1,2,4,4. The third column, 3,4,6,8, stays the same because it is already in order from lowest to highest.) The result is:

A    5    4    3    becomes A 2 1 3
B    2    1    4    becomes B 3 2 4
C    3    4    6    becomes C 4 4 6
D    4    2    8    becomes D 5 4 8

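Now find the mean of each row of the sorted data. The mean of the k-th row is the value assigned to rank k (the row labels here no longer refer to the original genes):

A (2 + 1 + 3)/3 = 2.00 = rank i
B (3 + 2 + 4)/3 = 3.00 = rank ii
C (4 + 4 + 6)/3 = 4.67 = rank iii
D (5 + 4 + 8)/3 = 5.67 = rank iv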
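
The sorting and averaging steps can be sketched in base R as follows. The names `sorted` and `rank_means` are illustrative and not part of any package.

```r
# Example data: one array per column (matrix() fills column by column).
x <- matrix(c(5, 2, 3, 4,
              4, 1, 4, 2,
              3, 4, 6, 8),
            nrow = 4)

# Sort each column independently from lowest to highest.
sorted <- apply(x, 2, sort)

# The mean of row k of the sorted matrix is the value for rank k.
rank_means <- rowMeans(sorted)
print(round(rank_means, 2))  # 2.00 3.00 4.67 5.67
```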

Now take the rank table that was set aside earlier and replace each rank with the corresponding mean value:

A    iv    iii   i
B    i     i     ii
C    ii    iii   iii
D    iii   ii    iv

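becomes:

A    5.67    4.67    2.00
B    2.00    2.00    3.00
C    3.00    4.67    4.67
D    4.67    3.00    5.67

Note that the second column contains 4.67 twice and no 5.67, because its two tied values were both given rank iii. A common refinement is to assign tied values the average of the means for the ranks they span, here (4.67 + 5.67)/2 = 5.17.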
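
Putting the steps together, a minimal base R sketch of the whole procedure might look like this. `quantile_normalize` is a hypothetical helper written for this illustration, not a function from any package. It assumes a numeric matrix with at least two rows and no missing values, and it gives tied values the lowest rank of their group so that it reproduces the table above.

```r
# Hypothetical helper: quantile-normalize the columns of a numeric matrix.
# Assumes no NA values and at least two rows.
quantile_normalize <- function(x) {
  # Reference value for each rank: row means of the column-sorted matrix.
  rank_means <- rowMeans(apply(x, 2, sort))
  # Rank of every value within its own column (ties take the lowest rank).
  ranks <- apply(x, 2, rank, ties.method = "min")
  # Replace each rank by the reference value for that rank.
  out <- apply(ranks, 2, function(r) rank_means[r])
  dimnames(out) <- dimnames(x)
  out
}

x <- matrix(c(5, 2, 3, 4,
              4, 1, 4, 2,
              3, 4, 6, 8),
            nrow = 4,
            dimnames = list(c("A", "B", "C", "D"), NULL))

res <- quantile_normalize(x)
print(round(res, 2))
```

Library implementations may treat ties differently, so their output on tied data can differ from this sketch.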


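R implementation: the method is designed for microarray data and requires each column to be one array and each row to be one probe.
Several R packages handle quantile normalization: 1: affy 2: preprocessCore. Of these, normalize.quantiles in preprocessCore is very convenient to use: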
> a<-matrix(1:6,3,2)
> a
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
> library(preprocessCore)
> b=normalize.quantiles(a)
> b
     [,1] [,2]
[1,]  2.5  2.5
[2,]  3.5  3.5
[3,]  4.5  4.5
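
As a cross-check, the same numbers can be obtained in base R without preprocessCore. This is only a sketch for tie-free input like the matrix above, and `rank_means` and `b_manual` are illustrative names.

```r
a <- matrix(1:6, 3, 2)

# Row means of the column-sorted matrix: (1+4)/2, (2+5)/2, (3+6)/2.
rank_means <- rowMeans(apply(a, 2, sort))

# Map every value to the mean for its rank within its column.
# With no ties, rank() returns whole numbers that can be used as indices.
b_manual <- apply(a, 2, function(col) rank_means[rank(col)])
print(b_manual)
```

Both columns of `a` are already sorted, so every column of the result is simply 2.5, 3.5, 4.5, in agreement with the output of normalize.quantiles shown above.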
