Selecting the most highly correlated pairs from a correlation matrix in SAS
I have a dataset like this:
data have;
do i = 1 to 1000;
y = ranuni(0);
x1 = y ** 2;
x2 = x1 ** 3;
x3 = x2 - x1/2;
output;
end;
run;
I built a correlation matrix like this:
proc corr
data = have
out = correlation_matrix
(where = (_TYPE_ = "CORR"))
noprint;
run;
Below I'm thinking out loud in code to sketch what I'm after. The syntax and logic are not correct; it only describes what I'm looking for:
proc sort
data = correlation_matrix
by _NAME_;
run;
data _temp;
set correlation_matrix;
array col[*] _numeric_;
by _NAME_;
do i = 1 to dim(col);
if col(i) > 0.6 then do;
%let list = append(vname(col));
end;
run;
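From the correlation matrix, I'm looking for a way to return the pairs whose correlation is 60% or higher (or above some other threshold). I would then use those pairs to build a scatter plot / histogram matrix like this: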
proc contents
data = high_correlation_pairs
out = contents
noprint;
run;
proc sort
data = contents
nodupkey;
by name;
run;
proc sql noprint;
select name INTO: highly_correlated_pairs
separated by " "
from contents
;
quit;
ODS GRAPHICS /
IMAGEMAP=OFF;
OPTIONS VALIDVARNAME=ANY;
PROC SGSCATTER
DATA=have;
TITLE "Scatter Plot Matrix";
FOOTNOTE;
MATRIX &highly_correlated_pairs
/
DIAGONAL=(HISTOGRAM )
START=TOPLEFT
NOLEGEND
;
RUN;
TITLE; FOOTNOTE;
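I'm just not sure how to select, from the matrix, the variables that belong to a pair with more than 60% correlation, or even how to return, by _NAME_, the columns whose correlation exceeds 60%.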
Answer
You can get the pairs like this. The key is the vname function, which returns the name of an array element:
data high_corrs;
set correlation_matrix;
array coefs i--x3;
length var1 var2 $32;
do j = 1 to dim(coefs);
corr = coefs(j);
if _n_ < j and corr > 0.6 then do;
var1 = vname(coefs(_n_));
var2 = vname(coefs(j));
output;
end;
end;
keep var1 var2 corr;
run;
Maybe you can work out the rest from there?
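One possible way to finish from there, offered as a sketch rather than a tested solution: collect every variable that appears in at least one row of high_corrs into a macro variable, then pass it to the MATRIX statement from the question. The macro variable name highly_correlated_vars is an assumption introduced here for illustration. If high_corrs is empty, the macro variable is never created, so that case would need its own check.

```sas
/* Sketch: gather each variable that takes part in a high-correlation pair. */
/* highly_correlated_vars is a hypothetical name chosen for this example.   */
proc sql noprint;
  select distinct name
    into :highly_correlated_vars separated by ' '
    from (select var1 as name from high_corrs
          union
          select var2 as name from high_corrs);
quit;

%put NOTE: &=highly_correlated_vars;

/* Reuse the scatter plot matrix from the question with the derived list. */
proc sgscatter data=have;
  matrix &highly_correlated_vars / diagonal=(histogram) start=topleft;
run;
```

Note that the data step above keeps only positive correlations (corr > 0.6). Comparing abs(corr) instead would also pick up strong negative correlations.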
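Another answer
Edit: full answer included.
PROC TRANSPOSE is used to turn the correlation matrix into x,y pairs and to subset them to the correlations of interest. A macro variable is then created for use in PROC SGSCATTER.
NOTE: PLOTREQUESTS=x1*x2 x1*y x2*x3 x2*y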
data have;
do i = 1 to 1000;
y = ranuni(0);
x1 = y ** 2;
x2 = x1 ** 3;
x3 = x2 - x1/2;
output;
end;
run;
proc corr data=have out=corr noprint;
run;
proc transpose name=with data=corr out=pair(where=(.6 le abs(col1) lt 1));
where _type_ eq 'CORR';
by _name_ notsorted;
run;
data pairV / view=pairv;
set pair;
call sortc(_name_,with);
run;
proc sort data=pairv out=pair2 nodupkey;
by _name_ with;
run;
proc sql noprint;
select catx('*',_name_,with) into :plotrequests separated by ' ' from pair2;
quit;
%put NOTE: &=plotrequests;
proc sgscatter data=have;
plot &plotrequests;
run;
quit;
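The de-duplication step in the code above relies on CALL SORTC to put each pair into a canonical order, so that (y, x1) and (x1, y) end up as the same key before PROC SORT with NODUPKEY runs. A minimal sketch of that behaviour on its own, using hypothetical variables a and b (CALL SORTC expects its character arguments to have the same length, hence the LENGTH statement):

```sas
/* Sketch: CALL SORTC reorders the values of its arguments in ascending order. */
/* a and b are hypothetical names used only for this illustration.             */
data _null_;
  length a b $32;
  a = 'y';
  b = 'x1';
  call sortc(a, b);   /* values are swapped so that a <= b */
  put a= b=;
run;
```

After the call, the smaller value ('x1') should be in a and the larger ('y') in b, which is why each unordered pair is left with a single representative.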