在 R 中存储模拟结果
Posted
技术标签:
【中文标题】在 R 中存储模拟结果【英文标题】:Storing simulation results in R 【发布时间】:2018-07-11 20:33:30 【问题描述】:我想估计 Mantel-Haenszel 差分项目功能 (DIF) 优势比和 HMDDIF 指数。我写了下面的函数。在我看来,我在存储结果时犯了一个错误。你能看看这个并给我反馈吗? 这是示例数据:
# generate dataset
r <- 1000
c <- 16
test <- matrix(rbinom(r*c,1,0.5),r,c)
# create sum scores for each student using first 15 columns
test <- cbind(test, apply(test[,1:15],1,sum))
colnames(test) <- c("v1","v2","v3","v4","v5","v6","v7","v8","v9","v10","v11","v12","v13","v14","v15","group","score")
test <- as.data.frame(test)
前 15 列是学生对项目/问题的正确/错误回答。组成员列是第 16 列。学生“分数”变量是最后(第 17 列)项目分数的总和。该公式可以在我从*** (https://en.wikipedia.org/wiki/Differential_item_functioning) 获得的图片中找到。
对于每个分数类别,我想估计这张图片中的最后两个公式。行是 10 个学生,列是六个项目/问题。同样,第 16 列是组成员(1-focal,0-reference) 这是我的功能代码。
library(dplyr)
# this function first starts with the first item and loop k scores from 1-15. Then move to the second item.
# data should only contain the items, grouping variable, and person score.
Mantel.Haenszel <- function (data)
# browser() #runs with debug
for (item in 1:15) #item loop not grouping/scoring
item.incorrect <- data[,item] == 0
item.correct <- data[,item] == 1
Results <- c()
for (k in 1:15) # for k scores
Ak <- nrow(filter(data, score == k, group == 0, item.correct)) # freq of ref group & correct
Bk <- nrow(filter(data, score == k, group == 0, item.incorrect)) # freq of ref group & incorrect
Ck <- nrow(filter(data, score == k, group == 1, item.correct)) # freq of foc group & correct
Dk <- nrow(filter(data, score == k, group == 1, item.incorrect)) # freq of foc group & incorrect
nrk <- nrow(filter(data, score == k, group == 0)) #sample size for ref
nfk <- nrow(filter(data, score == k, group == 1)) #sample size for focal
if (Bk == 0 | Ck == 0)
next
nominator <-sum((Ak*Dk)/(nrk + nfk))
denominator <-sum((Bk*Ck)/(nrk + nfk))
odds.ratio <- nominator/denominator
if (odds.ratio == 0)
next
MH.D.DIF <- (-2.35)*log(odds.ratio) #index
# save the output
out <- list("Odds Ratio" = odds.ratio, "MH Diff" = MH.D.DIF)
results <- rbind(Results, out)
return(results)
# close score loop
# close item loop
#close function
这是我得到的
# test funnction
Mantel.Haenszel(test)
> Mantel.Haenszel(test)
Odds Ratio MH Diff
out 0.2678571 3.095659
我想要得到的是
> Mantel.Haenszel(test)
Odds Ratio MH Diff
out 0.2678571 3.095659
## ##
.. ..
(15 rows here for 15 score categories in the dataset)
【问题讨论】:
【参考方案1】:您是否应该期望item
和k
的每个组合都有一个结果,最大输出行数为225,除非有break
的任何实例?如果是这样,我认为你只需要改变一些小事情。首先,只在函数开头声明一次Results
。然后,确保您是 rbind
-ing 并返回 Results
或 results, but not both. Then, move your
return 到您的实际函数级别而不是循环。在下面的示例中,我还包含了当前的 item
和 k
用于演示:
Mantel.Haenszel <- function (data)
# browser() #runs with debug
Results <- c()
for (item in 1:15)
#item loop not grouping/scoring
item.incorrect <- data[, item] == 0
item.correct <- data[, item] == 1
for (k in 1:15)
# for k scores
Ak <-
nrow(filter(data, score == k, group == 0, item.correct)) # freq of ref group & correct
Bk <-
nrow(filter(data, score == k, group == 0, item.incorrect)) # freq of ref group & incorrect
Ck <-
nrow(filter(data, score == k, group == 1, item.correct)) # freq of foc group & correct
Dk <-
nrow(filter(data, score == k, group == 1, item.incorrect)) # freq of foc group & incorrect
nrk <-
nrow(filter(data, score == k, group == 0)) #sample size for ref
nfk <-
nrow(filter(data, score == k, group == 1)) #sample size for focal
if (Bk == 0 | Ck == 0)
next
nominator <- sum((Ak * Dk) / (nrk + nfk))
denominator <- sum((Bk * Ck) / (nrk + nfk))
odds.ratio <- nominator / denominator
if (odds.ratio == 0)
next
MH.D.DIF <- (-2.35) * log(odds.ratio) #index
# save the output
out <-
list(
item = item,
k = k,
"Odds Ratio" = odds.ratio,
"MH Diff" = MH.D.DIF
)
Results <- rbind(Results, out)
# close score loop
# close item loop
return(Results)
#close function
test.output <- Mantel.Haenszel(test)
给出如下输出:
> head(test.output, 20)
item k Odds Ratio MH Diff
out 1 3 2 -1.628896
out 1 4 4.666667 -3.620046
out 1 5 0.757085 0.6539573
out 1 6 0.5823986 1.27041
out 1 7 0.9893293 0.02521097
out 1 8 1.078934 -0.1785381
out 1 9 1.006237 -0.01461145
out 1 10 1.497976 -0.9496695
out 1 11 1.435897 -0.8502066
out 1 12 1.5 -0.952843
out 2 3 0.8333333 0.4284557
out 2 4 2.424242 -2.08097
out 2 5 1.368664 -0.7375117
out 2 6 1.222222 -0.4715761
out 2 7 0.6288871 1.089938
out 2 8 1.219512 -0.4663597
out 2 9 1 0
out 2 10 2.307692 -1.965183
out 2 11 0.6666667 0.952843
out 2 12 0.375 2.304949
这就是你要找的吗?
【讨论】:
嗨卢克,你的修改解决了这个问题。感谢您的宝贵时间!以上是关于在 R 中存储模拟结果的主要内容,如果未能解决你的问题,请参考以下文章
在 Azure 持久函数上调用 CreateCheckStatusResponse 时,Azurite 未给出与 Azure 存储模拟器相同的结果