Using the approx function within tapply or by in R
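I have temperature profiler (tp) data with date, depth and temperature. The depths are not exactly the same for each date, so I need to unify them to the same depths and set the temperature at each of those depths by linear approximation. I can do this in a loop with the approx function (see the first part of the code below). But I know I should do better and avoid the loop, considering that I will have about 600,000 rows. I tried to do it with the by function, but did not manage to convert the result (a list) into a data frame or matrix (see the second part of the code). Keep in mind that the number of rounded depths is not always the same as in the example. The rounded depths are in the Depth2 column and the interpolated temperatures go into Temp2. What is the "right" way to solve this?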
# create df manually
tp <- data.frame(Date=double(31), Depth=double(31), Temperature=double(31))
tp$Date[1:11] <- '2009-12-17' ; tp$Date[12:22] <- '2009-12-18'; tp$Date[23:31] <- '2009-12-19'
tp$Depth <- c(24.92,25.50,25.88,26.33,26.92,27.41,27.93,28.37,28.82,29.38,29.92,25.07,25.56,26.06,26.54,27.04,27.53,28.03,28.52,29.02,29.50,30.01,25.05,25.55,26.04,26.53,27.02,27.52,28.01,28.53,29.01)
tp$Temperature <- c(19.08,19.06,19.06,18.87,18.67,17.27,16.53,16.43,16.30,16.26,16.22,17.62,17.43,17.11,16.72,16.38,16.28,16.20,16.15,16.13,16.11,16.08,17.54,17.43,17.32,17.14,16.89,16.53,16.28,16.20,16.13)
# create rounded depth column
tp$Depth2 <- round(tp$Depth)
# loop on date to calculate linear approximation for rounded depth
dtgrp <- tp[!duplicated(tp[,1]),1]
for (i in dtgrp) {
  x1 <- tp[tp$Date == i, "Depth"]
  y1 <- tp[tp$Date == i, "Temperature"]
  x2 <- tp[tp$Date == i, "Depth2"]
  tpa <- approx(x=x1,y=y1,xout=x2, rule=2)
  tp[tp$Date == i, "Temp2"] <- tpa$y
}
# reduce result to rounded depth
tp1 <- tp[!duplicated(tp[,-c(2:3)]),-c(2:3)]
# not part of the question, but the end need is for a matrix, so this complete it:
library(reshape2)
tpbydt <- acast(tp1, Date~Depth2, value.var="Temp2")
# second part: I tried to use the by function (instead of the loop) but got lost when trying to convert the result to a data frame or matrix
rdpth <- function(x1,y1,x2) {
  tpa <- approx(x=x1,y=y1,xout=x2, rule=2)
  return(tpa)
}
tp2 <- by(tp, tp$Date,function(tp) rdpth(tp$Depth,tp$Temperature,tp$Depth2), simplify = TRUE)
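Answer
You were very close with the by call, but remember that it returns a list of objects. So consider building a list of data frames and row-binding them at the very end: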
df_list <- by(tp, tp$Date, function(sub) {
  # interpolate temperature at the rounded depths of this date
  tpa <- approx(x=sub$Depth, y=sub$Temperature, xout=sub$Depth2, rule=2)
  # one row per rounded depth
  df <- unique(data.frame(Date = sub$Date,
                          Depth2 = sub$Depth2,
                          Temp2 = tpa$y,
                          stringsAsFactors = FALSE))
  return(df)
})
tp2 <- do.call(rbind, unname(df_list))
tp2
# Date Depth2 Temp2
# 1 2009-12-17 25 19.07724
# 2 2009-12-17 26 19.00933
# 5 2009-12-17 27 18.44143
# 7 2009-12-17 28 16.51409
# 9 2009-12-17 29 16.28714
# 11 2009-12-17 30 16.22000
# 12 2009-12-18 25 17.62000
# 21 2009-12-18 26 17.14840
# 4 2009-12-18 27 16.40720
# 6 2009-12-18 28 16.20480
# 8 2009-12-18 29 16.13080
# 10 2009-12-18 30 16.08059
# 13 2009-12-19 25 17.54000
# 22 2009-12-19 26 17.32898
# 41 2009-12-19 27 16.90020
# 61 2009-12-19 28 16.28510
# 81 2009-12-19 29 16.13146
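A note on rule = 2 in the approx call: it explains why 2009-12-18 gets a value at depth 25 even though its shallowest reading is at 25.07. The small sketch below uses only the first two readings of that date from the question's data; the variable names depth, temp, inside_only and clamped are made up for illustration.

```r
# First two readings of 2009-12-18, copied from the question's data
depth <- c(25.07, 25.56)
temp  <- c(17.62, 17.43)

# rule = 1 (the default): points outside the observed depth range give NA
inside_only <- approx(x = depth, y = temp, xout = c(25, 25.3), rule = 1)$y

# rule = 2: points outside the range take the value at the nearest data extreme
clamped <- approx(x = depth, y = temp, xout = c(25, 25.3), rule = 2)$y

inside_only[1]  # NA
clamped[1]      # 17.62
```

Inside the observed range (here at 25.3) both rules return the same linearly interpolated value, so rule = 2 only changes what happens at the edges.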
And if you reset the row.names, the result is exactly identical to the tp1 output:
identical(data.frame(tp1, row.names = NULL),
data.frame(tp2, row.names = NULL))
# [1] TRUE
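Since the question mentions that the end need is a Date by Depth2 matrix, here is a possible base R sketch that gets there without reshape2. It is not part of the original answer: it swaps by for split plus lapply, then pivots with tapply. The helper interp_one and the names tp_long and tp_mat are hypothetical, and the data is rebuilt from the question so the block runs on its own.

```r
# Rebuild the question's data so this block is self-contained
tp <- data.frame(
  Date = rep(c("2009-12-17", "2009-12-18", "2009-12-19"), times = c(11, 11, 9)),
  Depth = c(24.92,25.50,25.88,26.33,26.92,27.41,27.93,28.37,28.82,29.38,29.92,
            25.07,25.56,26.06,26.54,27.04,27.53,28.03,28.52,29.02,29.50,30.01,
            25.05,25.55,26.04,26.53,27.02,27.52,28.01,28.53,29.01),
  Temperature = c(19.08,19.06,19.06,18.87,18.67,17.27,16.53,16.43,16.30,16.26,16.22,
                  17.62,17.43,17.11,16.72,16.38,16.28,16.20,16.15,16.13,16.11,16.08,
                  17.54,17.43,17.32,17.14,16.89,16.53,16.28,16.20,16.13),
  stringsAsFactors = FALSE
)
tp$Depth2 <- round(tp$Depth)

# Hypothetical helper: interpolate one date at its unique rounded depths
interp_one <- function(sub) {
  depths <- unique(sub$Depth2)
  data.frame(Date = sub$Date[1],
             Depth2 = depths,
             Temp2 = approx(x = sub$Depth, y = sub$Temperature,
                            xout = depths, rule = 2)$y,
             stringsAsFactors = FALSE)
}

# One data frame per date, row-bound into a long table
tp_long <- do.call(rbind, lapply(split(tp, tp$Date), interp_one))

# Pivot to a Date x Depth2 matrix; each cell holds a single value, so mean()
# just returns it, and combinations that never occur are left as NA
tp_mat <- tapply(tp_long$Temp2, list(tp_long$Date, tp_long$Depth2), mean)
```

In this data 2009-12-19 has no readings that round to 30, so that cell of tp_mat is NA. Whether this scales better than by on roughly 600,000 rows is untested here and would need to be measured.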