Model creating 1 model for each row
Posted: 2022-01-21 02:52:39
Question:
I have a time series that I want to build a regression model from. The time series looks like this:
Date Value PREDICTOR1 PREDICTOR2 PREDICTOR3 PREDICTOR4 PREDICTOR5 PREDICTOR6 PREDICTOR7 PREDICTOR8 PREDICTOR9 PREDICTOR10 PREDICTOR11 PREDICTOR12
<date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2021-09-02 74 0.1 3.7 3.8 0.6 1.5 63.2 2.6 -51900 1.6
2 2021-09-03 74.4 0.1 3.7 3.8 0.6 1.5 63.2 2.6 -51900 1.6
3 2021-09-07 73.9 0.1 3.7 3.8 0.6 1.5 63.2 2.6 -51900 1.6
4 2021-09-08 73.7 0.1 3.7 3.8 0.6 1.5 63.2 2.6 -51900 1.6
5 2021-09-09 73.8 0.1 3.7 3.8 0.6 1.5 63.2 2.6 -51900 1.6
6 2021-09-10 73.7 0.1 3.7 3.8 0.6 1.5 63.2 2.6 -51900 1.6
From it I fitted a model:
fit <- df %>%
  model(
    tslm = TSLM(Value ~ PREDICTOR1 + PREDICTOR2 + PREDICTOR3 + PREDICTOR4 +
                  PREDICTOR5 + PREDICTOR6 + PREDICTOR7 + PREDICTOR8 +
                  PREDICTOR9 + PREDICTOR10 + PREDICTOR11 + PREDICTOR12)
  )
But this is what report() gives me:
> report(fit)
# A tibble: 3,409 x 16
id .model r_squared adj_r_squared sigma2 statistic p_value df log_lik AIC AICc BIC CV deviance df.residual rank
<int> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int>
1 1 tslm NaN NaN NaN NaN NaN 1 Inf -Inf -Inf -Inf NaN 0 0 1
2 2 tslm NaN NaN NaN NaN NaN 1 Inf -Inf -Inf -Inf NaN 0 0 1
3 3 tslm NaN NaN NaN NaN NaN 1 Inf -Inf -Inf -Inf NaN 0 0 1
4 4 tslm NaN NaN NaN NaN NaN 1 Inf -Inf -Inf -Inf NaN 0 0 1
5 5 tslm NaN NaN NaN NaN NaN 1 Inf -Inf -Inf -Inf NaN 0 0 1
6 6 tslm NaN NaN NaN NaN NaN 1 Inf -Inf -Inf -Inf NaN 0 0 1
7 7 tslm NaN NaN NaN NaN NaN 1 Inf -Inf -Inf -Inf NaN 0 0 1
8 8 tslm NaN NaN NaN NaN NaN 1 Inf -Inf -Inf -Inf NaN 0 0 1
9 9 tslm NaN NaN NaN NaN NaN 1 Inf -Inf -Inf -Inf NaN 0 0 1
10 10 tslm NaN NaN NaN NaN NaN 1 Inf -Inf -Inf -Inf NaN 0 0 1
So it created one model for every row of df (more than 3,000 rows), and none of them are usable.
Does anyone have a hint?
P.S. This is my first time doing this.
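Comments:
It makes no sense to fit a linear model to each row of a time series, so that each model has only one observation. You need more observations to estimate coefficients and make predictions. Sometimes, when you group a dataset and end up with nested tibbles or data frames, this can happen. But I don't think that is the case here.
Yes, I don't know why one model is being fitted per row. That is not my intention. What am I doing wrong?
You are right @AnoushiravanR, the dataset is full of groups!!!
But the problem persists after ungrouping.

Answer 1:
I don't know why this isn't working. It must be something in your data or your R setup. Here is an example of how it should work: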
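library(tsibble)  # as_tsibble()
library(fable)    # model(), TSLM(), report()

# Build a Date column from airquality's Month and Day, then convert to a
# tsibble with Date as the index and no key (a single series).
aq <- cbind(Date = as.Date(paste('2021', airquality$Month, airquality$Day, sep = '-')), airquality) |>
  as_tsibble(index = Date)
fit <- aq |> model(tslm = TSLM(Ozone ~ Solar.R + Wind + Temp))
report(fit)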
Series: Ozone
Model: TSLM
Residuals:
Min 1Q Median 3Q Max
-40.485 -14.219 -3.551 10.097 95.619
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -64.34208 23.05472 -2.791 0.00623 **
Solar.R 0.05982 0.02319 2.580 0.01124 *
Wind -3.33359 0.65441 -5.094 1.52e-06 ***
Temp 1.65209 0.25353 6.516 2.42e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 21.18 on 107 degrees of freedom
Multiple R-squared: 0.6059, Adjusted R-squared: 0.5948
F-statistic: 54.83 on 3 and 107 DF, p-value: < 2.22e-16
Comments:
You are right, something is wrong. Even other models such as ARIMA still give 1 model per row.
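
One thing that may be worth checking, going by the `id` column in the report() output in the question: model() fits one model per key of a tsibble, so if df was built with a per-row identifier as its key, every row is its own one-observation series. That would also be consistent with ungrouping not helping, since a tsibble key is separate from dplyr groups. The sketch below uses made-up data (`toy`, `x1` and the values are hypothetical, and whether the asker's df is really keyed by `id` is an assumption) to show the difference:

```r
library(tsibble)
library(fable)

# Hypothetical stand-in for the asker's df: one row per date plus a row id.
set.seed(1)
toy <- data.frame(
  id    = 1:30,
  Date  = as.Date("2021-09-01") + 0:29,
  Value = 70 + cumsum(rnorm(30)),
  x1    = rnorm(30)
)

# Keyed by `id`: each row is treated as a separate series.
keyed <- as_tsibble(toy, key = id, index = Date)
n_keys(keyed)                    # number of series = number of rows

# ungroup() drops dplyr groups but leaves the tsibble key in place.
n_keys(dplyr::ungroup(keyed))

# Rebuild without a key: the whole table is one series.
single <- as_tsibble(tibble::as_tibble(keyed), index = Date)
n_keys(single)

# A single series gives a single model.
fit <- model(single, tslm = TSLM(Value ~ x1))
nrow(fit)
```

If key_vars(df) returns "id" (or any column that is unique per row), rebuilding the tsibble with only index = Date, as in `single` above, is one way to get a single model.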