How to use Monte Carlo for ARIMA Simulation Function in R

Posted 2023-02-15

Asked 2021-01-21 03:06:58

Question:

Here is the algorithm I want to implement in R:

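1. Simulate a time series from an ARIMA model with arima.sim().
2. Split the series into blocks of each possible size: 2s, 3s, 4s, 5s, 6s, 7s, 8s and 9s.
3. For each block size, resample the blocks with replacement to form a new series, and fit the best ARIMA model to it with auto.arima().
4. Compute the RMSE for each block size.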

The R function below does the job.

## Load packages and prepare multicore process
library(forecast)
library(future.apply)
plan(multisession)
library(parallel)
library(foreach)
library(doParallel)
n_cores <- detectCores()
cl <- makeCluster(n_cores)
registerDoParallel(cores = detectCores())
## simulate ARIMA(1,0, 0)
#n=10; phi <- 0.6; order <- c(1, 0, 0)
bootstrap1 <- function(n, phi) {
  ts <- arima.sim(n, model = list(ar=phi, order = c(1, 0, 0)), sd = 1)
  ########################################################
  ## create a vector of block sizes
  t <- length(ts)    # the length of the time series
  lb <- seq(n-2)+1   # vector of block sizes l with 1 < l < n (i.e. strictly between 1 and n)
  ########################################################
  ## This section creates a matrix to store the RMSE for each block size
  BOOTSTRAP <- matrix(nrow = 1, ncol = length(lb))
  colnames(BOOTSTRAP) <-lb
  ########################################################
  ## This section uses foreach to run the steps in the braces once per block size
  BOOTSTRAP <- foreach(b = 1:length(lb), .combine = 'cbind') %do% {
    l <- lb[b]                                          # block size at each instance
    m <- ceiling(t / l)                                 # number of blocks
    blk <- split(ts, rep(1:m, each=l, length.out = t))  # divides the series into blocks
    ######################################################
    res<-sample(blk, replace=T, 10)        # resamples the blocks
    res.unlist <- unlist(res, use.names = FALSE)   # unlist the bootstrap series
    train <- head(res.unlist, round(length(res.unlist) - 10)) # Train set
    test <- tail(res.unlist, length(res.unlist) - length(train)) # Test set
    nfuture <- forecast::forecast(train, model = forecast::auto.arima(train), lambda=0, biasadj=TRUE, h = length(test))$mean        # makes the forecast of the test set
    RMSE <- Metrics::rmse(test, nfuture)      # RETURN RMSE
    BOOTSTRAP[b] <- RMSE
  }
  
  BOOTSTRAPS <- matrix(BOOTSTRAP, nrow = 1, ncol = length(lb))
  colnames(BOOTSTRAPS) <- lb
  BOOTSTRAPS
  return(list(BOOTSTRAPS))
}

Calling the function

bootstrap1(10, 0.6)

I get the following result:

##              2        3         4        5        6        7         8         9
##  [1,] 0.8920703 0.703974 0.6990448 0.714255 1.308236 0.809914 0.5315476 0.8175382

I want to repeat steps 1 to 4 above in sequence, which made me think of the Monte Carlo technique in R. So I loaded its package and ran the following:

param_list=list("n"=10, "phi"=0.6)
library(MonteCarlo)
MC_result<-MonteCarlo(func = bootstrap1, nrep=3, param_list = param_list)

I expected to get a result like this, in matrix form:

##           [,2]     [,3]      [,4]    [,5]       [,6]      [,7]      [,8]      [,9]
##  [1,] 0.8920703 0.703974  0.6990448 0.714255  1.308236  0.809914  0.5315476 0.8175382
##  [2,] 0.8909836 0.8457537 1.095148  0.8918468 0.8913282 0.7894167 0.8911484 0.8694729
##  [3,] 1.586785  1.224003  1.375026  1.292847  1.437359  1.418744  1.550254  1.30784

But I get the following error message:

Error in MonteCarlo(func = bootstrap1, nrep = 3, param_list = param_list) : func has to return a list with named components. Each component has to be scalar.

How can I obtain the expected result above and make it reproducible?

Answer 1:

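You get this error message because MonteCarlo expects bootstrap1() to accept one parameter combination for the simulation and to return only one value (RMSE) per replication. That is not the case here: the block lengths (lb) are determined inside bootstrap1 from the length of the simulated time series (n), so each call returns results for several block lengths rather than a single value.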
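To make that contract concrete, here is a minimal base-R sketch of the idea: a function that takes one parameter combination and returns a single named scalar, called by hand over a grid of combinations. It is not how the MonteCarlo package is implemented. The function `toy_sim`, its statistic, and the use of `rnorm()` in place of an ARIMA simulation are assumptions made purely for illustration.

```r
# Hypothetical stand-in for bootstrap1(): one parameter combination in,
# one named scalar out. Names and the statistic are illustrative only.
toy_sim <- function(n, lb) {
  x <- rnorm(n)                                   # placeholder for the simulated series
  m <- ceiling(n / lb)                            # number of blocks for this block size
  blk <- split(x, rep(seq_len(m), each = lb, length.out = n))
  list(stat = mean(sapply(blk, mean)))            # a single named scalar
}

set.seed(1)
grid <- expand.grid(n = 10, lb = 2:9)             # every parameter combination
reps <- 3                                         # replications per combination

# One column per block size, one row per replication
res <- sapply(seq_len(nrow(grid)), function(i) {
  replicate(reps, toy_sim(grid$n[i], grid$lb[i])$stat)
})
colnames(res) <- grid$lb
res
```

Because the block size is an argument rather than something computed inside the function, each call yields exactly one number, and the grid loop assembles the replications-by-block-size layout.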

One solution is to pass the block length as a parameter and rewrite bootstrap1() accordingly:

library(MonteCarlo)
library(forecast)
library(Metrics)

# parameter grids
n <- 10 # length of time series
lb <- seq(n-2) + 1 # vector of block sizes
phi <- 0.6 # autoregressive parameter
reps <- 3 # monte carlo replications

# simulation function  
bootstrap1 <- function(n, lb, phi) {
    
    #### simulate ####
    ts <- arima.sim(n, model = list(ar = phi, order = c(1, 0, 0)), sd = 1)
    
    #### divide ####
    m <- ceiling(n / lb) # number of blocks
    blk <- split(ts, rep(1:m, each = lb, length.out = n)) # divide into blocks
    #### resample ####
    res <- sample(blk, replace = TRUE, 10)        # resamples the blocks
    res.unlist <- unlist(res, use.names = FALSE)   # unlist the bootstrap series
    #### train, forecast ####
    train <- head(res.unlist, round(length(res.unlist) - 10)) # train set
    test <- tail(res.unlist, length(res.unlist) - length(train)) # test set
    nfuture <- forecast(train, # forecast
                        model = auto.arima(train), 
                        lambda = 0, biasadj = TRUE, h = length(test))$mean    
    #### metric ####
    RMSE <- rmse(test, nfuture) # return RMSE
    return(
      list("RMSE" = RMSE)
    )
}


param_list = list("n" = n, "lb" = lb, "phi" = phi)
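The divide and resample steps can be checked on their own with a toy series. The sketch below uses the integers 1 to 10 in place of a simulated series and a single block size of 3, both chosen only for illustration.

```r
x  <- 1:10                                  # toy series standing in for the simulated one
n  <- length(x)
lb <- 3                                     # one block size (illustrative)

m   <- ceiling(n / lb)                      # 4 blocks
blk <- split(x, rep(seq_len(m), each = lb, length.out = n))
lengths(blk)                                # block sizes 3, 3, 3, 1

set.seed(42)
# Draw 10 blocks with replacement and glue them into one bootstrap series
res <- unlist(sample(blk, size = 10, replace = TRUE), use.names = FALSE)
length(res)                                 # varies, because the last block is shorter
```

When n is not a multiple of the block size, the last block is shorter than the others, so the length of the resampled series differs from draw to draw.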

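To run the simulation, pass the parameters along with bootstrap1() to MonteCarlo(). To run the simulation in parallel, set the number of cores via ncpus. The MonteCarlo package uses snowfall, so it should work on Windows.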

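Note that I also set raw = T (otherwise the result would be the average over all replications). Setting the seed beforehand makes the results reproducible.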

set.seed(123)
MC_result <- MonteCarlo(func = bootstrap1, 
                        nrep = reps,
                        ncpus = parallel::detectCores() - 1,
                        param_list = param_list,
                        export_also = list(
                         "packages" = c("forecast", "Metrics")
                        ),
                        raw = T)
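The effect of seeding can be checked in isolation with base R. The sketch below covers only a single R session and uses the same arima.sim() settings as the function above.

```r
# Same seed, same call: the two simulated series should match exactly
set.seed(123)
a <- arima.sim(n = 10, model = list(ar = 0.6, order = c(1, 0, 0)), sd = 1)

set.seed(123)
b <- arima.sim(n = 10, model = list(ar = 0.6, order = c(1, 0, 0)), sd = 1)

identical(a, b)
```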

The result is an array. I think it is best to convert it to a data.frame with MakeFrame():

Frame <- MakeFrame(MC_result)

It is easy to obtain a reps x lb matrix, though:

matrix(Frame$RMSE, ncol = length(lb), dimnames = list(1:reps, lb))
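Reshaping with matrix() relies on the row order of the data frame, since matrix() fills column by column. The sketch below uses a hypothetical long-format frame in which the replication index varies fastest. The column names `rep` and `lb` and the placeholder values are assumptions, not the documented layout of the MakeFrame() output. It also shows an order-independent alternative using tapply().

```r
reps <- 3
lb   <- 2:9

# Hypothetical long-format results: expand.grid() makes `rep` vary fastest
frame <- expand.grid(rep = seq_len(reps), lb = lb)
frame$RMSE <- seq_len(nrow(frame)) / 10     # placeholder values, not real RMSEs

# Column-wise fill: each column holds the replications for one block size
m1 <- matrix(frame$RMSE, ncol = length(lb),
             dimnames = list(seq_len(reps), lb))

# Order-independent alternative: index explicitly by replication and block size
m2 <- tapply(frame$RMSE, list(frame$rep, frame$lb), mean)

all.equal(unname(m1), unname(m2))
```

If the rows of the real frame were ordered differently, the matrix() call would silently misplace values, whereas the tapply() version depends only on the index columns.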

Comments:

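I need to invite you to a chat room to discuss something beyond the scope of the question you have answered, but I don't even know how to send an invitation. I searched for your username using this link chat.***.com/users?tab=all&sort=name but couldn't find you. I don't mind if you invite me instead. Please join me at chat.***.com/rooms/222902/…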
