Time Series


The series shows an obvious trend, so it is not a stationary series in the usual sense.
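As a sanity check, the eyeballed non-stationarity can be confirmed with a unit-root test. A minimal sketch using statsmodels' ADF test, assuming the series is loaded as a pandas Series named `output` (the data is tabulated below):

```python
from statsmodels.tsa.stattools import adfuller

# ADF unit-root test: a large p-value means we cannot reject a unit root,
# consistent with the trending, non-stationary look of the series.
stat, pvalue, *rest = adfuller(output)
print(f"ADF statistic = {stat:.3f}, p-value = {pvalue:.3f}")
```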

What puzzled me at first is that the result differs from the book's, and shouldn't the absolute value of the ACF always be at most 1? Where did I go wrong? Then I realized: the implementations compute the sample autocorrelation as

$$\hat\rho_k = \frac{\sum_{t=1}^{n-k}(x_t-\bar x)(x_{t+k}-\bar x)}{\sum_{t=1}^{n}(x_t-\bar x)^2},$$

rather than the adjusted form

$$\hat\rho_k = \frac{\frac{1}{n-k}\sum_{t=1}^{n-k}(x_t-\bar x)(x_{t+k}-\bar x)}{\frac{1}{n}\sum_{t=1}^{n}(x_t-\bar x)^2}.$$
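A quick numerical check of the two estimators (a sketch; `x` is a stand-in series, and as far as I can tell statsmodels' acf() matches the first, bounded form by default):

```python
import numpy as np
from statsmodels.tsa.stattools import acf

def acf_bounded(x, k):
    # Denominator runs over all n terms: Cauchy-Schwarz keeps |rho| <= 1
    x = np.asarray(x, dtype=float) - np.mean(x)
    return np.sum(x[:-k] * x[k:]) / np.sum(x * x)

def acf_adjusted(x, k):
    # 1/(n-k) numerator against a 1/n denominator: can exceed 1 in magnitude
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    return (np.sum(x[:-k] * x[k:]) / (n - k)) / (np.sum(x * x) / n)

x = np.cumsum(np.random.default_rng(0).standard_normal(100))  # trending stand-in
print(acf_bounded(x, 1), acf_adjusted(x, 1), acf(x, nlags=1, fft=False)[1])
```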

The conclusion: the autocorrelation plot shows a clear, symmetric triangular decay, which is a typical ACF pattern for a non-stationary series with a monotone trend.

Skipped this part.

The autocorrelation coefficients of an AR model have two notable properties: (1) tailing off; (2) exponential decay.

The general solution for the lag-$k$ autocorrelation coefficient is

$$\rho_k = c_1\lambda_1^k + c_2\lambda_2^k + \cdots + c_p\lambda_p^k,$$

where $\lambda_1,\dots,\lambda_p$ are the characteristic roots of the difference equation and $c_1,\dots,c_p$ are constants, not all zero.

From this general form it is easy to see that $\rho_k$ always takes nonzero values and never becomes identically zero once $k$ exceeds some constant; this property is exactly the tailing off.

And since a stationary AR model has $|\lambda_i| < 1$ for every root, the decay is exponential; this is precisely the "short-term correlation" property used when judging stationarity from an ACF plot.

The partial autocorrelation coefficients of an AR(p) model cut off after lag $p$, which can be proved using the theory of systems of linear equations; in fact, this cut-off is one way to determine the order. The partial autocorrelations can also be obtained by solving the Yule-Walker equations:

$$\begin{pmatrix} 1 & \rho_1 & \cdots & \rho_{k-1} \\ \rho_1 & 1 & \cdots & \rho_{k-2} \\ \vdots & \vdots & & \vdots \\ \rho_{k-1} & \rho_{k-2} & \cdots & 1 \end{pmatrix} \begin{pmatrix} \phi_{k1} \\ \vdots \\ \phi_{kk} \end{pmatrix} = \begin{pmatrix} \rho_1 \\ \vdots \\ \rho_k \end{pmatrix},$$

where the last component $\phi_{kk}$ of the solution is the lag-$k$ partial autocorrelation.

Did I get something wrong again? The values don't match the library's output either.
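To see where the mismatch might come from, here is a sketch that solves the Yule-Walker system directly from the sample ACF and compares it with statsmodels' pacf(), which supports several estimation methods ("ols", "ywm", ...) that give slightly different numbers:

```python
import numpy as np
from scipy.linalg import toeplitz
from statsmodels.tsa.stattools import acf, pacf

def pacf_yw(x, kmax):
    rho = acf(x, nlags=kmax, fft=False)      # rho[0] = 1
    out = [1.0]
    for k in range(1, kmax + 1):
        R = toeplitz(rho[:k])                # k x k matrix of rho_0..rho_{k-1}
        phi = np.linalg.solve(R, rho[1:k + 1])
        out.append(phi[-1])                  # phi_kk is the lag-k PACF
    return np.array(out)

x = np.random.default_rng(0).standard_normal(200)  # stand-in series
print(pacf_yw(x, 5))
print(pacf(x, nlags=5, method="ols"))        # compare other methods too
```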

The autocorrelation coefficients of an MA(q) model cut off after lag $q$; that is, they are identically zero beyond lag $q$.

The partial autocorrelation coefficients of an MA(q) model tail off.

For an ARMA(p, q) model, the autocorrelation coefficients do not cut off, and neither do the partial autocorrelation coefficients; both tail off.
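The usual way to apply this table of properties is to eyeball ACF/PACF plots. A minimal sketch with statsmodels' plotting helpers, run on a synthetic AR(1) series so it is self-contained:

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Synthetic AR(1) with phi = 0.7, just to make the sketch runnable
rng = np.random.default_rng(1)
ts = np.zeros(300)
for t in range(1, 300):
    ts[t] = 0.7 * ts[t - 1] + rng.standard_normal()

fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(ts, lags=20, ax=axes[0])    # tails off (exponential decay) for AR
plot_pacf(ts, lags=20, ax=axes[1])   # cuts off after lag 1 for AR(1)
plt.tight_layout()
plt.show()
```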

The example data is an annual output series (1964-1999), indexed by year-end date:

| date | output | date | output |
|---|---|---|---|
| 1964-12-31 | 97.0 | 1982-12-31 | 335.4 |
| 1965-12-31 | 130.0 | 1983-12-31 | 327.0 |
| 1966-12-31 | 156.5 | 1984-12-31 | 321.9 |
| 1967-12-31 | 135.2 | 1985-12-31 | 353.5 |
| 1968-12-31 | 137.7 | 1986-12-31 | 397.8 |
| 1969-12-31 | 180.5 | 1987-12-31 | 436.8 |
| 1970-12-31 | 205.2 | 1988-12-31 | 465.7 |
| 1971-12-31 | 190.0 | 1989-12-31 | 476.7 |
| 1972-12-31 | 188.6 | 1990-12-31 | 462.6 |
| 1973-12-31 | 196.7 | 1991-12-31 | 460.8 |
| 1974-12-31 | 180.3 | 1992-12-31 | 501.8 |
| 1975-12-31 | 210.8 | 1993-12-31 | 501.5 |
| 1976-12-31 | 196.0 | 1994-12-31 | 489.5 |
| 1977-12-31 | 223.0 | 1995-12-31 | 542.3 |
| 1978-12-31 | 238.2 | 1996-12-31 | 512.2 |
| 1979-12-31 | 263.5 | 1997-12-31 | 559.8 |
| 1980-12-31 | 292.6 | 1998-12-31 | 542.0 |
| 1981-12-31 | 317.0 | 1999-12-31 | 567.0 |

Differencing
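A minimal sketch of the differencing step, assuming the table above is loaded as a pandas Series named `output` with a DatetimeIndex (only the first few values are constructed here, for brevity):

```python
import pandas as pd

# `output` as in the table above (truncated to the first five years)
output = pd.Series(
    [97.0, 130.0, 156.5, 135.2, 137.7],
    index=pd.to_datetime(["1964-12-31", "1965-12-31", "1966-12-31",
                          "1967-12-31", "1968-12-31"]),
    name="output",
)

diff1 = output.diff().dropna()   # first difference removes the trend
print(diff1)
```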

Let's just fit an ARIMA(0, 1, 4): one difference plus an MA(4) part, with no AR term.

Use summary() to inspect the fit:
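A sketch of the call that would produce a summary like the one below. The layout (css-mle, the D.output names) matches the older statsmodels ARIMA API, which has since been removed; on current statsmodels the import would be statsmodels.tsa.arima.model.ARIMA instead:

```python
from statsmodels.tsa.arima_model import ARIMA  # statsmodels < 0.13 API

model = ARIMA(output, order=(0, 1, 4))  # d = 1: fitted on the first difference
result = model.fit()
print(result.summary())
```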

ARIMA Model Results

|  |  |  |  |
|---|---|---|---|
| Dep. Variable: | D.output | No. Observations: | 35 |
| Model: | ARIMA(0, 1, 4) | Log Likelihood | -156.722 |
| Method: | css-mle | S.D. of innovations | 20.534 |
| Date: | Thu, 13 Jun 2019 | AIC | 325.444 |
| Time: | 18:06:52 | BIC | 334.776 |
| Sample: | 12-31-1965 - 12-31-1999 | HQIC | 328.666 |

|  | coef | std err | z | P>\|z\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| const | 13.9682 | 0.726 | 19.227 | 0.000 | 12.544 | 15.392 |
| ma.L1.D.output | -0.3682 | 0.200 | -1.840 | 0.076 | -0.761 | 0.024 |
| ma.L2.D.output | -0.1066 | 0.182 | -0.585 | 0.563 | -0.463 | 0.250 |
| ma.L3.D.output | -0.3034 | 0.196 | -1.545 | 0.133 | -0.688 | 0.081 |
| ma.L4.D.output | -0.2218 | 0.176 | -1.262 | 0.217 | -0.566 | 0.123 |

Roots

|  | Real | Imaginary | Modulus | Frequency |
|---|---|---|---|---|
| MA.1 | 1.0000 | -0.0000j | 1.0000 | -0.0000 |
| MA.2 | -0.1585 | -1.4742j | 1.4827 | -0.2670 |
| MA.3 | -0.1585 | +1.4742j | 1.4827 | 0.2670 |
| MA.4 | -2.0510 | -0.0000j | 2.0510 | -0.5000 |

Here ma.L1.D.output denotes the first MA coefficient, fitted on the differenced output. Since our AR order is 0 there are no AR rows; if there were, you would also see ar.L1.D.output and so on.

The z and P>|z| columns report the significance test (t-test) on each coefficient; here the MA coefficients apparently fail the test at the usual 0.05 level, and I'm not sure how to deal with that.

The p-value of the white-noise (Ljung-Box) test on the residuals is greater than 0.05, so the residuals can be treated as white noise and the model as a whole is significant.
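A sketch of that residual check with statsmodels' Ljung-Box test (the lags chosen here are illustrative):

```python
from statsmodels.stats.diagnostic import acorr_ljungbox

# White-noise test on the ARIMA residuals: p-values above 0.05 mean the
# residual autocorrelations are not significant, i.e. the residuals look
# like white noise and the model has captured the correlation structure.
lb = acorr_ljungbox(result.resid, lags=[6, 12], return_df=True)
print(lb)   # inspect the lb_pvalue column
```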

Time-Series Classification: Handling Variable-Length Input Sequences with an LSTM

Adapted from: https://datascience.stackexchange.com/questions/48796/how-to-feed-lstm-with-different-input-array-sizes

The easiest way is to use Padding and Masking.

There are three general ways to handle variable-length sequences:

(1)Padding and masking,

(2)Batch size = 1,

(3)Batch size > 1, with equi-length samples in each batch.

In this approach, we pad the shorter sequences with a special value to be masked (skipped) later. For example, suppose each timestamp has dimension 2 and -10 is the special value; every shorter sequence is then padded with [-10, -10] timestamps until all sequences match the longest one, as in the sketch below.
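The original post's example arrays were lost in extraction; these stand-in values show the idea:

```python
# Two sequences, each timestamp of dimension 2; -10 is the special value
seq1 = [[1.0, 1.10],
        [0.9, 0.95]]             # 2 timestamps
seq2 = [[2.0, 2.20],
        [1.9, 1.95],
        [1.8, 1.85]]             # 3 timestamps

# After pre-padding seq1 to the common length of 3:
seq1_padded = [[-10.0, -10.0],
               [1.0, 1.10],
               [0.9, 0.95]]
```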

This way, all sequences have the same length. Then we use a Masking layer that skips those special timestamps as if they didn't exist. A complete example is given at the end.

For cases (2) and (3), you need to set the seq_len (timesteps) dimension of the LSTM's input_shape to None, e.g.
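Something like the following (a Keras sketch; the 32 units and the binary output head are illustrative choices):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

n_features = 2
model = Sequential([
    # None in the timesteps slot lets each batch have its own length
    LSTM(32, input_shape=(None, n_features)),
    Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam")
```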

This way the LSTM accepts batches of different lengths, although the samples inside each batch must all have the same length. You then need to feed a custom batch generator to model.fit_generator (instead of model.fit).

I have provided a complete example for simple case (2) (batch size = 1) at the end. Based on this example and the link, you should be able to build a generator for case (3) (batch size > 1). Specifically, we either (a) return batch_size sequences with the same length, or (b) select sequences with almost the same length, and pad the shorter ones the same as case (1), and use a Masking layer before LSTM layer to ignore the padded timestamps, e.g.
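For instance (same illustrative sizes as above):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Masking, LSTM, Dense

special_value = -10.0
model = Sequential([
    # Timestamps equal to special_value across all features are skipped
    Masking(mask_value=special_value, input_shape=(None, 2)),
    LSTM(32),
    Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam")
```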

where the first dimension of input_shape in Masking is again None, to allow batches with different lengths.

Here is the code for cases (1) and (2):
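The original code block did not survive extraction; this is a hedged reconstruction of what it plausibly contained, written against current Keras names (model.fit accepts a generator directly, replacing the fit_generator call mentioned above):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Masking, LSTM, Dense
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Toy variable-length data: eight (timesteps_i, 2) arrays with binary labels
rng = np.random.default_rng(0)
X = [rng.standard_normal((int(rng.integers(2, 6)), 2)) for _ in range(8)]
y = np.array([0, 1] * 4, dtype="float32")

# ---- Case (1): padding + masking --------------------------------------
special_value = -10.0
X_pad = pad_sequences(X, padding="pre", dtype="float32", value=special_value)

model1 = Sequential([
    Masking(mask_value=special_value, input_shape=(None, 2)),
    LSTM(16),
    Dense(1, activation="sigmoid"),
])
model1.compile(loss="binary_crossentropy", optimizer="adam")
model1.fit(X_pad, y, epochs=2, verbose=0)

# ---- Case (2): batch size = 1, no padding ------------------------------
model2 = Sequential([
    LSTM(16, input_shape=(None, 2)),   # None: any sequence length
    Dense(1, activation="sigmoid"),
])
model2.compile(loss="binary_crossentropy", optimizer="adam")

def one_by_one(X, y):
    while True:                        # Keras expects generators to loop
        for xi, yi in zip(X, y):
            yield xi[np.newaxis].astype("float32"), np.array([yi])

model2.fit(one_by_one(X, y), steps_per_epoch=len(X), epochs=2, verbose=0)
```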

Note that if we pad without masking, the padded values will be regarded as actual values and thus become noise in the data. For example, a padded temperature sequence [20, 21, 22, -10, -10] is the same as a sensor report with two noisy (wrong) measurements at the end. The model may learn to ignore this noise completely or at least partially, but it is more reasonable to clean the data first, i.e. to use a mask.
