---
title: "Lab10: Time Series"
author: "Vladislav Litvinov <vlad@sek1ro>"
output:
  pdf_document:
    toc: TRUE
---
Plotting the data set
```{r}
setwd('/home/sek1ro/git/public/lab/ds/25-1/r')

# quarterly EPS series starting in 1960 Q1, 4 observations per year
jj = scan("jj.dat")
jj_ts = ts(jj, start = c(1960, 1), frequency = 4)

jj_ts

plot(jj_ts, ylab = "EPS", xlab = "Year")
```
In order to fit an ARIMA model, the time series must first be transformed to remove any trend. Plot the difference of $x_t$ and $x_{t-1}$ for all $t > 0$. Has this difference adequately detrended the series? Does the variability of the EPS appear constant over time? Why does constant variance matter?
```{r}
jj_diff = diff(jj_ts)

plot(jj_diff, xlab = "Year", ylab = "EPS diff")
```
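As an aside, the constant-variance question can also be checked numerically by comparing the spread of a series across sub-periods. A minimal sketch on simulated data (the same split can be applied to `jj_diff`):

```{r}
set.seed(7)
# a series whose variance grows over time, loosely like the raw EPS differences
z <- c(rnorm(50, sd = 1), rnorm(50, sd = 5))

half <- length(z) %/% 2
sd(z[1:half])                # spread in the first half
sd(z[(half + 1):length(z)])  # spread in the second half: much larger
```

A large ratio between the two standard deviations argues for a variance-stabilizing transform such as the log before differencing.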
Plot the log10 of the quarterly EPS vs. time and plot the difference of $\log_{10}(x_t)$ and $\log_{10}(x_{t-1})$ for all $t > 0$. Has this adequately detrended the series? Has the variability of the differenced log10(EPS) become more constant?
```{r}
log_jj = log10(jj_ts)
log_jj_diff = diff(log_jj)

plot(log_jj, xlab = "Year", ylab = "log10(EPS)")
plot(log_jj_diff, xlab = "Year", ylab = "log10(EPS) diff")
```
Treating the differenced log10 of the EPS series as a stationary series, plot the ACF and PACF of this series. What possible ARIMA models would you consider and why?
ACF(k) = Corr(x[t], x[t-k]): the Autocorrelation Function, which shows how strongly the time series correlates with its own lagged values.

PACF, the Partial Autocorrelation Function, shows the direct relationship between $x_t$ and $x_{t-k}$ after removing the influence of all intermediate values between $t$ and $t-k$; it is the last coefficient in the AR(k) regression

$$x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \dots + \phi_k x_{t-k} + \varepsilon_t,$$

$$\mathrm{PACF}(k) = \phi_k$$
ARMA(p, q)

p: the AR part, dependence on previous values of the series; its order is suggested by where the PACF cuts off.

q: the MA part, dependence on previous forecast errors; its order is suggested by where the ACF cuts off.

ARIMA(p, d, q)

d: the I part, the number of differences applied to make the series stationary.
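A quick numerical illustration of $\mathrm{PACF}(k) = \phi_k$, using a simulated AR(2) (the coefficients are chosen arbitrarily for the sketch):

```{r}
set.seed(1)
# AR(2) with phi1 = 0.5, phi2 = -0.3
x <- arima.sim(n = 5000, list(ar = c(0.5, -0.3)))

# the sample PACF at lag 2 should be close to phi2 = -0.3,
# and roughly zero for all lags beyond 2
pacf(x, lag.max = 6, plot = FALSE)
```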
```{r}
acf(log_jj_diff, lag.max = 20)
pacf(log_jj_diff, lag.max = 20)

# ar() selects an AR order automatically by AIC, as a cross-check
ar(log_jj_diff)
```
Run the proposed ARIMA models from part d and compare the results. Identify an appropriate model. Justify your choice.
The idea: AIC balances

the quality of the fit (the better the model describes the data, the lower the error), and

the complexity of the model (the more parameters, the higher the risk of overfitting):

$$\mathrm{AIC} = 2k - 2\ln(L)$$

A larger likelihood $L$ and a smaller number of parameters $k$ both lower the AIC, so the candidate with the smallest AIC is preferred.
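The formula can be verified directly against what R reports. A sketch on simulated data; note that `stats::arima` counts the innovation variance $\sigma^2$ as one extra parameter:

```{r}
set.seed(42)
y <- arima.sim(n = 200, list(ar = 0.6))
fit <- arima(y, order = c(1, 0, 0))

# k = estimated parameters: the AR coefficient, the mean, and sigma^2
k <- length(coef(fit)) + 1
2 * k - 2 * fit$loglik  # matches AIC(fit)
```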
Why is the choice of natural log or log base 10 in Problem 4.8 somewhat irrelevant to the transformation and the analysis?

Because $\log_{10}(x) = \ln(x) / \ln(10)$, the two transforms differ only by a constant scale factor: the shape of the series, its correlations, and the identified ARIMA orders are all unchanged.
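Concretely, the two differenced series are identical up to the constant factor $1/\ln(10)$; a minimal check on a few illustrative values:

```{r}
x <- c(0.71, 0.63, 0.85, 0.44, 0.61, 0.69)         # illustrative positive values
all.equal(diff(log10(x)), diff(log(x)) / log(10))  # TRUE
```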
Why is the value of the ACF for lag 0 equal to one?

By definition, $\mathrm{ACF}(0) = \mathrm{Corr}(x_t, x_t) = 1$: any series is perfectly correlated with itself at lag zero.
```{r}
library(forecast)

fit_model = function(order) {
  Arima(log_jj_diff, order = order)
}

# NB: log_jj_diff has already been differenced once, so a nonzero d here
# applies an additional difference on top of that.
models <- list(
  "1, 0, 1" = fit_model(c(1, 0, 1)),
  "1, 1, 1" = fit_model(c(1, 1, 1)),
  "1, 0, 5" = fit_model(c(1, 0, 5)),
  "1, 1, 5" = fit_model(c(1, 1, 5))
)

# double brackets extract the model itself rather than a one-element list
print(models[["1, 0, 5"]])

aic_values <- sapply(models, AIC)
print(aic_values)
```
Chosen model: ARIMA(1, 0, 5).
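One way to back up the choice is to check that the chosen model's residuals look like white noise. A sketch using only base `stats`; the `lag` and `fitdf` values are conventional choices, not prescribed by the lab:

```{r}
best <- arima(log_jj_diff, order = c(1, 0, 5))
tsdiag(best)  # standardized residuals, residual ACF, Ljung-Box p-values

# Ljung-Box test on the residuals; fitdf = p + q = 6 estimated coefficients
Box.test(residuals(best), lag = 10, type = "Ljung-Box", fitdf = 6)
```

A large p-value means there is no evidence of remaining autocorrelation, supporting the chosen specification.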
```{r}
n = 10000

# AR(1) with phi = -0.18: ACF decays geometrically, PACF cuts off after lag 1
phi4 = c(-0.18)
AR <- arima.sim(n = n, list(ar = phi4[1]))

plot(AR, main = "AR series")
acf(AR, main = "ACF AR")
pacf(AR, main = "PACF AR")

# MA(5): ACF cuts off after lag 5, PACF tails off
theta4 <- c(-0.65, -0.22, -0.28, 1, -0.4)
MA <- arima.sim(n = n, list(ma = theta4))

plot(MA, main = "MA series")
acf(MA, main = "ACF MA")
pacf(MA, main = "PACF MA")
```
Automatic model selection with auto.arima, followed by a 20-quarter forecast:
```{r}
fit <- auto.arima(jj_ts)
summary(fit)

forecasted_values <- forecast(fit, h = 20)
plot(forecasted_values)
```