# Time Series Analysis

by Chee Yee Lim

Posted on 2021-05-17

Collection of notes on time series analysis (statistical point of view).

## Time Series Analysis

### Overview

• A time series is a set of values observed sequentially through time.
• The next value in a time series is dependent on the previous values.
• The series may be denoted by $$X_1, X_2, ..., X_t$$ where $$t$$ refers to the time period and $$X$$ refers to the value.
• If future values of $$X$$ are exactly determined by a mathematical formula, the series is said to be deterministic.
• If future values of $$X$$ are described only by their probability distribution, the series is said to be a statistical or stochastic process.
• A special class of stochastic processes is a stationary stochastic process, which is required for Box-Jenkins ARIMA models.
• A time series can be made up of 3 key components.
• Trend - a long term increase or decrease.
• Seasonality - an effect of seasonal factors for a fixed or known period.
• Cycle - a periodic cycle that is not of fixed or known period.
• When analysing time series, it is important to check autocorrelation among the data points.
• This can be done by performing the autocorrelation function (ACF) plot analysis and partial autocorrelation function (PACF) plot analysis.
• ACF shows a bar chart between correlation coefficients and lags. PACF shows a bar chart between partial correlation coefficients and lags.
• These plots are the primary method for choosing the lag orders of ARIMA-type models.
• When generating training and test sets, it is important to split the data into contiguous chunks in time order (e.g. train on earlier observations, test on later ones), not completely randomly as with non-time series data.
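As a sketch of both ideas on toy data (the AR(1) example and all names here are illustrative, not from the notes), the sample ACF can be computed directly in NumPy, and the train/test split done chronologically:

```python
import numpy as np

def acf(x, nlags):
    """Sample autocorrelation function for lags 0..nlags."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.sum(x * x)
    return np.array([np.sum(x[: len(x) - k] * x[k:]) / denom
                     for k in range(nlags + 1)])

def train_test_split_ts(x, test_frac=0.2):
    """Chronological split: earlier points train, later points test."""
    cut = int(len(x) * (1 - test_frac))
    return x[:cut], x[cut:]

rng = np.random.default_rng(0)
x = np.zeros(200)
for t in range(1, 200):                 # AR(1) with phi = 0.8
    x[t] = 0.8 * x[t - 1] + rng.normal()

r = acf(x, nlags=5)                     # r[0] is always 1; r[1] is large here
train, test = train_test_split_ts(x)    # last 20% held out as the test chunk
```

For real work, `statsmodels` provides `plot_acf`/`plot_pacf` for the plot versions described above.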

### Independence and Stationarity

• Independence and stationarity are related but distinct concepts: an independent, identically distributed series is always stationary, but a stationary series need not be independent.
• The defining feature of time series data is that the data points are not independent, i.e. the value of the current data point depends on the values of previous data points.
• In a non-stationary series, the mean and/or variance change over time.
• A statistical process is stationary if the probability distribution is the same for all starting values of $$t$$.
• This implies that the mean and variance are constant for all values of $$t$$.
• A series that exhibits a simple trend is not stationary because the values of the series depend on $$t$$.
• A stationary stochastic process is completely defined by its mean, variance and autocorrelation function.
• Stationarity is a required assumption for ARIMA models, but not necessarily for other models.
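A crude illustration of screening for non-stationarity, assuming a hypothetical helper that compares the mean and variance of the two halves of the series (a formal test such as the augmented Dickey-Fuller test in `statsmodels` would normally be used instead):

```python
import numpy as np

def looks_stationary(x, tol=0.5):
    """Crude screen: a constant-mean, constant-variance series should
    have similar mean and variance in its first and second halves."""
    x = np.asarray(x, dtype=float)
    a, b = x[: len(x) // 2], x[len(x) // 2:]
    scale = x.std() + 1e-12
    mean_shift = abs(a.mean() - b.mean()) / scale       # relative mean drift
    var_ratio = max(a.var(), b.var()) / (min(a.var(), b.var()) + 1e-12)
    return bool(mean_shift < tol and var_ratio < 1 + tol)

rng = np.random.default_rng(1)
noise = rng.normal(size=500)             # stationary white noise
trended = noise + 0.01 * np.arange(500)  # simple trend -> not stationary
```

This is only a heuristic, but it makes the point above concrete: a series with a trend fails because its mean depends on $$t$$.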

### List of Forecast Algorithms

• Average/mean method
• All future values are equal to the average of the historical data.
• $$\hat{y}_{T+h} = \bar{y} = \frac{( y_1 + ... + y_T )}{T}$$
• Prediction interval for $$h$$-step:
• $$\hat{\sigma}_h = \hat{\sigma} \sqrt{1 + \frac{1}{T}}$$, where $$\hat{\sigma}$$ is the residual standard deviation.
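A minimal sketch of the mean method on toy data, assuming the residual standard deviation is computed with a $$T - 1$$ divisor (one common convention):

```python
import numpy as np

y = np.array([3.0, 5.0, 4.0, 6.0, 7.0])  # toy historical series
T = len(y)

# Mean method: every h-step-ahead forecast is the historical average.
y_hat = y.mean()

# Residual standard deviation (T - 1 divisor, one convention) and the
# interval width factor, which is the same for every horizon h.
resid = y - y_hat
sigma = np.sqrt(np.sum(resid ** 2) / (T - 1))
sigma_h = sigma * np.sqrt(1 + 1 / T)
```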
• Naive method
• All forecasts are set to be the value of the last observation.
• $$\hat{y}_{T+h} = y_T$$
• This method works remarkably well for economic and financial time series.
• Because a naive forecast is optimal when data follow a random walk, it is also called a random walk forecast.
• Prediction interval for $$h$$-step:
• $$\hat{\sigma}_h = \hat{\sigma} \sqrt{h}$$, where $$\hat{\sigma}$$ is the residual standard deviation.
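A minimal naive-method sketch on the same kind of toy series; taking the root-mean-square of the first differences as $$\hat{\sigma}$$ is an assumption here (one convention for one-step residuals under a random walk):

```python
import numpy as np

y = np.array([3.0, 5.0, 4.0, 6.0, 7.0])

def naive(h):
    """Naive forecast: every horizon gets the last observation."""
    return y[-1]

# One-step residuals are just the first differences; their RMS gives
# sigma-hat, and the interval width grows as sqrt(h).
resid = np.diff(y)
sigma = np.sqrt(np.mean(resid ** 2))
sigma_h = sigma * np.sqrt(np.arange(1, 6))
```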
• Seasonal naive method (snaive)
• Each forecast is set to be the value of the last observed value from the same season.
• $$\hat{y}_{T+h} = y_{T+h-m(k+1)}$$, where $$m$$ is the seasonal period and $$k$$ is the integer part of $$\frac{(h-1)}{m}$$ (i.e. the number of complete years in the forecast period prior to time $$T + h$$).
• This method is useful for highly seasonal data.
• Prediction interval for $$h$$-step:
• $$\hat{\sigma}_h = \hat{\sigma} \sqrt{k + 1}$$, where $$\hat{\sigma}$$ is the residual standard deviation and $$k$$ is the integer part of $$\frac{(h-1)}{m}$$
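The index arithmetic $$y_{T+h-m(k+1)}$$ is easy to get wrong, so a small sketch with quarterly toy data ($$m = 4$$, all values illustrative):

```python
import numpy as np

# Quarterly toy data: two full years, seasonal period m = 4.
y = np.array([10.0, 20.0, 30.0, 40.0, 12.0, 22.0, 32.0, 42.0])
T, m = len(y), 4

def snaive(h):
    """Seasonal naive: y_hat_{T+h} = y_{T+h-m(k+1)},
    k = integer part of (h-1)/m."""
    k = (h - 1) // m
    return y[T + h - m * (k + 1) - 1]  # -1 converts to 0-based indexing
```

Each forecast reaches back to the most recent observation from the same season: the next Q1 is forecast with the last observed Q1, and horizons beyond one year keep reusing the same seasonal values.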
• Drift method
• Each forecast is obtained by extrapolating from a line fitted on the first and last observations.
• $$\hat{y}_{T+h}$$
• $$= y_T + \frac{h}{T-1} \sum_{t=2}^{T} (y_t - y_{t-1})$$
• $$= y_T + h ( \frac{y_T - y_1}{T - 1} )$$
• Prediction interval for $$h$$-step:
• $$\hat{\sigma}_h = \hat{\sigma} \sqrt{ h(1 + \frac{h}{T}) }$$, where $$\hat{\sigma}$$ is the residual standard deviation.
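A sketch of the drift method on toy data, also checking that the two formulations above (mean of the differences vs. the line through the first and last observations) agree:

```python
import numpy as np

y = np.array([3.0, 5.0, 4.0, 6.0, 7.0])
T = len(y)

def drift(h):
    """Drift: extrapolate the line through the first and last
    observations h steps ahead."""
    return y[-1] + h * (y[-1] - y[0]) / (T - 1)

# Equivalent form: last value plus h times the average step size.
drift_via_diffs = y[-1] + 2 * np.mean(np.diff(y))
```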
• Regular linear model (no time lag between predictors and response)
• A regular linear model can be used with time series data, in which $$x_t$$ observed at time point $$t$$ is used to predict $$y_t$$ observed at time point $$t$$.
• $$y_t = \beta_0 + \beta_1 x_{1,t} + \beta_2 x_{2,t} + ... + \beta_k x_{k,t} + \epsilon_t$$
• i.e. predictors at time $$t$$ are used to predict response at time $$t$$.
• Using the no time lag approach, forecast values for $$x_{n,t}$$ are needed to predict $$y_t$$.
• This approach is very useful for scenario-based forecasting, where multiple sets of forecast values for $$x_{n,t}$$ are derived from assumptions.
• E.g. under best case scenario, we can assume +1% income growth rate and +0.5% savings growth rate for predicting change in consumption. Under worst case scenario, we can assume -1% income decline rate and -0.5% savings decline rate for predicting change in consumption.
• However, the no time lag approach is not suitable for forecasting into the future without assumed values for $$x_{n,t}$$.
• Lagged predictors are required in that case. For example, to predict $$y_{t+h}$$ from $$x_{n,t}$$: $$y_{t+h} = \beta_0 + \beta_1 x_{1,t} + \beta_2 x_{2,t} + ... + \beta_k x_{k,t} + \epsilon_{t+h}$$, where $$h = 1,2,...$$
• Assumptions on this approach:
• Linear relationship between target and predictor variables.
• Errors have zero mean; otherwise forecasts are systematically biased.
• Errors are not autocorrelated; otherwise forecasts are systematically biased.
• Errors are unrelated to predictor variables; otherwise there would be more information that should be included in the systematic part of the model.
• Useful to have the errors normally distributed with a constant variance, in order to easily produce prediction intervals.
• The model can be fitted with least squares estimation, by minimising the following equation:
• $$\sum_{t=1}^T \epsilon_{t}^2 = \sum_{t=1}^T ( y_t - \beta_0 - \beta_1 x_{1,t} - \beta_2 x_{2,t} - ... - \beta_k x_{k,t} )^2$$
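This least squares objective is exactly what `numpy.linalg.lstsq` minimises; a sketch on synthetic data with known coefficients (all values here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
T = 100
x1 = rng.normal(size=T)
x2 = rng.normal(size=T)
# True model: y = 1 + 2*x1 - 0.5*x2 + small noise.
y = 1.0 + 2.0 * x1 - 0.5 * x2 + 0.1 * rng.normal(size=T)

# Design matrix with an intercept column; lstsq minimises the sum of
# squared residuals, i.e. the objective in the equation above.
X = np.column_stack([np.ones(T), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The recovered `beta` should be close to the true coefficients $$(\beta_0, \beta_1, \beta_2) = (1, 2, -0.5)$$.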
• The goodness-of-fit for the model can be evaluated via:
1. Coefficient of determination, $$R^2$$
2. Residual standard error (also useful for prediction interval calculation)
3. ACF plot of residuals
• It is common to find autocorrelation in the residuals from a model fitted to time series data.
• This will violate the assumption of no autocorrelation in the errors of our model, and our forecasts may be inefficient - there is some information left over which should be accounted for in the model to obtain better forecasts.
• The forecasts from a model with autocorrelated errors are still unbiased, and so are not "wrong", but they will usually have larger prediction intervals than they need to.
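A sketch of this residual diagnostic: the lag-1 sample autocorrelation of the residuals, comparing an i.i.d. series with an autocorrelated one (all data synthetic; a full ACF plot with significance bands at roughly $$\pm 1.96/\sqrt{T}$$ is the usual tool):

```python
import numpy as np

def acf1(resid):
    """Lag-1 sample autocorrelation of residuals; values well outside
    +/- 1.96/sqrt(T) suggest leftover structure the model missed."""
    r = np.asarray(resid, dtype=float)
    r = r - r.mean()
    return np.sum(r[:-1] * r[1:]) / np.sum(r * r)

rng = np.random.default_rng(3)
white = rng.normal(size=300)     # well-behaved residuals
ar = np.zeros(300)               # autocorrelated residuals, AR(1) phi = 0.7
for t in range(1, 300):
    ar[t] = 0.7 * ar[t - 1] + rng.normal()
```

The white-noise residuals give a lag-1 autocorrelation near zero, while the AR(1) residuals give a clearly positive one, signalling information the model should have captured.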