Time Series and Forecasting
R has extensive facilities for analyzing time series data. This section describes the creation of a time series, seasonal decomposition, modeling with exponential and ARIMA models, and forecasting with the forecast package.
Creating a time series
The ts() function will convert a numeric vector into an R time series object. The format is ts(vector, start=, end=, frequency=) where start and end are the times of the first and last observation and frequency is the number of observations per unit time (1=annual, 4=quartly, 12=monthly, etc.).
# save a numeric vector containing 72 monthly observations
# from Jan 2009 to Dec 2014 as a time series object
myts <- ts(myvector, start=c(2009, 1), end=c(2014, 12), frequency=12)
# subset the time series (June 2014 to December 2014)
myts2 <- window(myts, start=c(2014, 6), end=c(2014, 12))
# plot series
A time series with additive trend, seasonal, and irregular components can be decomposed using the stl() function. Note that a series with multiplicative effects can often by transformed into series with additive effects through a log transformation (i.e., newts <- log(myts)).
# Seasonal decomposition
fit <- stl(myts, s.window="period")
# additional plots
Both the HoltWinters() function in the base installation, and the ets() function in the forecast package, can be used to fit exponential models.
# simple exponential - models level
fit <- HoltWinters(myts, beta=FALSE, gamma=FALSE)
# double exponential - models level and trend
fit <- HoltWinters(myts, gamma=FALSE)
# triple exponential - models level, trend, and seasonal components
fit <- HoltWinters(myts)
# predictive accuracy
# predict next three future values
The arima() function can be used to fit an autoregressive integrated moving averages model. Other useful functions include:
|lag(ts, k)||lagged version of time series, shifted back k observations|
|diff(ts, differences=d)||difference the time series d times|
|ndiffs(ts)||Number of differences required to achieve stationarity (from the forecast package)|
|pacf(ts)||partial autocorrelation function|
|adf.test(ts)||Augemented Dickey-Fuller test. Rejecting the null hypothesis suggests that a time series is stationary (from the tseries package)|
|Box.test(x, type="Ljung-Box")||Pormanteau test that observations in vector or time series x are independent|
Note that the forecast package has somewhat nicer versions of acf() and pacf() called Acf() and Pacf() respectively.
# fit an ARIMA model of order P, D, Q
fit <- arima(myts, order=c(p, d, q)
# predictive accuracy
# predict next 5 observations
The forecast package provides functions for the automatic selection of exponential and ARIMA models. The ets() function supports both additive and multiplicative models. The auto.arima() function can handle both seasonal and nonseasonal ARIMA models. Models are chosen to maximize one of several fit criteria.
# Automated forecasting using an exponential model
fit <- ets(myts)
# Automated forecasting using an ARIMA model
fit <- auto.arima(myts)
There are many good online resources for learning time series analysis with R. These include A little book of R for time series by Avril Chohlan and DataCamp's manipulating time series in R course by Jeffrey Ryan.