‘Time’ is one of the most important factors in a business’s success, and it is difficult to keep up with its pace. But technology has developed powerful methods that let us ‘see things’ ahead of time. One such method, which deals with time-based data, is **Time Series Modeling**. As the name suggests, it involves working with data indexed by time (years, days, hours, minutes) to uncover hidden insights that support informed decision making.

Time series models are very useful when we have serially correlated data.

The most commonly used framework for time series modelling is the **Box–Jenkins method**, named after the statisticians George Box and Gwilym Jenkins. It applies autoregressive moving average (ARMA) or autoregressive integrated moving average (ARIMA) models to find the best fit of a time-series model to the past values of a series. The framework specifies a step-by-step approach to ‘**How to do a Time Series Analysis**’:

**Step 1: Visualize and Examine the Time Series**

It is essential to analyze the trends prior to building any kind of time series model. The details we are interested in pertain to any kind of trend, seasonality, or random behavior in the series.

Now is the time to clean up any outliers and handle missing values. We could also take the logarithm of the series to help stabilize a strong trend.

**Step 2: Stationarize the Series**

Once we know the patterns, trends, cycles, and seasonality, we can check whether the series is stationary. The Dickey–Fuller test is one of the popular tests for this.
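To illustrate the idea behind the Dickey–Fuller test, here is a minimal sketch (assuming Python with NumPy) of the regression it is built on: difference the series and regress on the lagged level; a strongly negative t-statistic on the lag coefficient argues against a unit root. In practice you would use a packaged implementation (e.g. `adf.test` in R’s `tseries` package or `adfuller` in Python’s `statsmodels`), which also supplies the proper critical values.

```python
import numpy as np

def dickey_fuller_tstat(y):
    """t-statistic for beta in: diff(y)_t = alpha + beta * y_{t-1} + e_t.
    Strongly negative values suggest stationarity (no unit root)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                 # dependent variable: the differenced series
    ylag = y[:-1]                   # regressor: the lagged level
    X = np.column_stack([np.ones_like(ylag), ylag])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ coef
    n, k = X.shape
    sigma2 = resid @ resid / (n - k)
    se_beta = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return coef[1] / se_beta

# White noise (stationary) vs. a random walk (non-stationary)
rng = np.random.default_rng(42)
noise = rng.standard_normal(500)
walk = np.cumsum(rng.standard_normal(500))
print(dickey_fuller_tstat(noise))   # strongly negative
print(dickey_fuller_tstat(walk))    # much closer to zero
```

Note this sketch omits the augmented lags and the non-standard critical values of the real test; it only shows where the statistic comes from.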

Stationarity is important because we can use our prediction model only on a stationary time series. A stationary time series has no trend. The second-order stationarity conditions are:

- Constant Mean
- Constant Variance
- An autocovariance that does not depend on time

One way to check the stationarity conditions is to create a boxplot for each month or so (let’s assume that our time series is reported daily) and, following the conditions above, ask:

- Does the mean change over time?
- Does the variance change over time?

In our example these boxplots look pretty much the same from month to month; we would expect and accept some variation even for truly stationary signals, so this is not too bad. The variation between the boxplots is a little large but not huge. We binned by month, and 30 days (or measurements) is enough to get a reasonably good handle on the mean and variance of the signal within each month, so that is acceptable.
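The same check can be sketched numerically instead of visually: bin a daily series into 30-day “months” and compare the per-month means and standard deviations. A toy example with synthetic data (assuming Python with NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)
daily = rng.normal(loc=10.0, scale=2.0, size=360)   # synthetic stationary daily series

# Bin into twelve 30-day "months" and compare summary statistics
months = daily.reshape(12, 30)
means = months.mean(axis=1)
stds = months.std(axis=1)

print("spread of monthly means:", means.max() - means.min())
print("spread of monthly stds: ", stds.max() - stds.min())
# For a stationary series both spreads stay small relative to the scale (here 2.0)
```

If either spread were large, or drifted systematically from month to month, that would point at a non-stationary series.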

Another way to check for stationarity is to look for a trend: if we can find a nonzero trend in the mean, then we probably do not have a stationary signal.

But what if the series is found to be non-stationary?

There are some commonly used techniques to make a time series stationary, like:

- Detrending
- Differencing
- Removing seasonality (seasonal adjustment or seasonal differencing)

Differencing is a powerful tool: it allows us to apply techniques for stationary time series to non-stationary series, and repeated differencing turns higher-order trends into stationary series.
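A quick sketch of differencing (assuming Python with NumPy): a quadratic trend survives one round of differencing but becomes constant after two.

```python
import numpy as np

t = np.arange(10, dtype=float)
series = t**2              # deterministic quadratic trend, clearly non-stationary

d1 = np.diff(series)       # first difference: removes one order of trend
d2 = np.diff(series, n=2)  # second difference: quadratic trend becomes constant

print(d1)  # 1, 3, 5, ... still trending linearly
print(d2)  # all 2s: the trend is gone after two differences
```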

In the end, the time series can be decomposed into:

- Data
- Seasonal
- Trend
- Remainder
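A toy version of such an additive decomposition can be sketched as follows (assuming Python with NumPy; real implementations such as R’s `decompose` or `stl` handle edges and even-length periods more carefully):

```python
import numpy as np

def decompose_additive(x, period):
    """Toy additive decomposition: data = trend + seasonal + remainder."""
    x = np.asarray(x, dtype=float)
    # Trend: centered moving average over one full period (crude at the edges)
    kernel = np.ones(period) / period
    trend = np.convolve(x, kernel, mode="same")
    # Seasonal: average of the detrended values at each position in the cycle
    detrended = x - trend
    seasonal_pattern = detrended.reshape(-1, period).mean(axis=0)
    seasonal_pattern -= seasonal_pattern.mean()   # center so it sums to ~0
    seasonal = np.tile(seasonal_pattern, len(x) // period)
    remainder = x - trend - seasonal
    return trend, seasonal, remainder

# Synthetic monthly series: linear trend + yearly seasonality + noise
rng = np.random.default_rng(1)
n, period = 120, 12
data = (0.5 * np.arange(n)
        + 5 * np.sin(2 * np.pi * np.arange(n) / period)
        + rng.normal(0, 0.5, n))
trend, seasonal, remainder = decompose_additive(data, period)
```

By construction the three components add back up to the original data, which is exactly the property the decomposition promises.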

**Step 3: Find Optimal Parameters for the ARIMA model**

The ARIMA model will be used for our forecasting. It combines two powerful models, the Auto-Regressive and the Moving Average model, and it requires three parameters: **p, d, q**.

A good way to think about it is (AR, I, MA), where AR stands for the Auto-Regressive model, MA for the Moving Average model, and I for Integration.

An **auto-regressive (AR(p))** component refers to the use of past values of the series *Y* in the regression equation. The auto-regressive parameter *p* specifies the number of lags used in the model.

The *d* represents the degree of differencing in the **integrated (*I(d)*)** component. Differencing a series simply means subtracting each previous value from the current one, repeated *d* times. Differencing is often used to stabilize the series when the stationarity assumption is not met.

A **moving average (MA(q))** component represents the error of the model as a combination of previous error terms *e_t*. The order *q* determines the number of error terms to include in the model.

ARIMA(1, 0, 2) means that we are describing some response variable (Y) by combining a 1st order Auto-Regressive model and a 2nd order Moving Average model.

Y = (Auto-Regressive Parameters) + (Moving Average Parameters)

The 0 between the 1 and the 2 represents the ‘I’ part of the model (the integrative part), and it signifies how many times we take the difference of the response variable.
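Written out explicitly (a sketch using standard notation, with *φ₁* the auto-regressive coefficient, *θ₁* and *θ₂* the moving-average coefficients, *c* a constant, and *e_t* the error terms as above), an ARIMA(1, 0, 2) model has the form:

*Y_t = c + φ₁ Y_{t−1} + e_t + θ₁ e_{t−1} + θ₂ e_{t−2}*

The first term after the constant is the AR(1) part, and the two lagged error terms are the MA(2) part.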

We can use the ACF and PACF plots to find the parameters. If both the ACF and PACF decrease gradually, it indicates that we need to make the time series stationary and introduce a value for “d”.

Let’s see an example:

We see that the ACF has an exponential decay, so we think it is an AR model. But we do not know what order it is.

But if we look at the partial autocorrelation function:

then it tells us a lot. In fact, it tells us exactly what we need to know: this is an AR(1) model. There is one term in the model because there is one nonzero term in the partial autocorrelation function.

AR models are not always stationary; it depends on the parameters. The autocorrelation function of a stationary AR(p) model decays exponentially, while its partial autocorrelation function cuts off sharply after lag p.
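These shapes can be reproduced on simulated data. The sketch below (assuming Python with NumPy) simulates an AR(1) process with φ = 0.8 and computes the sample ACF, whose decay should roughly follow φ^k:

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation function up to max_lag (lag 0 is always 1)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = x @ x
    return np.array([1.0] + [(x[:-k] @ x[k:]) / denom for k in range(1, max_lag + 1)])

# Simulate an AR(1): y_t = 0.8 * y_{t-1} + e_t
rng = np.random.default_rng(7)
phi, n = 0.8, 2000
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.standard_normal()

acf = sample_acf(y, 5)
print(np.round(acf, 2))   # roughly 1.0, 0.8, 0.64, 0.51, ... (phi^k decay)
```

The PACF of the same series would instead show a single large spike at lag 1 and near-zero values afterwards, which is the cutoff that identifies the order.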

If our data arise from an MA model of order q, the autocorrelation function will drop sharply after lag q.

So in practice, if we have a time series whose autocorrelation function drops off sharply, we should be able to model it as an MA model of the corresponding order.

Furthermore, we can compute Akaike’s Information Criterion (AIC) and the Schwarz Bayesian Information Criterion (BIC) for a set of candidate models and investigate the models with the lowest AIC and BIC values.

*All of the ways to find p, d, q listed here can be found in the R package TSA, if you are familiar with R.*

**Step 4: Build ARIMA Model**

ARIMA stands for Auto-Regressive Integrated Moving Average. It is defined through differencing: if taking the difference of a series (perhaps the difference of the difference, a few times over) eventually yields an ARMA model, then the original series is an ARIMA model.

With the parameters in hand, from the previous step, we can now try to build the ARIMA model.

In reality, the ACF and PACF plots provide insight into plausible parameters, but not always the best-fitting ones. The practical way is trial and error: start from these parameters and explore more (p, d, q) combinations, choosing the one with the lowest BIC and AIC. We can also try some models with a seasonal component, in case we notice any seasonality. We then apply the parameters, observe the resulting coefficients, fine-tune, and try again.

For example, we might try (1,1,2) and the result is:

This does not look very good: the standard error of ma2 is 0.0854, which is about the size of the coefficient itself. We would expect something much lower.

We can try order 2 for the autoregressive component (2,1,1):

And again the same kind of (bad) result, this time for ar2.

In contrast, an ARIMA(1, 1, 1) model seems to make more sense here:

On some occasions two or three tries are enough; at other times a grid search (letting the computer try many different sets of parameters) is advised.
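A minimal version of such a grid search can be sketched as follows (assuming Python with NumPy; the AR(p) fits use plain least squares on lagged values and the formula AIC = n·ln(RSS/n) + 2k, a simplification of what packaged routines such as R’s `auto.arima` compute):

```python
import numpy as np

def ar_aic(y, p):
    """Least-squares AR(p) fit; returns AIC = n*ln(RSS/n) + 2*(p+1)."""
    y = np.asarray(y, dtype=float)
    target = y[p:]                                   # y_t
    lags = np.column_stack([y[p - k : len(y) - k]    # y_{t-1}, ..., y_{t-p}
                            for k in range(1, p + 1)])
    X = np.column_stack([np.ones(len(target)), lags])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    rss = np.sum((target - X @ coef) ** 2)
    n = len(target)
    return n * np.log(rss / n) + 2 * (p + 1)

# Simulate AR(2) data, then let AIC pick the order
rng = np.random.default_rng(3)
y = np.zeros(1000)
for t in range(2, 1000):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + rng.standard_normal()

scores = {p: ar_aic(y, p) for p in range(1, 6)}
best = min(scores, key=scores.get)
print(best)   # AIC should favor an order near the true value, 2
```

The same loop structure extends to (p, d, q) grids: fit each candidate, record its AIC and BIC, and keep the model with the lowest values.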

**Step 5: Make Predictions**

So we built all these models, we have decomposed the time series, but what we really want to do is forecast.

Once we have the final ARIMA model, we are ready to make predictions for future time points. We can also visualize the trends to cross-validate that the model works well.
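For intuition on where such forecasts and their widening intervals come from, here is a sketch (assuming Python with NumPy) of recursive forecasting from an already-fitted AR(1): the point forecast decays toward the mean, while the forecast variance accumulates one term per step ahead.

```python
import numpy as np

def forecast_ar1(last_value, phi, sigma, horizon, z=1.96):
    """h-step-ahead AR(1) forecasts with approximate 95% intervals.
    Point forecast: phi^h * last_value; variance: sigma^2 * sum of phi^(2i)."""
    point, lo, hi = [], [], []
    var = 0.0
    level = last_value
    for h in range(1, horizon + 1):
        level = phi * level                       # next point forecast
        var += sigma ** 2 * phi ** (2 * (h - 1))  # accumulated forecast variance
        half = z * np.sqrt(var)
        point.append(level)
        lo.append(level - half)
        hi.append(level + half)
    return np.array(point), np.array(lo), np.array(hi)

point, lo, hi = forecast_ar1(last_value=5.0, phi=0.8, sigma=1.0, horizon=12)
print(np.round(point, 2))    # decays toward the mean: 4.0, 3.2, 2.56, ...
print(np.round(hi - lo, 2))  # interval width grows with the horizon
```

This is the mechanism behind the widening 80% and 95% bands in the forecast output below; packaged forecast routines do the same bookkeeping for full ARIMA models.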

We apply the model and get our forecasts. In the example below the time series ends in December 2013 and a forecast for the year 2014 was requested:

Notice that, besides the forecast (under the forecast column), low and high confidence intervals at the 80% and 95% levels are also provided. As we move further into the future, the confidence intervals get wider.

We can also plot our forecast:

In black, we can see the historical time series and in blue the forecast. We notice that the shape looks pretty similar to these other years. In dark grey, the 80% confidence interval is plotted and in light grey the 95% confidence interval. Thus, we now have graphically not only the forecast for the next 12 months but also plots of the 80 and 95% confidence intervals.

Consequently, by using ARIMA we can create univariate time series forecasts and learn quite a bit, because we also get the confidence intervals back. We made a plot for 2014 that looks quite plausible, so if we were in December 2013, we would have some more information for going forward to 2014.

*This Blog is created and maintained by Iraklis Mardiris*
