Time Series Analysis Part 3 (Machine Learning)

In the previous section we saw how time-series modeling can be done with the help of ARIMA.

In the last few years, there have been attempts at a fresh approach to statistical time-series forecasting, using the increasingly accessible tools of machine learning: methods like neural networks and extreme gradient boosting, as supplements to, or even replacements for, more traditional tools like autoregressive integrated moving average (ARIMA) models.

An alternative to ARIMA is the use of generic machine learning algorithms. The advantage of this approach is the ability to use multiple input features, thus creating more complex models. Our intuition says that a more complex model should produce better results (although in reality this is not always the case). The best results are usually provided by Gradient Boosting Machines (GBM) and Random Forests, but other algorithms can also be tried.
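As a minimal sketch of this idea, the snippet below fits scikit-learn's gradient boosting regressor to a synthetic series using lagged values as features. The series, number of lags, and split point are all illustrative assumptions, not a recommendation.

```python
# Sketch: a generic ML regressor (gradient boosting) applied to a time series.
# The synthetic series, lag count, and split point are illustrative assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(42)
# Synthetic series: linear trend plus noise
y = np.arange(200, dtype=float) * 0.5 + rng.normal(0, 1, 200)

# Build lagged features: predict y[t] from y[t-3], y[t-2], y[t-1]
n_lags = 3
X = np.column_stack([y[i:len(y) - n_lags + i] for i in range(n_lags)])
target = y[n_lags:]

# Chronological split: older data for training, newest data for testing
split = 150
model = GradientBoostingRegressor(random_state=0)
model.fit(X[:split], target[:split])
preds = model.predict(X[split:])
print(preds.shape)  # one prediction per held-out (most recent) point
```

Note that the model itself is completely generic; everything time-series-specific lives in how the features and the split were constructed, which is exactly the point of the rest of this section.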

The disadvantage is that these are general-purpose algorithms, and time-series data is a different beast. Hence, creating the model differs somewhat from the usual ML workflow, and requires more time and attention.

Whenever we work with time-series data, we should remember that the observations are consecutive in time and should not be randomly sampled.

When dealing with time-series data, we need to keep thinking about what data was available at which point in time, and remember that the test data represents unseen data.

We should be careful when selecting our training, validation, and test data. Take a moment (or two) to ponder the differences for time-series data. Generally, with time-series data we want to train on the old data and test on the new data, whereas in ordinary machine learning we would have used random sampling or fine-grained stratification (e.g., every fifth observation as test data). The same thought process applies to validation data: by keeping training data before validation data before test data, we have the best chance of staying honest.
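The chronological split described above can be sketched in a few lines of pandas. The 70/15/15 proportions and the toy daily series are illustrative assumptions.

```python
# Sketch of a chronological train/validation/test split for time-series data.
# The 70/15/15 proportions are an illustrative assumption.
import pandas as pd

# Toy daily series
ts = pd.Series(range(100), index=pd.date_range("2020-01-01", periods=100))

n = len(ts)
train = ts.iloc[: int(n * 0.70)]              # oldest 70%
val = ts.iloc[int(n * 0.70): int(n * 0.85)]   # next 15%
test = ts.iloc[int(n * 0.85):]                # most recent 15%

# Training data strictly precedes validation, which precedes test
assert train.index.max() < val.index.min() < test.index.min()
print(len(train), len(val), len(test))  # 70 15 15
```

The final assertion is the whole point: at no stage does the model get to peek at data from its own future.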

Then, some feature engineering is required during data preparation, to model the relationship from one time point to the next. For example, today's stock price affects tomorrow's price, but the price from, say, five years ago is (almost) irrelevant to tomorrow. Or, in the power demand example, today influences tomorrow's power demand; three months ago, not so much; but the same day last year does affect it, due to periodicity! Usually, when learning from a time series, we generate additional features to represent these relationships, for instance by adding moving-average columns.
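A minimal sketch of such feature generation with pandas: the column names, lag, and window size are illustrative assumptions.

```python
# Sketch: generating lag and moving-average features from a price column.
# Column names, lag, and window size are illustrative assumptions.
import pandas as pd

df = pd.DataFrame({"price": [10.0, 11.0, 12.0, 11.5, 13.0, 12.5, 14.0]})

df["lag_1"] = df["price"].shift(1)                  # yesterday's price
df["sma_3"] = df["price"].rolling(window=3).mean()  # 3-step moving average

# The first rows contain NaN because no history exists for them yet;
# they are usually dropped before training.
features = df.dropna()
print(features)
```

Longer lags, seasonal lags (e.g., the same day last year), and wider windows are built the same way, with `shift` and `rolling`.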

In the world of finance, and in particular the area known as technical trading, moving averages are everywhere. The SMA (simple moving average) is the most commonly used, but the EMA (exponential moving average, where the full data history is used but the most recent values carry the most weight) is also popular, and there is a whole host of other weighted and smoothed moving averages.

As a rule of thumb: use an SMA unless you can come up with a good reason not to.
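The difference between the two is easy to see in pandas; the window and span values below are illustrative assumptions.

```python
# Sketch comparing SMA and EMA on a series ending in a spike.
# The window/span values are illustrative assumptions.
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 100.0])

sma = s.rolling(window=3).mean()          # equal weight over the last 3 values
ema = s.ewm(span=3, adjust=False).mean()  # full history, recent values weighted most

# After the spike, the EMA reacts faster than the SMA
print(sma.iloc[-1], ema.iloc[-1])
```

Because the EMA weights recent observations most heavily, it tracks the spike more quickly, which is exactly why traders reach for it when responsiveness matters.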

There is also the concept of the moving standard deviation. A current value can then be measured in the number of standard deviations it is above or below the moving average over some time span.
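This is essentially a rolling z-score, sketched below; the window size and data are illustrative assumptions.

```python
# Sketch: measuring a value in standard deviations from its moving average
# (a rolling z-score). Window size and data are illustrative assumptions.
import pandas as pd

s = pd.Series([10.0, 10.5, 9.5, 10.0, 10.5, 9.5, 10.0, 15.0])

window = 5
ma = s.rolling(window).mean()
mstd = s.rolling(window).std()

# z-score of each value relative to its own trailing window
z = (s - ma) / mstd
print(z.iloc[-1])  # the final spike sits well above its moving average
```

A large positive or negative z-score flags a value that is unusual relative to its own recent history, which is often more informative than its raw level.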

Another concept used a lot in finance is the crossover. One type is when a moving average crosses the current price from above or below; another is when moving averages of different periods cross (e.g., a 20-day moving average crosses a 100-day moving average). In both cases, the crossover suggests to a trader that a trend has ended and the price direction may reverse.
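Detecting the second type of crossover programmatically is a one-liner once both averages exist. The window lengths and toy price series below are illustrative assumptions.

```python
# Sketch: detecting where a short-period moving average crosses below
# a long-period one. Windows and data are illustrative assumptions.
import pandas as pd

prices = pd.Series([1.0, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1, 1, 1])

fast = prices.rolling(window=3).mean()  # short-period average
slow = prices.rolling(window=6).mean()  # long-period average

# Cross from above to below: fast is under slow now, but was not one step ago
crossed_down = (fast < slow) & (fast.shift(1) >= slow.shift(1))
print(crossed_down[crossed_down].index.tolist())  # → [8]
```

Such a boolean crossover column can be fed directly to a learning model as another engineered feature.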

There is no golden rule for how to do the data preparation; it depends on the use case.

We could think of more complicated transformations of our data to try to imitate what the time-series models do, but then, why try to reinvent the wheel?

This is an area of intensive research, so perhaps in the future a better methodology will be available. Currently, though, it is as much art as engineering or science, and our favorite approach is trial and error.

Ideally, our machine learning model would work these concepts out for itself. But such an ideal world often requires near-infinite data samples, CPU, and memory. So, when dealing with time-series data, keep these concepts in your toolbox; think of them as hints you can give your learning models.

This Blog is created and maintained by Iraklis Mardiris

