Machine Learning – KNN using Tradesignal

I always thought that inspiration and experience are key factors in trading. But every time my chess computer beats me without any inspiration, just by brute force, I start to reconsider this assumption. This article is about a brute force approach to trading.

Rule based trading

I have never been a great believer in classical technical analysis. If you ask 2 analysts about the current trend in the market, you get at least 3 answers. So I turned to algorithmic trading, using the tools of technical analysis in a new way: writing if..then conditions, backtesting them, and refining the rules and parameters until the desired result was shown. These if..then conditions, like “if the market is above its 200-day line, then go long”, are mostly found by experience and inspiration. Isn’t my brain just a neural network which can be trained with historic data (experience), enhanced with a glass of wine for the inspiration?

Today I would like to take a voyage into machine learning. I would like to let my computer find the rules by itself, and just decide if I like the results or not. I’ll have the glass of wine with some friends and let the machine do the job. This sounds tempting to me, but can life really be that easy?

Supervised machine learning – kNN algorithm

The kNN algorithm is one of the simplest machine learning algorithms. “Learning” might be the wrong label; in reality it is more of a classification algorithm. But first let’s see how it works.

The scatter chart above is a visualization of a two-dimensional kNN data set. For this article I used a long-term and a short-term RSI. The dots represent the historic RSI values. Have a look at the fat circled point. It just means that today’s RSI1 has a value of 63 and RSI2 has a value of 70. Additionally, the dots are coloured. A green dot means the market moved up on the following day; a red dot shows a falling market on the day after.

kNN – k nearest neighbours

To make a prediction of tomorrow’s market move, the kNN algorithm looks at the historic data (shown on the scatter plot) and finds the k nearest neighbours of today’s RSI values. As you can see, our current fat point is surrounded by red dots. This means that every time the 2 RSI values have been in this area, the market fell on the day after. That’s why today’s data point is classified as red. Call it classification or prediction, the kNN algorithm just looks at what has happened in the past when the RSI indicators had a similar level. Have a look at this video; it is a great explanation of how the algorithm works.

kNN in Tradesignal

Computer kiddies would implement this algorithm in Python or R, but I would like to show you an implementation in Equilla, the Tradesignal programming language. It is way faster than Python, and has the advantage that I can directly see all the strengths and weaknesses on the chart. It is not just number crunching.

To implement the algorithm in Tradesignal we first have to build the scatter plot shown above, not graphically, but by storing the 2 RSI values and the next day’s market move (the colour of the dots) in an array.

In lines 8 and 9 the RSI values are calculated; lines 12 and 13 calculate the next day’s market move. Lines 15 to 20 then store the data in the training data array. This is done for the first half of the data set; for my example I will use the data from bar 50 to 1000 on my chart of 2000 data points.
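The array-filling step described above can be sketched in Python. This is not the original Equilla code, just a minimal illustration. The function name `build_training_set` and its parameters are hypothetical, the RSI series are assumed to be precomputed, and an unchanged close is assumed to count as a down day.

```python
def build_training_set(rsi_short, rsi_long, closes, first_bar=50, last_bar=1000):
    """Return rows of (rsi_short, rsi_long, label) for the training window.

    label is +1 (green dot) if the next bar closed higher, else -1 (red dot).
    Assumption: an unchanged close is treated as a down day.
    """
    training = []
    # The last usable bar still needs a following bar to derive its label.
    last = min(last_bar, len(closes) - 2)
    for bar in range(first_bar, last + 1):
        label = 1 if closes[bar + 1] > closes[bar] else -1
        training.append((rsi_short[bar], rsi_long[bar], label))
    return training
```

With the bar range from the text (50 to 1000) this yields 951 rows, each one a dot of the scatter plot together with its colour.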

The next task is to calculate the distances from today’s RSI point to all the historic points in the training data set.

Lines 23 to 27 calculate the Euclidean distance from today’s point to all historic points; line 29 then creates a sorted list of all these distances to find the k nearest historic data points in the training data set.
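The distance and sorting step can be sketched as follows. It is a plain Python illustration, not the Equilla listing. The name `k_nearest` and the row layout `(rsi1, rsi2, label)` are assumptions made for this example.

```python
import math


def k_nearest(training, today, k=5):
    """Return the k training rows closest to today's point as (distance, label) pairs.

    training: iterable of (rsi1, rsi2, label) rows
    today:    (rsi1, rsi2) of the current bar
    """
    distances = [
        (math.hypot(rsi1 - today[0], rsi2 - today[1]), label)
        for rsi1, rsi2, label in training
    ]
    # Sort by Euclidean distance, nearest first, and keep the first k rows.
    distances.sort(key=lambda pair: pair[0])
    return distances[:k]
```

Sorting the full list is the simplest approach and is cheap for a training set of about a thousand points.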

We are nearly done. The next step is just to find out what classification (colour) the nearest points have and use this information to create a prediction for tomorrow. This is done in lines 33 to 35.

Have a look at the scatter chart at the beginning. If this were the data stored in our training data set, the prediction, using the 5 nearest neighbours, would be -5. All 5 nearest neighbours of our current data point are red.
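The -5 in this example is simply the sum of the neighbours' labels, with red counted as -1 and green as +1. A tiny Python sketch of that vote (the function name `knn_prediction` is hypothetical):

```python
def knn_prediction(neighbour_labels):
    """Sum the labels (+1 green, -1 red) of the k nearest neighbours.

    The sign gives the predicted direction; the magnitude shows
    how unanimous the neighbours are.
    """
    return sum(neighbour_labels)


all_red = knn_prediction([-1, -1, -1, -1, -1])   # the scatter chart example
mixed = knn_prediction([1, 1, 1, -1, -1])        # three green, two red
```

With k = 5 the result ranges from -5 to +5, and an odd k guarantees that it is never zero.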

Now that we got a prediction for tomorrow, we just have to trade it:

kNN algorithm performance

Let’s see whether this simple machine learning algorithm works. Using 2000 days of backward-adjusted Brent data, I used a 14-day and a 28-day RSI to predict the next day’s move. The training was done on bars 50 to 1000, and I used the 5 nearest neighbours for the classification.

kNN algorithm – conclusion

Judging by the graph shown, it seems to work. It seems to be possible to use these 2 RSI indicators to predict tomorrow’s Brent move.

The kNN algorithm gives me a framework to test all kinds of indicators or even different data sets easily and see if they have any predictive value.

This is definitely an addition to classical algorithmic trading, which uses if..then conditions built from experience and intuition.

But you might still need some intuition to find the right data sets, indicators and parameters to get a useful prediction. Not everything can be done by machine learning…




Using Autocorrelation for phase detection

Autocorrelation is the correlation of the market with a delayed copy of itself. Usually calculated for a one day time-shift, it is a valuable indicator of the trendiness of the market.

If today is up and tomorrow is also up, this would constitute a positive autocorrelation. If tomorrow’s market move is always in the opposite direction to today’s, the autocorrelation would be negative.
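To make the definition concrete, here is a minimal Python sketch of a one-day-shift autocorrelation. It assumes the coefficient is the Pearson correlation between each daily close-to-close change and the change of the day before. The function name `lag1_autocorrelation` is made up for this example.

```python
import math


def lag1_autocorrelation(closes):
    """Pearson correlation between daily changes and the previous day's changes."""
    changes = [b - a for a, b in zip(closes, closes[1:])]
    x, y = changes[:-1], changes[1:]   # yesterday's move vs. today's move
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a series without variation as uncorrelated
    return cov / math.sqrt(var_x * var_y)
```

A market that flips direction every day gives -1, and a market whose daily moves keep growing in the same direction gives +1. For a rolling reading, the function would be applied to the most recent window of closes on each bar.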

Autocorrelation and trendiness of markets

If autocorrelation is high, it just means that yesterday’s market direction is basically today’s market direction. And if the market has the same direction every day, we can call it a trend. The opposite is true in a sideways market: without an existing trend, today’s direction will most probably not be tomorrow’s direction.

Autocorrelation in German Power

But best to have a look at a chart. It shows a backward adjusted daily time series of German Power.

The indicator shows the close-to-close autocorrelation coefficient, calculated over 250 days. You will notice that it is always fluctuating around the zero line, never reaching +1 or -1, but let’s see if we can design a profitable trading strategy even with this little bit of autocorrelation.

The direction of autocorrelation

Waiting for an autocorrelation of +1 would be useless. There will never be a perfect trend in real-world data. My working hypothesis is that a rising autocorrelation means that the market is getting trendy, so a rising autocorrelation would be the perfect environment for a trend following strategy. But first we have to define the direction of the autocorrelation:

To define the direction of the autocorrelation I am using my digital stochastic indicator, calculated over half of the period used for the autocorrelation. Digital stochastic has the big advantage that it is a quite smooth indicator without a lot of lag, which makes it easy to define its direction. The definition of a trending environment is then simply: the market is trending if digital stochastic is above yesterday’s value.

Putting autocorrelation phase detection to a test

The simplest trend following strategy I can think of is a moving average crossover strategy. It never works in reality, simply because markets are not trending all the time. But combined with the autocorrelation phase detection, it might have an edge.

Wooha! That’s pretty cool for such a simple strategy. It is trading (long/short) if the market is trending, but does nothing if the market is in a sideways phase. Exactly what I like when using a trend following strategy.
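The filtered strategy can be sketched in Python under some loud assumptions. The crossover is taken to be the close crossing a single simple moving average, and the smoothed autocorrelation (the digital stochastic in the text, whose formula is not given here) is passed in as a precomputed series. The name `filtered_positions` and the default length are hypothetical.

```python
def filtered_positions(closes, trend_indicator, ma_length=20):
    """Return +1 (long), -1 (short) or 0 (flat) for every bar.

    A position is only held while trend_indicator is above its previous
    value, which is the definition of a trending phase used in the text.
    """
    positions = []
    for t in range(len(closes)):
        if t < 1 or t < ma_length - 1:
            positions.append(0)          # not enough history yet
            continue
        average = sum(closes[t - ma_length + 1 : t + 1]) / ma_length
        trending = trend_indicator[t] > trend_indicator[t - 1]
        if not trending:
            positions.append(0)          # sideways phase: stay out
        elif closes[t] > average:
            positions.append(1)
        elif closes[t] < average:
            positions.append(-1)
        else:
            positions.append(0)
    return positions
```

Passing a constantly rising series as `trend_indicator` switches the filter off and gives the plain crossover strategy for comparison.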

Compare it with the original moving average crossover strategy, the one without the autocorrelation phase detection, and you will see the advantage of the autocorrelation phase filter immediately: the unfiltered equity line is way more volatile than the filtered one, and you get lots of drawdowns when the market is sideways.

Stability of parameters

German power has been a quite trendy market over the last years. That’s why even the unfiltered version of this simple trend following strategy shows a positive result. But let’s test the period of the moving average.

To do so, I calculated the return on account of both strategies, the unfiltered and the autocorrelation filtered, for moving average lengths from 3 to 75 days.

Return on account (ROA) relates the return to the maximum drawdown: ROA = 100 if your max drawdown is as big as your return.
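Read literally, this definition means ROA = 100 × total return / maximum drawdown. A minimal Python sketch under that assumption, taking the equity curve in currency units (the function name `return_on_account` is made up for this example):

```python
def return_on_account(equity):
    """ROA = 100 * total return / maximum drawdown of the equity curve."""
    peak = equity[0]
    max_drawdown = 0.0
    for value in equity:
        peak = max(peak, value)                      # highest equity seen so far
        max_drawdown = max(max_drawdown, peak - value)
    total_return = equity[-1] - equity[0]
    if max_drawdown == 0:
        # assumption: no drawdown at all is reported as infinity (or 0 without gain)
        return float("inf") if total_return > 0 else 0.0
    return 100.0 * total_return / max_drawdown
```

An equity curve that ends 10 up after a worst drawdown of 10 gives exactly 100. Anything above 100 means the return was bigger than the worst drawdown on the way.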

The left chart shows the autocorrelation filtered ROA, the right side the straightforward moving average crossover strategy. You don’t have to be a genius to see the advantage of the autocorrelation filter. Whatever length of moving average you select, you will get a positive result. This stability of parameters cannot be seen with the unfiltered strategy.

Autocorrelation conclusion:

Trend following strategies are easy to trade, but only make sense when the market is trending. As shown with the tests above, autocorrelation seems to be a nice way to find out if the market is in the right phase to apply a trend following strategy.


Measuring your EDGE in algorithmic trading

There are a lot of statistics which can be used to describe the returns of algorithmic trading strategies. Risk-reward ratio, profit factor, Sharpe ratio, standard deviation of returns… These are great statistics, but they miss an important factor: are your returns statistically significant or just a collection of lucky noise? The EDGE statistic might be the answer to this question.


Statistics in trading:

If the returns of your trading strategy are positive with in-sample and out-of-sample data this is a first sign that you are on the right path. The next step would be to have a look at the risk-reward ratio of your trading to get an impression if the strategy might be useful in a real world environment.

Assuming that your average yearly returns are about twice as big as the worst-case historic drawdown, you can be even more confident that your strategy is useful. But there is still one thing to check before you can be sure that you are not just seeing a curve fit bullshit strategy: the standard deviation of the daily returns vs. your average daily return.

Defining EDGE

Assume your strategy made 250$ over the last year. This averages to about 1$ per trading day. This 1$ is a good or bad return, depending on the standard deviation of your equity line. If the standard deviation of your equity is 2$, then the 1$ average return strategy would be a bad strategy, as your average returns are way too small relative to the volatility of your equity. If the volatility of your return curve were just 50ct and you still made 1$ per day on average, your strategy would be ingenious.

Edge is the ratio of your average returns vs the volatility of your equity line. To be on the safe side, your average return should be about 5% above the 90% confidence interval of your equity line volatility.
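The core of this ratio can be sketched in a few lines of Python. This only illustrates the idea of average daily return divided by the standard deviation of daily returns. It does not reproduce the percent readings or the confidence interval adjustment mentioned above, and the function name `edge_ratio` is made up for this example.

```python
from statistics import mean, pstdev


def edge_ratio(daily_pnl):
    """Average daily profit divided by the standard deviation of the daily profits."""
    volatility = pstdev(daily_pnl)
    if volatility == 0:
        # assumption: a perfectly smooth curve is reported as infinity (or 0 without gain)
        return float("inf") if mean(daily_pnl) > 0 else 0.0
    return mean(daily_pnl) / volatility


noisy = edge_ratio([3, -1, 3, -1])            # 1$ average, 2$ standard deviation
smooth = edge_ratio([1.5, 0.5, 1.5, 0.5])     # 1$ average, 50ct standard deviation
```

Both series average 1$ per day, but the smooth one scores four times higher, which mirrors the 2$ vs. 50ct example above.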

The left chart is a strategy trading a one-month RBOB time spread; the right chart shows the same strategy trading German power. RBOB has an edge of 3%, German power has an edge of 5%.

If I had to select which market to trade with this sample strategy, I surely would select German power over the RBOB time spread. Both curves have their ups and downs, but RBOB relies heavily on a lucky trade in September. This led to a high standard deviation of the equity line, giving you a low edge reading.


Observing the ratio between your average daily returns and the volatility of your equity curve can give you some valuable insights into the quality of your strategy. If it just caught a few lucky trades in history, it will also show a high volatility in returns. And this you most probably want to avoid when turning to algorithmic trading. It’s not just the absolute profit at the end of the year, it is also the path you took to get to this number. The smoother, the better!

[Equilla / Easy Language code for EDGE indicator]


Ranking: percent performance and volatility

When ranking a market, analysts usually pick the percent performance since a given date as their key figure. If a stock was at 100 last year and trades at 150 today, percent performance would show you a 50% gain (A). If another stock only gave a 30% gain (B), most people would draw the conclusion that stock A would have been the better investment. But does this reflect reality?

Percent Performance and Volatility

In reality, as a trader, I would never just buy and hold my position; I would always adjust my position size in relation to the risk in it. I like instruments that rise smoothly, not the roller coaster ones which will only ruin my nerves. So ranking a market solely by percent performance is useless for me.

Let’s continue with our example from above: if stock A, the one which made 50%, has had a 10% volatility, and stock B, the 30% gainer, only had a 5% volatility, I surely would like to see stock B on top of my ranking list, and not the high-vola, high-gain stock A.

Risking the same amount of money would have given me a bigger win with stock B.
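The arithmetic behind this claim is just gain divided by volatility. A small Python illustration with the numbers from the example (the function name is hypothetical):

```python
def gain_per_unit_of_volatility(gain_percent, volatility_percent):
    """How many units of volatility the stock has gained."""
    return gain_percent / volatility_percent


stock_a = gain_per_unit_of_volatility(50, 10)   # the 50% gainer with 10% volatility
stock_b = gain_per_unit_of_volatility(30, 5)    # the 30% gainer with 5% volatility
```

Stock A earned 5 units of its own volatility and stock B earned 6, so for the same amount of risk taken B comes out ahead.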

Combining Performance and Volatility

To get stock B up in my ranking list I will have to combine the absolute gain with the market volatility in between. This can be done quite simply. Just add up the daily changes of the stock, normalized by market volatility. Have a look at the formula of this new indicator:

index(today)=index(yesterday)+(price(today)-price(yesterday))/(1.95*stdev(price(yesterday)-price(2 days ago),21))

In plain English: today’s Vola Return Index equals yesterday’s Vola Return Index plus the daily gain normalized by volatility.

So if the index has been at 100, the volatility (as a 95% confidence interval over 21 days) is 1 and the stock made 2 points since yesterday, then today’s index would be 100 + 2/1 = 102.
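The formula translates almost directly into Python. The following is a sketch, not the Tradesignal code. It assumes the sample standard deviation (`statistics.stdev`), carries the index forward unchanged until enough history is available, and uses the made-up name `vola_return_index`.

```python
from statistics import stdev


def vola_return_index(prices, window=21, factor=1.95, start=0.0):
    """Cumulative sum of daily price changes, each divided by factor * stdev
    of the previous `window` daily changes (ending yesterday)."""
    index = [start]
    for t in range(1, len(prices)):
        first = t - window
        if first < 1:
            index.append(index[-1])      # not enough history for the volatility yet
            continue
        # daily changes up to and including yesterday's
        changes = [prices[i] - prices[i - 1] for i in range(first, t)]
        volatility = factor * stdev(changes)
        gain = prices[t] - prices[t - 1]
        if volatility > 0:
            index.append(index[-1] + gain / volatility)
        else:
            index.append(index[-1])      # assumption: skip days with zero volatility
    return index
```

Because each day's gain is measured in units of recent volatility, a 2-point move in a quiet market adds more to the index than the same move in a nervous one.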

Vola Return Index vs. Percent Return Index

Let’s have a look at a sample chart to compare the 2 ranking methods. For this I picked the J.P.Morgan stock.

The upper indicator shows you a percent gain index. It sums up the daily percent gains of the stock movement, basically giving you an impression of what you would have won if you had kept your invested money constant.

The indicator on the bottom is the Vola Return Index. It represents your wins if you had kept the risk invested in the stock constant (e.g. always invest 100$ on the 21-day 95% confidence interval of the daily returns).

Have a closer look at the differences between these two indicators up to October 2016. JPM is slightly up, and that’s why the percent change index is also in positive territory. During the same time the Vola Return Index just fluctuates around the zero line, as the volatility of JPM picked up during this period. To keep your invested risk constant over this period you would have downsized your position when JPM’s volatility picked up, usually during a drawdown. No good.

The same can be observed on the upper chart, showing the last months’ movements of the index. Right now, after the recent correction, the percent change index is, like the JPM stock, up again. On the other hand, the Vola Return Index is still down, due to the rising volatility in JPM.

Vola Return Index – Ranking

Let’s put this to a test and rank the 30 Dow Jones industrial stocks according to the percent return index, using my Vola Return Index as a comparison, both calculated since 01/01/2015.

The first three stocks are the same; they have the highest vola return and the highest percent return. But JPM and Visa would get a different sorting. Just see how low the JPM Vola Index is; it would not be the 4th best stock.

Percent return says JPM and Visa are about the same; only the Vola Return Index shows that Visa would have been the better investment vehicle compared to JPM. But see for yourself on the chart…


Make sure your indicators show what you actually can do on the market. There is no use in just showing the percent gains of a stock if you trade some kind of VAR adjusted trading style.

Keeping your risk under control is one of the most important things in trading, and using the Vola Return Index instead of just plotting the percent performance can give you some key insights and keep you away from bad investment vehicles.


[code for tradesignal users]