k-NN Candlestick Pattern Search Extensions: More Data

This is a follow-up to the Mining for Three Day Candlestick Patterns post. If you haven’t read the original post, do so now because I’m not going to repeat the basic mechanics of the strategy. While the approach was somewhat fruitful, it also had some obvious problems: it only seemed to work in bearish or high-volatility market regimes, and it couldn’t produce good short signals. The main idea I had to resolve these issues was simply to get more data.

equity curves with and without IBS

Original strategy using only SPY data. Note long stretches of flat results.

That is easier said than done. Could we use mutual funds or index values to extend the dataset backwards? No, because the daily high/low values are inaccurate. The only alternative we are left with is using data from other instruments. So I picked a broad selection of equity ETFs to include: EWY, EWD, EWC, EWQ, EWU, EWA, EWP, EWH, EWL, EFA, EPP, EWM, EWI, EWG, EWO, IWM, QQQ, EWS, EWT, and EWJ.

The selection was comprehensive and unoptimized. I think you could do some sort of walk-forward optimization that picks the best combination of securities to include in the data set. I’m not sure how much that would help.

The additional data worked fantastically well, resolving both problems. The number of opportunities to trade increased significantly, long signals work very nicely under all market conditions, and predicting negative returns works far better. There was also an unexpected benefit: far less time is needed before the forecasts become usable. In the original implementation I waited 2000 days before starting to use the forecasts. With the extended data set this can be cut to 500, thus letting the backtest cover a longer period.

Performance-wise there were no problems, as the Accord.NET k-d tree implementation that I use is very quick. Finding the nearest 75 points in a data set of approximately 100,000, in 11 dimensions, takes less than 2 milliseconds on my overclocked 2500K.

The settings used in the search are simple: the length of the patterns is 3 days, the 75 closest ones are used to construct a forecast by averaging their next-day returns, and distance is calculated as the sum of squared distances in every dimension. Trades are taken when the forecast is above/below a certain threshold. They are then passed through a filter which only allows long positions when IBS < 0.5 and short positions only when IBS > 0.5.
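As a rough illustration, the settings above can be sketched in Python. This is not the Accord.NET k-d tree implementation used for the results; it swaps in a brute-force NumPy search, which returns the same neighbours but more slowly. The function names, the `features` matrix (one row per stored 3-day pattern) and the default thresholds (5 bp long, -20 bp short, taken from the chart captions in this post) are assumptions made for the example, and the construction of the pattern features themselves is not shown.

```python
import numpy as np

def knn_forecast(features, next_day_returns, query, k=75):
    """Average next-day return of the k stored patterns closest to `query`."""
    features = np.asarray(features, dtype=float)
    next_day_returns = np.asarray(next_day_returns, dtype=float)
    query = np.asarray(query, dtype=float)
    # Distance: sum of squared differences across every dimension.
    dists = ((features - query) ** 2).sum(axis=1)
    k = min(k, len(dists))
    # argpartition finds the k smallest distances without a full sort.
    nearest = np.argpartition(dists, k - 1)[:k]
    return next_day_returns[nearest].mean()

def signal(forecast, ibs, long_min=0.0005, short_max=-0.0020):
    """Return +1 (long), -1 (short) or 0 (no trade).

    Longs need forecast > long_min and IBS < 0.5;
    shorts need forecast < short_max and IBS > 0.5.
    """
    if forecast > long_min and ibs < 0.5:
        return 1
    if forecast < short_max and ibs > 0.5:
        return -1
    return 0
```

A call such as `signal(knn_forecast(features, next_day_returns, todays_pattern), todays_ibs)` would then give the position to take at the close, where `todays_pattern` and `todays_ibs` are hypothetical inputs.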

It should be noted that using traditional measures of “fit” does not work very well with pattern matching. Adding the above instruments actually increases the RMSE, despite significantly increasing the trading performance of the forecasts.

A look at forecasts vs realized next-day returns:

PatternFinderMultiInput (x-axis) vs next-day returns (y-axis), when IBS < 0.5

PatternFinderMultiInput (x-axis) vs next-day returns (y-axis), for IBS < 0.5 and forecast > 0

An important aspect to note is that even marginally positive forecasts work very well. For example, with the extended dataset, forecasts between 5 and 10 basis points resulted in an average 21 bp return the next day. On the other hand, using SPY data only, the return for those forecasts was just 5 basis points. What this means is that there are many more trades to take, which is what allows the strategy to do well in all market environments. Here’s the long-only equity curve:

Long position taken when IBS < 0.5 and forecast > 5 basis points. $0.005 per share in commissions.

 

A couple of charts to analyze the sensitivity of the long-only strategy’s results to changes in inputs (IBS limit and minimum forecast limit):

sensitivity analysis

The additional data also has the benefit of making shorting possible. The equity curve doesn’t look as good, but it’s still a giant improvement over zero predictive ability on the short side:

multi input short only

Short position taken when IBS > 0.5 and forecast < -20 basis points. $0.005 per share in commissions.

 

Finally, the long and short strategies combined, along with the stats:

multi input long short

Long and short strategies above combined. $0.005 per share in commissions.

stats

 

The concept also seems to work for stocks. For example, I tested a long-only strategy on AAPL, using the same settings as above, both with and without the addition of MSFT data. The Microsoft data improved every aspect of the results, with surprisingly consistent performance over nearly 20 years:

AAPLMSFT

It would be interesting to try to apply this on a more massive scale, by increasing the data set to something like all S&P 500 stocks. Some technical restrictions prevent me from doing that right now, but I’ll come back to the idea in the future.

Closing Price in Relation to the Day’s Range, and Equity Index Mean Reversion

UPDATE: read The IBS Effect: Mean Reversion in Equity ETFs instead of this post, it features more recent data and deeper analysis.

The location of the closing price within the day’s range is a surprisingly powerful predictor of next-day returns for equity indices. The closing price in relation to the day’s range (or CRTDR [UPDATE: as reader Jan mentioned in the comments, there is already a name for this: Internal Bar Strength or IBS] if you’re a fan of unpronounceable acronyms) is simply calculated as such:

CRTDR formula

It takes values between 0 and 1 and simply indicates at which point along the day’s range the closing price is located. In this post I will take a look not only at returns forecasting, but also how to use this value in conjunction with other indicators. You may be skeptical about the value of something so extremely simplistic, but I think you’ll be pleasantly surprised.
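For reference, the calculation is the close minus the low, divided by the high minus the low. A minimal Python sketch follows; the function name and the handling of a zero-range day (returning a neutral 0.5) are assumptions for the example, not something specified in this post.

```python
def ibs(high, low, close):
    """Position of the close within the day's range, from 0 (at the low) to 1 (at the high)."""
    day_range = high - low
    if day_range == 0:
        # Assumption: a day with no range is treated as neutral.
        return 0.5
    return (close - low) / day_range
```

For example, a day with a high of 10, a low of 8 and a close of 9.5 gives 0.75.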

The basics: QQQ and SPY

First, a quick look at QQQ and SPY next-day returns depending on today’s CRTDR:

SPY CRTDR

QQQ CRTDR

A very promising start. Now the equity curves for each quartile:

spy quartile ECs

QQQ quartile ECs

That’s quite good; consistency through time and across assets is important and we’ve got both in this case. The magnitude of the out-performance of the bottom quartile is very large; I think we can do something useful with it.

There are several potential improvements to this basic approach: using the range of several days instead of only the last one, adjusting for the day’s close-to-close return, and averaging over several days are a few of the more obvious routes to explore. However, for the purposes of this post I will simply continue to use the simplest version.

CRTDR Internationally

A quick look across a larger array of assets, which is always an important test (here I also incorporate a bit of shorting):

All ETF results

Long when CRTDR < 45%, short when CRTDR > 95%. $10k per trade. Including commissions of $0.005 per share, excluding dividends.

One question that comes up when looking at ETFs of foreign indices is about the effect of non-overlapping trading hours. Would we be better off using the ETF trading hours or the local trading hours to determine the range and our predictions? Let’s take a look at the EWU ETF (iShares MSCI United Kingdom Index Fund) vs the FTSE 100 index, with the following strategy:

  • Go long on close if CRTDR < 45%
  • Go short on close if CRTDR > 95%
FTSE vs EWU

FTSE vs EWU CRTDR strategy, 1996-2012. $1m per trade (the number was a technical necessity due to the price of the FTSE 100 index).
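The two rules above can be sketched as a vectorized backtest in Python. This is only an illustration under simplifying assumptions: positions are entered at the close and held for exactly one day, commissions and dividends are ignored, zero-range days get a neutral CRTDR of 0.5, and the function name is made up for the example.

```python
import numpy as np

def crtdr_strategy_returns(high, low, close, long_below=0.45, short_above=0.95):
    """Next-day strategy returns: long if CRTDR < long_below, short if CRTDR > short_above."""
    high, low, close = (np.asarray(a, dtype=float) for a in (high, low, close))
    day_range = high - low
    # Avoid dividing by zero; zero-range days are set to 0.5 (assumption).
    safe_range = np.where(day_range > 0, day_range, 1.0)
    crtdr = np.where(day_range > 0, (close - low) / safe_range, 0.5)
    position = np.where(crtdr < long_below, 1, np.where(crtdr > short_above, -1, 0))
    # Close-to-close return earned on the day after the signal.
    next_ret = close[1:] / close[:-1] - 1.0
    return position[:-1] * next_ret
```

The result has one element fewer than the inputs, because the last day's signal has no next-day return yet.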

Fascinating! This result left me completely stumped. I would love to hear your ideas about this…I have a feeling that there must be some sort of explanation, but I’m afraid I can’t come up with anything realistic.

Trading Signal or Filter?

It should be noted that I don’t actually use the CRTDR as a signal to take trades at all. Given the above results you may find this surprising, but all the positive returns are already captured by other, similar (and better), indicators (especially short-term price-based indicators such as RSI(3)). Instead I use it in reverse: as a filter to exclude potential trades. To demonstrate, let’s have a look at a very simplistic mean reversion system:

  • Buy QQQ at close when RSI(3) < 10
  • Sell QQQ at close when RSI(3) > 50

On average, this will result in a daily return of 0.212%. So we have two approaches in our hands that both have positive expectancy. What happens if we combine them?

  • Go long either on the RSI(3) criteria above OR CRTDR < 50%
QQQ RSI and RSI with CRTDR

RSI(3) and RSI(3) w/ CRTDR strategy applied to QQQ. Commissions not included.

This is a bit surprising: putting together two systems, both of which have positive expectancy, results in significantly lower returns. At this point some may say “there’s no value to be gained here”. But fear not, there are significant returns to be wrung out of the CRTDR! Instead of using it as a signal, what if we use it in reverse as a filter? Let’s investigate further: what happens if we split these days up by CRTDR?

RSI signal returns by CRTDR

Now that’s quite interesting. Combining them has very bad results, but instead we have an excellent method to filter out bad RSI(3) trades. Let’s have a closer look at the interplay between RSI(3) signals and CRTDR:

RSI CRTDR square

Next-day QQQ returns.

And now the equity curves with and without the CRTDR < 50% filter:

QQQ RSI and RSI with CRTDR filter

RSI(3) and RSI(3) w/ CRTDR < 50% filter applied to QQQ. Commissions not included.

That’s pretty good: consistent performance, and out-performance relative to the vanilla RSI(3) strategy. On top of that, we have filtered out over 35% of trades, which not only means far less money spent on commissions, but also frees up capital for other trades.

UPDATE: I neglected to mention that I use Cutler’s RSI and not the “normal” one, the difference being the use of simple moving averages instead of exponential moving averages. I have also uploaded an Excel sheet and MultiCharts .NET signal code that replicate most of the results in the post.
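For readers without Excel or MultiCharts, here is a hedged Python sketch of Cutler’s RSI and of the filtered entry rule. It is not the uploaded code: the function names are made up for the example, and returning 50 when there is no price movement at all in the window is an assumption.

```python
import numpy as np

def cutler_rsi(close, period=3):
    """Cutler's RSI: simple averages of gains and losses over `period` days."""
    close = np.asarray(close, dtype=float)
    change = np.diff(close)
    gains = np.clip(change, 0.0, None)
    losses = np.clip(-change, 0.0, None)
    rsi = np.full(len(close), np.nan)  # undefined until `period` changes exist
    for i in range(period, len(close)):
        avg_gain = gains[i - period:i].mean()
        avg_loss = losses[i - period:i].mean()
        total = avg_gain + avg_loss
        # Assumption: a completely flat window is treated as neutral (50).
        rsi[i] = 50.0 if total == 0 else 100.0 * avg_gain / total
    return rsi

def filtered_entry(rsi_value, crtdr, rsi_max=10.0, crtdr_max=0.5):
    """Enter long only when RSI(3) < 10 and the CRTDR filter also passes."""
    return bool(rsi_value < rsi_max and crtdr < crtdr_max)
```

The expression `100 * avg_gain / (avg_gain + avg_loss)` is algebraically the same as the usual `100 - 100 / (1 + RS)` form, and avoids a division by zero when there are no losses.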

The Predictive Value of the Number and Magnitude of Recent Up/Down Days: UDIDSRI

Rummaging through my bottomless “TO DO” list, I found this little comment:

# of up/dn days in period, then re-scale that using percentrank….with net % move?

An interesting way to spend Sunday afternoon, and another blog post!

After playing around with the concept for a while, I ended up creating a simple indicator that, as it turns out, is impressively good at picking out bottoms [1], and has very strong 1-day and acceptable medium-term predictive power. In an attempt to come up with the most awkward-sounding acronym possible, I decided on the name “Up/Down and Intensity Day Summation Rank Indicator”, or UDIDSRI. Here’s what I did:

The first iteration

I started out with a very simple concept:

  • If the day closes up, movement = 1, otherwise movement = -1.
  • Sum movement over the last 20 days.
  • UDIDSRI is the % rank of today’s sum, compared to the last 50 days of sums.

The case that presents the most interest is when UDIDSRI is equal to zero (i.e. the lowest value in 50 days), and we’ll have a look at how this works further down. I felt that this indicator could be significantly improved by adding a bit of nuance. Not all down days are equal, so I thought it would be a good idea to take into account the magnitude of the moves as well as their direction.

The second iteration

The second version of the algorithm:

  • If the day closes up, movement = 1, otherwise movement = -1.
  • Multiply movement by (1 + abs(return))^5.
  • Sum the movements for the last 20 days.
  • UDIDSRI is the % rank of today’s sum, compared to the last 50 days of sums.

The choice of the 5th power is completely arbitrary and un-optimized (as are the 20-day summation, and 50-day ranking) and can probably be optimized for better performance.
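The second iteration can be sketched in Python as follows. This is an illustration rather than the uploaded Excel or MultiCharts code, and it rests on a few assumptions: returns are close-to-close and expressed as decimal fractions (1% = 0.01), an unchanged close counts as a down day, and the % rank is the share of the other 49 sums in the window that are strictly below today’s sum, so that 0 means today’s sum is the lowest in 50 days.

```python
import numpy as np

def udidsri(close, sum_window=20, rank_window=50, power=5):
    """Up/Down and Intensity Day Summation Rank Indicator (second iteration)."""
    close = np.asarray(close, dtype=float)
    n = len(close)
    ret = close[1:] / close[:-1] - 1.0
    # +1 for an up day, -1 otherwise, scaled by (1 + |return|)^power.
    movement = np.where(ret > 0, 1.0, -1.0) * (1.0 + np.abs(ret)) ** power
    sums = np.full(n, np.nan)
    for i in range(sum_window, n):
        # movement[i - 1] is the move that ends at close[i].
        sums[i] = movement[i - sum_window:i].sum()
    out = np.full(n, np.nan)
    for i in range(sum_window + rank_window - 1, n):
        window = sums[i - rank_window + 1:i + 1]
        # Fraction of the earlier sums in the window that are below today's sum.
        out[i] = (window[:-1] < window[-1]).sum() / (rank_window - 1)
    return out
```

With the default settings the first 69 values are undefined (NaN), since 20 days of moves and then 50 sums are needed before the first rank can be computed. Setting `power=0` reduces this to the first iteration.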

Here’s a chart comparing the two versions on the last few months of SPY (yellow is 1st iteration, red is 2nd):

UDIDSRI comparison chart

You can clearly see that the 2nd iteration doesn’t like to stay at 0 for so long, and tends to respond a bit faster to movements. As such, the 2nd iteration gives far fewer signals, but they’re of much higher quality. I’ll be using the 2nd version for the rest of this post.

UDIDSRI iteration comparison

Note that this approach is completely useless for going short. The indicator hitting its maximum value provides no edge either for mean reversion or trend following.

A quick test around the globe to make sure I wasn’t curve fitting:

all country ETF statistics

That turned out rather well…

Thus far we have only looked at the short-term performance of UDIDSRI. Let’s see how it does over the medium term after hitting 0:

medium term UDIDSRI performance

There seems to be a period of about 30 trading days after UDIDSRI hits 0, during which you can expect above-average returns. Let’s take a look at a strategy that crudely exploits this:

  • Buy SPY at close if UDIDSRI = 0.
  • Use a 2% trailing stop (close-to-close) to exit.
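A minimal Python sketch of this entry and exit logic is shown below. The interpretation of the stop is an assumption: the position is closed at the first close that is 2% or more below the highest close seen since entry. The function name and the boolean `entry_signal` input (true on days where UDIDSRI = 0) are also made up for the example.

```python
def trailing_stop_trades(close, entry_signal, stop=0.02):
    """Return a list of (entry_index, exit_index) pairs; exit_index is None if still open."""
    trades = []
    in_trade = False
    entry = None
    peak = None
    for i, price in enumerate(close):
        if in_trade:
            # Track the highest close since entry.
            peak = max(peak, price)
            if price <= peak * (1.0 - stop):
                trades.append((entry, i))
                in_trade = False
        elif entry_signal[i]:
            in_trade, entry, peak = True, i, price
    if in_trade:
        trades.append((entry, None))
    return trades
```

Signals that occur while a position is already open are ignored, and no new position is opened on the same day a stop is hit.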

The trailing stop makes us exit quickly if we haven’t entered at a bottom, but stays with the trend if the entry is good. Here are the stats and equity curve for this strategy applied to SPY, using $100k per trade, without commissions or slippage:

medium term UDIDSRI SPY equity curve

medium term UDIDSRI SPY statistics

Finally here are some trades this strategy would have taken in 2011 and early 2012:

medium term UDIDSRI SPY statistics

The most significant problem with the tight trailing stop is that it exits at pullbacks instead of tops (which is particularly painful during heavy bear markets), so one easy improvement would be to add an indicator for over-extension and use that to time the exit. But I’ll leave that as homework.

All in all I’m quite satisfied with the UDIDSRI. I was really surprised at how it manages to pick bottoms with such accuracy, and I will definitely add it to the repertoire of signals that I use for swing trading.

If you want to play with the UDIDSRI yourself, I have uploaded an Excel worksheet as well as the indicator and signal for MultiCharts .NET.

Footnotes
  1. For SPY, UDIDSRI gave signals on both the 2002 and 2009 lowest days.

Tuesday’s QQQ Move and Equity Index Mean Reversion Worldwide

Tuesday saw QQQ drop somewhat heavily, and for the third day in a row.  These three drops took us back to mid-August levels, blowing through a ton of support levels… My mean reversion senses are tingling!

QQQ 2012

So, let’s take a look at what happens in these situations by formulating a simple rule to capture them, that will then exit after the mean reversion has (hopefully) happened:

  • If QQQ is down at least 3 days in a row, and it closes below the 10-day intraday low, go long.
  • Close the position one day after QQQ closes above its 5-day SMA.

A simple rule, designed to capture big drops in the hope of a bounce. The usual issues associated with catching falling knives come up.
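For concreteness, the rule can be sketched in Python as below. Several details are assumptions made for the example: "down" means a lower close than the previous day, the 10-day intraday low is the lowest low of the previous 10 days (excluding today), the 5-day SMA includes today's close, and no new entry is considered on the day a position is closed. The function name is made up.

```python
import numpy as np

def mean_reversion_positions(low, close, down_days=3, low_window=10, sma_window=5):
    """held[i] is True if a long position is open after the close of day i."""
    low = np.asarray(low, dtype=float)
    close = np.asarray(close, dtype=float)
    n = len(close)
    held = np.zeros(n, dtype=bool)
    in_trade = False
    exit_pending = False
    start = max(down_days, low_window, sma_window - 1)
    for i in range(start, n):
        if in_trade:
            if exit_pending:
                # One day after the close above the SMA: close the position.
                in_trade = exit_pending = False
            elif close[i] > close[i - sma_window + 1:i + 1].mean():
                exit_pending = True
        else:
            down_streak = np.all(np.diff(close[i - down_days:i + 1]) < 0)
            below_low = close[i] < low[i - low_window:i].min()
            if down_streak and below_low:
                in_trade = True
        held[i] = in_trade
    return held
```

Multiplying `held[:-1]` by the next day's close-to-close returns would give a rough, commission-free version of the equity curve.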

Sometimes it works great…

good trade

April 2012

And others not so much…

giant loss

September 2001

But over time there is remarkable consistency:

Equity curve

Run-up & drawdown, assuming $100k per trade, including $0.005 per contract in commissions

Here are the stats:

stats

Note: the dot-com bubble is not included in these stats, because including it would make them look better than they really are over the long run.

Downside skew is never nice, of course. Is there something we can do to soften the biggest losses? As is the case with most swing systems, adding a stop loss is generally a bad idea. Indeed, adding a simple 3% stop loss would decrease returns, lead to a much more uneven equity curve, and also result in deeper and longer drawdowns. Getting fancier and adding various special rules to the stop (such as a period after getting stopped out during which trading is not allowed) does not significantly improve the results. The solution here is simply proper position sizing so that you can take the losses you have to take and still be comfortable.

Finally, is this just an accidental feature of the NASDAQ 100, or could we use it in other markets as well? Let’s have a look at a broad array of equity index ETFs:

country stats

Trades start at each ETF’s inception; dividends are included, commissions are not.

Well, there you have it…

Disclosure: Net long U.S., U.K., Singaporean equities.