Equity Returns Following Extreme VIX and WVF Movements, Part 1

Can extreme changes in implied volatility help predict future returns? And can we use a VIX surrogate as a substitute? First, let’s take a look at the WVF and its relationship to the VIX.

The Williams’ VIX Fix (WVF) is an indicator meant to roughly approximate the VIX. It can be useful in situations where there is no implied volatility index for the instrument we want to trade. The WVF is simply a measure of the distance between today’s close and the 22-day highest close; it is calculated as follows:

WVF Formula
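A minimal Python sketch of the calculation described above, measuring the distance from today's close as the prose does. The function name and the optional `lows` argument are illustrative assumptions; the commonly circulated version of the Williams VIX Fix uses the day's low in place of the close, which the optional argument allows for.

```python
def williams_vix_fix(closes, lookback=22, lows=None):
    """Percentage distance below the highest close of the last `lookback` days.

    With lows=None the distance is measured from today's close (as in the
    prose above); pass a list of daily lows to measure from the low instead.
    Entries are None until `lookback` closes are available.
    """
    reference = closes if lows is None else lows
    wvf = []
    for i in range(len(closes)):
        if i + 1 < lookback:
            wvf.append(None)
            continue
        highest_close = max(closes[i + 1 - lookback : i + 1])
        wvf.append((highest_close - reference[i]) / highest_close * 100.0)
    return wvf
```

A value of 0 means today's close is itself the highest close of the window, which is why the indicator flat-lines during calm, rising markets.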

A quick visual comparison between the VIX and WVF:

The WVF and VIX behave similarly during volatility spikes, but the WVF fails to emulate the VIX when the VIX hovers at relatively low values. The correlation coefficient between VIX and WVF returns[1] is 0.62, while regressing VIX returns on WVF returns using OLS results in an R² statistic of 0.38.

We’re not going to be using the level of the VIX and WVF (hardcoding strategies to specific levels of the VIX is generally a terrible idea), so the above chart is somewhat useless for our purposes; we’re going to be looking at the 100-day percentile rank of the daily change. Here is a comparison over a couple of recent months:

Sometimes they move in lockstep; at other times there seems to be almost no relation between them. Still, for such a simple indicator, I would say that the WVF does a fantastic job of keeping up with the VIX.

As you probably know, (implied) volatility is highly mean reverting. Extreme increases in the VIX tend to be followed by decreases. These implied volatility drops also tend to be associated with positive returns for equities. Let’s take a look at a simple strategy to illustrate the point:

  • Buy SPY on close if the VIX percentage change today is the highest in 100 days.
  • Sell on the next close.
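The two rules above can be sketched in plain Python as follows. The helper names are hypothetical, and the reading of "highest in 100 days" as "strictly greater than each of the previous 99 daily changes" is an assumption rather than something the post specifies.

```python
def pct_changes(values):
    """Simple one-day percentage changes; the first entry is None."""
    out = [None]
    for prev, cur in zip(values, values[1:]):
        out.append(cur / prev - 1.0)
    return out


def highest_in_window(series, window=100):
    """True on days where the value is the strict maximum of the last `window` values."""
    flags = []
    for i, value in enumerate(series):
        past = series[max(0, i + 1 - window) : i]
        if i + 1 < window or value is None or any(p is None for p in past):
            flags.append(False)
        else:
            flags.append(all(value > p for p in past))
    return flags


def next_day_returns(spy_closes, signals):
    """Close-to-close SPY return for each day following a signal."""
    return [
        spy_closes[i + 1] / spy_closes[i] - 1.0
        for i in range(len(spy_closes) - 1)
        if signals[i]
    ]
```

Passing the WVF series instead of the VIX to `pct_changes` gives the WVF variant of the same test.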

Here’s the equity curve and stats:

Nothing spectacular, but quite respectable. Somewhat inconsistent at times of low volatility, but over the long term it seems to be reliable. What about the same approach, but using the WVF instead?

The WVF outperforms the VIX! A somewhat surprising result…the equity curves look similar of course, with long periods of stagnation during low volatility times. Over the long term the stats are quite good, but we might be able to do better…

There is surprisingly little overlap between the VIX and WVF approaches. There are 96 signals from VIX movements, and 109 signals from WVF movements; in 48 instances both are triggered. These 48 instances however are particularly interesting. Here’s a quick breakdown of results depending on which signal has been triggered:

performance VIX WVF Both

Now this is remarkable. Despite performing respectably on its own, the VIX signal is completely useless when isolated (that is, on days when the WVF signal is not triggered as well). This is actually a very useful finding and extends to other similar situations: extreme volatility alone is not enough for an edge, but if used in combination with price-based signals, it can provide significant returns. I leave further combinations on this theme as an exercise for the reader.

A look at the equity curve of “both”:

both equity curve

Long SPY when VIX % change and WVF change are both the highest of the last 100 days, $100k per trade, 1993-2012, no commissions or dividends.

Now that’s just beautiful. You may say “but 37% over 20 years isn’t very impressive at all!”. And you’re right, it isn’t. But for a system that spends almost 99% of the time in cash, it’s fantastic. Want more trade opportunities? Let’s see what happens if we relax the limits on “extremeness”, from the 99th percentile through to the 75th:

extremeness tests

Net profit increases, but profitability per trade, and most importantly risk-adjusted returns suffer. The maximum drawdown increases at a much faster rate than net profits if we relax the limits. Still, there could be value in using even the 50th percentile not as a signal in itself, but (like the day of the month effects) as a slight long bias.

Finally, what if we vary the VIX and WVF limits independently of each other? Let’s have a cursory look at some charts:

As expected, the profit factor is highest at (0.99, 0.99), while net profits are highest at the opposite corner of (0.75, 0.75). It’s interesting to note however, that drawdown-adjusted returns are roughly the same both along the (0.99, 0.75-0.99) and (0.75-0.99, 0.99) areas; as long as one of the two is at the highest extremes, you can vary the other with little consequence in terms of risk-adjusted returns, while increasing net profits. This is definitely an area deserving of further analysis, but that’s for another post.

That’s it for now; I hope some of these ideas can be useful for you. In part 2 we’ll take a look at how the above concepts can be applied to international markets, where there is no direct relation to the VIX and there are no local implied volatility indices to use.

Footnotes
  1. In order to calculate returns for WVF I re-scaled it so the minimum value is 1 instead of 0, thus eliminating the problem of infinite/undefined results.

S&P 500 Returns Following New Lows (and Highs)

Today the S&P 500 closed at a 20-day low. Is there anything useful we can do with this piece of information? Let’s take a look at the performance of SPY after it closes at a 20-day low:
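For readers who want to reproduce this kind of study, here is a rough sketch of the two building blocks: flagging closes at an N-day low and averaging the returns that follow. Function names and the convention that today's close counts as part of the look-back window are assumptions.

```python
def is_n_day_low(closes, lookback=20):
    """True where the close is the lowest of the last `lookback` closes (today included)."""
    flags = []
    for i in range(len(closes)):
        if i + 1 < lookback:
            flags.append(False)
        else:
            flags.append(closes[i] <= min(closes[i + 1 - lookback : i + 1]))
    return flags


def average_forward_return(closes, flags, horizon=1):
    """Mean return over the next `horizon` days, across all flagged days."""
    rets = [
        closes[i + horizon] / closes[i] - 1.0
        for i in range(len(closes) - horizon)
        if flags[i]
    ]
    return sum(rets) / len(rets) if rets else None
```

Changing `lookback` and `horizon` covers other look-back lengths and holding periods; flipping the comparison to `>=` with `max` gives the new-highs case.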

spy performance after 20 day low

 

Not particularly useful I’m afraid, just random variations around the average. What about other look-back lengths?

spy performance after x day low

 

Now this is more interesting. 60-day lows and up appear to have a bit of an edge, both for the day immediately after the low, as well as the medium term afterwards.

Let’s take a closer look at the returns after a 200-day low, with 95% confidence interval bands around them. Naturally, returns tend to be highly volatile around 200-day lows, which (combined with the small number of observations) means a very wide confidence interval.
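One simple way to build such bands is a normal-approximation interval around the mean, sketched below. This is an assumption about the method, not necessarily how the chart was produced; with only a handful of observations a t-based interval would be wider still.

```python
import math
import statistics


def mean_confidence_band(returns, z=1.96):
    """(lower, mean, upper) using mean +/- z standard errors."""
    n = len(returns)
    mean = statistics.fmean(returns)
    stderr = statistics.stdev(returns) / math.sqrt(n)
    return mean - z * stderr, mean, mean + z * stderr
```

Applying the function separately to the returns observed 1, 2, 3, ... days after each signal gives one band per horizon.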

spy performance after 200 day low

 

The 200-day low effect also seems to be prevalent in most equity indices, but without the regularity and strength that has been displayed by the S&P 500. Finally, what about new highs?

spy performance after x day high

Nothing to see here, move along! Slight underperformance compared to the average, but nowhere near enough to even consider shorting.

The Predictive Value of the Number and Magnitude of Recent Up/Down Days: UDIDSRI

Rummaging through my bottomless “TO DO” list, I found this little comment:

# of up/dn days in period, then re-scale that using percentrank….with net % move?

An interesting way to spend Sunday afternoon, and another blog post!

After playing around with the concept for a while, I ended up creating a simple indicator that, as it turns out, is impressively good at picking out bottoms[1], and has very strong 1-day and acceptable medium-term predictive power. In an attempt to come up with the most awkward-sounding acronym possible, I decided on the name “Up/Down and Intensity Day Summation Rank Indicator”, or UDIDSRI. Here’s what I did:

The first iteration

I started out with a very simple concept:

  • If the day closes up, movement = 1, otherwise movement = -1.
  • Sum movement over the last 20 days.
  • UDIDSRI is the % rank of today’s sum, compared to the last 50 days of sums.

The case that presents the most interest is when UDIDSRI is equal to zero (i.e. the lowest value in 50 days), and we’ll have a look at how this works further down. I felt that this indicator could be significantly improved by adding a bit of nuance. Not all down days are equal, so I thought it would be a good idea to take into account the magnitude of the moves as well as their direction.

The second iteration

The second version of the algorithm:

  • If the day closes up, movement = 1, otherwise movement = -1.
  • Multiply movement by (1 + abs(return))^5
  • Sum the movements for the last 20 days.
  • UDIDSRI is the % rank of today’s sum, compared to the last 50 days of sums.
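The steps above translate fairly directly into code. The sketch below is an illustration rather than the uploaded worksheet's logic: the function name is made up, and the percent-rank convention (share of the other 49 sums that are strictly below today's) is an assumption. Setting `power=0` reproduces the first iteration.

```python
def udidsri(closes, sum_window=20, rank_window=50, power=5):
    """Percent rank (0..1) of the rolling sum of signed, magnitude-weighted moves.

    power=0 reproduces the first iteration (every day counts as +1 or -1).
    Entries are None until enough history is available.
    """
    moves = []
    for prev, cur in zip(closes, closes[1:]):
        ret = cur / prev - 1.0
        direction = 1.0 if ret > 0 else -1.0  # unchanged days count as down
        moves.append(direction * (1.0 + abs(ret)) ** power)

    sums = [
        sum(moves[i + 1 - sum_window : i + 1]) if i + 1 >= sum_window else None
        for i in range(len(moves))
    ]

    result = [None]  # no return is available for the first close
    for i, today in enumerate(sums):
        window = sums[max(0, i + 1 - rank_window) : i + 1]
        if today is None or len(window) < rank_window or window[0] is None:
            result.append(None)
            continue
        below = sum(1 for s in window if s < today)
        result.append(below / (rank_window - 1))
    return result
```

A reading of 0.0 means no sum in the ranking window is lower than today's, which is the buy condition discussed in the rest of the post.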

The choice of the 5th power is completely arbitrary and un-optimized (as are the 20-day summation, and 50-day ranking) and can probably be optimized for better performance.

Here’s a chart comparing the two versions on the last few months of SPY (yellow is 1st iteration, red is 2nd):

UDIDSRI comparison chart

You can clearly see that the 2nd iteration doesn’t like to stay at 0 for so long, and tends to respond a bit faster to movements. As such, the 2nd iteration gives far fewer signals, but they’re of much higher quality. I’ll be using the 2nd version for the rest of this post.

UDIDSRI iteration comparison

Note that this approach is completely useless for going short. The indicator hitting its maximum value provides no edge either for mean reversion or trend following.

A quick test around the globe to make sure I wasn’t curve fitting:

all country ETF statistics

That turned out rather well…

Thus far we have only looked at the short-term performance of UDIDSRI. Let’s see how it does over the medium term after hitting 0:

medium term UDIDSRI performance

There seems to be a period of about 30 trading days after UDIDSRI hits 0, during which you can expect above-average returns. Let’s take a look at a strategy that crudely exploits this:

  • Buy SPY at close if UDIDSRI = 0.
  • Use a 2% trailing stop (close-to-close) to exit.
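A close-to-close trailing stop can be sketched as below, assuming it means "exit at the first close that is 2% or more below the highest close seen since entry". The function name is hypothetical.

```python
def trailing_stop_exit(closes, entry_index, stop=0.02):
    """Index of the first close that is `stop` or more below the highest close since entry.

    Returns None if the stop is never hit within the data.
    """
    peak = closes[entry_index]
    for i in range(entry_index + 1, len(closes)):
        if closes[i] <= peak * (1.0 - stop):
            return i
        peak = max(peak, closes[i])
    return None
```

Because the peak ratchets upward with every new high close, a good entry is held through the trend while a poor one is cut after a 2% adverse move.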

The trailing stop makes us exit quickly if we haven’t entered at a bottom, but stays with the trend if the entry is good. Here are the stats and equity curve for this strategy applied to SPY, using $100k per trade, without commissions or slippage:

medium term UDIDSRI SPY equity curve

medium term UDIDSRI SPY statistics

Finally here are some trades this strategy would have taken in 2011 and early 2012:

medium term UDIDSRI SPY statistics

The most significant problem with the tight trailing stop is that it exits at pullbacks instead of tops (which is particularly painful during heavy bear markets), so one easy improvement would be to add an indicator for over-extension and use that to time the exit. But I’ll leave that as homework.

All in all I’m quite satisfied with the UDIDSRI. I was really surprised at how it manages to pick bottoms with such accuracy, and I will definitely add it to the repertoire of signals that I use for swing trading.

If you want to play with the UDIDSRI yourself, I have uploaded an excel worksheet as well as the indicator and signal for MultiCharts .NET.

Footnotes
  1. For SPY, UDIDSRI gave signals on both the 2002 and 2009 lowest days.

Tuesday’s QQQ Move and Equity Index Mean Reversion Worldwide

Tuesday saw QQQ drop somewhat heavily, and for the third day in a row.  These three drops took us back to mid-August levels, blowing through a ton of support levels… My mean reversion senses are tingling!

QQQ 2012

So, let’s take a look at what happens in these situations by formulating a simple rule to capture them, that will then exit after the mean reversion has (hopefully) happened:

  • If QQQ is down at least 3 days in a row, and it closes below the 10-day intraday low, go long.
  • Close the position one day after QQQ closes above its 5-day SMA.
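Here is a rough sketch of the entry and exit rules. Two details are assumptions: the "10-day intraday low" is taken as the lowest low of the previous 10 sessions (excluding today), and "one day after" is taken as the close of the session following the first close above the SMA. Function names are illustrative.

```python
def entry_signal(closes, lows, i, down_days=3, low_lookback=10):
    """True if day i ends a run of `down_days` lower closes and closes below
    the lowest intraday low of the previous `low_lookback` days."""
    if i < max(down_days, low_lookback):
        return False
    falling = all(closes[i - k] < closes[i - k - 1] for k in range(down_days))
    prior_low = min(lows[i - low_lookback : i])
    return falling and closes[i] < prior_low


def exit_index(closes, entry, sma_window=5):
    """Index one day after the first close above the `sma_window`-day SMA, or None."""
    for i in range(max(entry + 1, sma_window - 1), len(closes) - 1):
        sma = sum(closes[i + 1 - sma_window : i + 1]) / sma_window
        if closes[i] > sma:
            return i + 1
    return None
```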

A simple rule, designed to capture big drops in the hope of a bounce. The usual issues associated with catching falling knives come up.

Sometimes it works great…

good trade

April 2012

And others not so much…

giant loss

September 2001

But over time there is remarkable consistency:

Equity curve

Run-up & drawdown, assuming $100k per trade, including $0.005 per contract in commissions

Here are the stats:

stats

Note: the dot-com bubble is not included in these stats, because including it would make them look better than they really are over the long run.

Downside skew is never nice, of course. Is there something we can do to soften the biggest losses? As is the case with most swing systems, adding a stop loss is generally a bad idea. Indeed, adding a simple 3% stop loss would decrease returns, lead to a much more uneven equity curve, and also result in deeper and longer drawdowns. Getting fancier and adding various special rules to the stop (such as a period after getting stopped out during which trading is not allowed) does not significantly improve the results. The solution here is simply proper position sizing, so that you can take the losses you have to take and still be comfortable.

Finally, is this just an accidental feature of the NASDAQ 100, or could we use it in other markets as well? Let’s have a look at a broad array of equity index ETFs:

country stats

Trades start at each ETF’s inception; dividends are included, commissions are not.

Well, there you have it…

Disclosure: Net long U.S., U.K., Singaporean equities.

Portfolio Optimization Algorithm Showdown: GTAA Edition

I was revisiting the choice of portfolio optimization algorithm for the GTAA portion of my portfolio and thought it was an excellent opportunity for another post. The portfolio usually contains 5 assets (though at times it may hold fewer than 5), selected by relative and absolute momentum from a universe of 17 ETFs and mutual funds. The specifics are irrelevant to this post as we’ll be looking exclusively at portfolio optimization techniques applied after the asset selection choices have been made.

Tactical asset allocation portfolios present different challenges from optimizing portfolios of stocks, or permanently diversified portfolios, because the mix of asset classes is extremely important and can vary significantly through time. Especially when using methods that weigh directly on volatility, bonds tend to have very large weights. During the last couple of decades this has been working great due to steadily dropping yields, but it may turn out to be dangerous going forward. I aim to test a wide array of approaches, from the crude equal weights, to the trendy risk parity, and the very fresh minimum correlation algorithm. Standard mean-variance optimization is out of the question because of its many and well-known problems, but mainly because forecasting returns is an exercise in futility.

The algorithms

The only restriction on the weights is no shorting; there are no minimum or maximum values.

  • Equal Weights

Self-explanatory.

  • Risk Parity (RP)

Risk parity (often confused with equal risk contribution) is essentially weighting proportional to the inverse of volatility (as measured by the 120-day standard deviation of returns, in this case). I will be using an unlevered version of the approach. I must admit I am still somewhat skeptical of the value of the risk parity approach for the bond-related reasons mentioned above.
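As a small illustration, unlevered inverse-volatility weights can be computed as below. The function name and the dictionary input format are assumptions made for the example.

```python
import statistics


def inverse_volatility_weights(returns_by_asset, window=120):
    """Weights proportional to 1 / standard deviation of the last `window` returns."""
    inv_vol = {
        name: 1.0 / statistics.stdev(rets[-window:])
        for name, rets in returns_by_asset.items()
    }
    total = sum(inv_vol.values())
    return {name: value / total for name, value in inv_vol.items()}
```

An asset with half the volatility of another receives twice its weight, which is exactly why low-volatility bond funds end up dominating the allocation.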

  • Minimum Volatility (MV)

Minimum volatility portfolios take into account the covariance matrix and have weights that minimize the portfolio’s expected volatility. This approach has been quite successful in optimizing equity portfolios, partly because it indirectly exploits the low volatility anomaly. You’ll need a numerical optimization algorithm to solve for the minimum volatility portfolio.

  • MV (with shrinkage)

A note on shrinkage (not that kind of shrinkage!): one issue with algorithms that make use of the covariance matrix is estimation error. The number of covariances that must be estimated grows quadratically with the number of assets in the portfolio (n assets have n(n-1)/2 distinct pairs), and these covariances are naturally not constant through time. The errors in the estimation of these covariances have negative effects further down the road when we calculate the desired weightings. A partial solution to this problem is to “shrink” the covariance matrix towards a “target matrix”. For more on the topic of shrinkage, as well as a description of the shrinkage approach I use here, see Honey, I Shrunk the Sample Covariance Matrix by Ledoit & Wolf.
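To make the idea concrete, here is a simplified sketch that blends the sample matrix with a constant-correlation target, the target used in the Ledoit & Wolf paper. The fixed `intensity` parameter is a simplifying assumption; the paper estimates the optimal intensity from the data, which is not reproduced here.

```python
import math


def shrink_covariance(cov, intensity=0.5):
    """Blend a sample covariance matrix with a constant-correlation target.

    `intensity` is the weight on the target (0 = sample matrix, 1 = target).
    Expects a square list-of-lists with at least two assets.
    """
    n = len(cov)
    vols = [math.sqrt(cov[i][i]) for i in range(n)]
    pair_corrs = [
        cov[i][j] / (vols[i] * vols[j]) for i in range(n) for j in range(i + 1, n)
    ]
    avg_corr = sum(pair_corrs) / len(pair_corrs)
    shrunk = []
    for i in range(n):
        row = []
        for j in range(n):
            # Variances are kept; covariances are pulled towards the average correlation.
            target = cov[i][i] if i == j else avg_corr * vols[i] * vols[j]
            row.append(intensity * target + (1.0 - intensity) * cov[i][j])
        shrunk.append(row)
    return shrunk
```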

  • Equal Risk Contribution (ERC)

The ERC approach is sort of an advanced version of risk parity that takes into account the covariance matrix of the assets’ returns (here’s a quick comparison between the two). This difference results in significant complications when it comes to calculating weights, as you need to use a numerical optimization algorithm to minimize

formula

subject to the standard restrictions on the weights, where x_i is the weight of the i-th asset, and (Σx)_i denotes the i-th row of the vector resulting from the product of Σ (the covariance matrix) and x (the weight vector). To do this I use MATLAB’s fmincon SQP algorithm.
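For reference, the quantity being minimized can be evaluated with a few lines of Python. This sketch assumes the objective is the pairwise form given in the Maillard et al. paper, i.e. the sum over all asset pairs (i, j) of (x_i (Σx)_i - x_j (Σx)_j)^2; it only evaluates the objective and does not replace the numerical optimizer.

```python
def erc_objective(weights, cov):
    """Sum over all asset pairs of (x_i (Σx)_i - x_j (Σx)_j)^2.

    The value is zero exactly when every asset contributes equally to risk.
    """
    n = len(weights)
    sigma_x = [sum(cov[i][k] * weights[k] for k in range(n)) for i in range(n)]
    contrib = [weights[i] * sigma_x[i] for i in range(n)]
    return sum((contrib[i] - contrib[j]) ** 2 for i in range(n) for j in range(n))
```

With uncorrelated assets the objective is zero at the inverse-volatility weights, which is the sense in which ERC generalizes the simpler risk parity weighting described earlier.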

For more on ERC, a good overview is On the Properties of Equally-Weighted Risk Contributions Portfolios by Maillard et al.

  • ERC (with shrinkage)

See above.

  • Minimum Correlation Algorithm (MCA)

A new optimization algorithm, developed by David Varadi, Michael Kapler, and Corey Rittenhouse. The main objective of the MCA approach is to underweight assets with high correlations and overweight those with low correlations, though it’s a bit more complicated than just weighting by the inverse of assets’ average correlation. If you’re interested in the specifics, check out the paper: The Minimum Correlation Algorithm: A Practical Diversification Tool.

The results

Moving on to the results, it quickly becomes clear that there isn’t much variation between the approaches. Most of the returns and risk management are driven by the asset selection process, leaving little room for the optimization algorithms to improve or screw up the results.

portfolio optimization algorithm stats

Predictably, the “crude” approaches such as equal weights or the inverse of maximum drawdown don’t do all that well. Not terribly by any means, but going up in complexity does seem to have some advantages. What stands out is that the minimum correlation algorithm outperforms the rest in both risk-adjusted return metrics I like to use.

Risk parity, despite its popularity, wallows in mediocrity in this test; its only redeeming feature being a bit of positive skew which is always nice to have.

The minimum volatility weights are an interesting case. They do what it says on the box: minimize volatility. Returns suffer as a consequence, but are excellent on a volatility-adjusted basis. On the other hand, the performance in terms of maximum drawdown is terrible. Some interesting features to note: the worst daily loss for the minimum volatility weights is by far the smallest of the pack; the worst day in over 15 years was -2.91%. This is accompanied by the lowest average time to recover from drawdowns, and an obscene (though also rather unimportant) longest winning streak of 22 days.

Finally, equal risk contribution weights almost match the performance of minimum volatility in terms of CAGR / St.Dev. while also giving us a lower drawdown. ERC also comes quite close to MCA; I would say it is the second-best approach on offer here.

A look at the equity curves below shows just how similar most of the allocations are. The results could very well be due to luck and not a superior algorithm.

GTAA portfolio optimization methods equity curves

To investigate further, I have divided the equity curves into three parts: 1996-2001, 2002-2007, and 2008-2012. Consistent results across these sub-periods would increase my confidence that the best algorithms actually provide value and weren’t just lucky.

subperiod stats

As expected there is significant variation in results between sub-periods. However, I believe these numbers solidify the value of the minimum correlation approach. If we compare it to its closest rival, ERC, minimum correlation comes out ahead in 2 out of 3 periods in terms of volatility-adjusted returns, and in 3 out of 3 periods in terms of drawdown-adjusted returns.

The main lesson here is that as long as your asset selection process and money/risk management are good, it’s surprisingly tough to seriously screw up the results by using a bad portfolio optimization approach. Nonetheless I was happily surprised to see minimum correlation beat out the other, more traditional, approaches, even though the improvement is marginal.

ETN Discount/Premium List

Here’s a list of ETNs with their average daily volumes and current premium/discount to their indicative value. I’m not a fan of trading truisms, but…the market can stay irrational longer than you can stay liquid. Implementation is everything when it comes to this sort of trade; find a way to ensure that there’s limited downside and a very large upside.

Day of the Month Seasonality Part 3: Nikkei 225, Hang Seng, STI

This is the third and final post investigating day of the month seasonality effects in global equity market indices. In part 1 we looked at U.S. indices; in part 2 we saw that the effects were even more powerful in three major European markets. In this post I will analyze three Asian indices: the Nikkei 225 (Japan), the Hang Seng Index (Hong Kong), and the Straits Times Index (Singapore).

 

The methodology:

As with the European indices, the Asian ones have relatively short histories. In order to get a long enough sample of results, I shortened the initial look-back period to 2500 trading days. The exact steps to re-create the results below are the following:

  1. Standardize every month to 21 trading days; round to the nearest integer when the number of days in a month is different.
  2. Start by using the last 2500 days; keep increasing the sample size until you reach 5000 days. After that use a moving window of the last 5000 days of daily returns and estimate the average return on every (standardized) day of the month.
  3. Rank the days by their past returns. If the next day is in the top 6, buy on close and sell on the next close.
  4. Move forward by one trading day and repeat from step 2.

A technical note: I am using QuantLib’s holiday calendar functions to calculate the number of trading days in a particular month. There are problems with QuantLib, especially when looking further back in time, that result in an inaccurate trading day count for certain months. The effect is rather small as only a tiny number of months are affected, but I would expect the results to be even better if these problems were corrected.

 

The results:

Nikkei 225:

The equity curves:

Nikkei 225 day of the month seasonality results

And the statistics:

nikkei stats

Hang Seng Index:

The equity curves:

Hang Seng day of the month seasonality results

And the statistics:

hang seng stats

 

Straits Times Index:

The equity curves:

STI day of the month seasonality results

And the statistics:

STI stats

 

Calendars:

Here’s the updated list of average (standardized) day of the month returns over the last 5000 days for the indices we have looked at. The last and first few days of the month seem to be the best worldwide. The days around day #5 and day #15 seem to be the worst, again across the board. Beyond that there are few similarities among these markets.

calendar, average returns, all indices

 

Conclusions:

The main conclusion to be drawn from these results is simple:

The majority of permanent upwards stock market movements happen on a small number of days, and it is easy to predict which days these will be.

How can we use this knowledge? Setting up automatic investment plans to buy 4-5 days before the end of the month is one obvious implication. Going in too early or too late could have a significant negative impact on your returns over the long term.

If you swing trade any of these indices, it should take less to convince you to go long on these special days, and vice versa on the short side. Of course, there are long stretches of time during which the day of the month effect performs badly; it is not a trading rule in itself and as such should be treated with caution.

Shorting

Unfortunately there does not seem to be any consistent edge in day of the month seasonality for the short side. Given the general upward trend of equity markets over time, this is not all that surprising. It is possible that using a bear market filter we could uncover something useful, and I might revisit the topic in the future.

Day of the Month Seasonality Part 2: DAX, CAC 40, FTSE 100

In part 1 of the series I showed the impressive predictive power of day of the month seasonality effects in US equity markets. In this post I will apply the same type of analysis to three European markets: Germany (using the DAX index), France (using the CAC 40 index), and the U.K. (using the FTSE 100 index). Once again I must note that day of the month effects by themselves do not constitute a trading strategy, but I believe that the impressive returns predictability can be used to enhance other trading approaches, both systematic and discretionary.

 

The methodology:

The European indices have a far shorter history than the US ones, so we need to limit the look-back period in order to get a useful sample size. As such, I have slightly modified the approach to use less data in the early parts of the sample.

  1. Standardize every month to 21 trading days; round to the nearest integer when the number of days in a month is different.
  2. Start by using the last 2500 days; keep increasing the sample size until you reach 5000 days. After that use a moving window of the last 5000 days of daily returns and estimate the average return on every (standardized) day of the month.
  3. Rank the days by their past returns. If the next day is in the top 6, buy on close and sell on the next close.
  4. Move forward by one trading day and repeat from step 2.
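The expanding-then-rolling window in step 2 can be expressed as a small helper. The name and the half-open (start, end) convention are assumptions; the key point is that the window ends before the day being traded, so no future data is used.

```python
def estimation_window(day_index, min_days=2500, max_days=5000):
    """Slice bounds (start, end) of the history used to rank days for `day_index`.

    `end` is exclusive, so the day being traded is never part of the sample.
    Returns None while fewer than `min_days` of history exist.
    """
    if day_index < min_days:
        return None
    start = max(0, day_index - max_days)
    return start, day_index
```

For example, `returns[start:end]` grows from 2500 to 5000 observations and then slides forward one day at a time.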

The results:

DAX:

The equity curves:

DAX day of the month seasonality results

And the statistics:

DAX stats

 

CAC 40:

The equity curve:

CAC 40 day of the month seasonality results

And the statistics:

cac 40 stats

 

FTSE 100:

The equity curves:

FTSE 100 day of the month seasonality results

And the statistics:

ftse100 stats

 

Calendars:

daily returns calendar

One of the results that stands out is that the 15th day of the month seems to be absolutely terrible in every market. This is doubly peculiar because while the best days seem to be different for every market, the worst ones are consistent across the board. I would love to hear some theories about this.

In any case, day of the month seasonality effects are incredibly powerful in European markets as well. They’ve managed to provide positive returns in essentially all market environments. The fact that the best days occur on different parts of the month depending on which market you look at is amazing: it means there are more opportunities out there to exploit if you’ve got capital lying around.

In the next (and for now, final) part of the series, I will look at how these seasonality effects hold up in Asian markets.

Day of the Month Seasonality Part 1: S&P 500, NASDAQ Composite, Russell 2000

My first post is inspired by the recent day of the month seasonality posts over at MarketSci (one, two). In this post I will show how day of the month seasonality applies to the S&P 500 as well as two other popular indices: the NASDAQ Composite and the Russell 2000.

 

The methodology:

  1. Standardize every month to 21 trading days; round to the nearest integer when the number of days in a month is different.
  2. Use the last 5000 days of daily returns and estimate the average return on every (standardized) day of the month.
  3. Rank the days by their past returns. If the next day is in the top 6, buy on close and sell on the next close.
  4. Move forward by one trading day and repeat from step 2.

As such, the approach is 100% walk-forward; there is no look-ahead bias in these results.
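Steps 1 and 3 can be sketched as follows. The exact standardization formula is not given in the post, so the linear mapping below (which pins the first trading day to 1 and the last to 21, rounding half up) is an assumption, as are the function names.

```python
from collections import defaultdict


def standardized_day(day_in_month, days_in_month, target=21):
    """Map trading day k of an n-day month onto 1..target, keeping first and last days fixed."""
    if days_in_month == 1:
        return 1
    scaled = (day_in_month - 1) * (target - 1) / (days_in_month - 1)
    return 1 + int(scaled + 0.5)  # round half up


def top_days(history, top_n=6):
    """`history` is a list of (standardized_day, daily_return) pairs from the
    estimation window; returns the set of days with the highest average return."""
    by_day = defaultdict(list)
    for day, ret in history:
        by_day[day].append(ret)
    averages = {day: sum(r) / len(r) for day, r in by_day.items()}
    ranked = sorted(averages, key=averages.get, reverse=True)
    return set(ranked[:top_n])
```

On each trading day, `top_days` is recomputed from the trailing window only, and a position is taken at the close if tomorrow's standardized day is in the returned set.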

 

The results:

 S&P 500:

The sample starts in 1950; the results thus start in 1970.

S&P 500 day of the month seasonality results

 

A quick comparison of the statistics:

S&P500 stats

While the returns from the “Top 6” days are very impressive, they exhibit somewhat higher volatility, and can actually under-perform for very long periods of time.

 

NASDAQ Composite:

The equity curves:

NASDAQ Composite day of the month seasonality results

And the statistics:

nasdaq stats

 

Russell 2000:

The equity curves:

russell 2000 graph

And the statistics:

russell 2000 stats

The Russell 2000 stands out as rather strange: the Top 6 days did great during the bear market, but have been ineffective ever since. Of course, we only have a few years of useful data in this case, so it could very well be the case that we have stumbled on a period of under-performance by the day of the month effect.

Calendars:

Here are the actual statistics for the last 5000 days for each of the indices (with the top 6 days highlighted in bold):

daily stats

It’s both unexpected and quite interesting that, despite extremely high correlations among these indices over the last few years, there is significant variation between the optimal days for each one.

One common feature across all three indices is that the Top 6 days tend to be more volatile. Could it be that the day of the month effect is not an anomaly, but compensation for taking on more risk? Given the very small magnitude of volatility differences but significant differences in returns, I doubt it.

Another possible explanation that seems intuitive is institutional money flows. Yet it is difficult to justify that explanation when there are such large differences between the three indices: why would big money pile into the Russell 2000 stocks, and out of the S&P 500 stocks, on the last day of the month? For now, a good explanation of the effect eludes me…

Applicability:

By itself, day of the month seasonality is not a trading strategy. While there is substantial protection on the downside compared to buy & hold, commissions would completely destroy the returns. On the other hand, there seems to be potential in using the day of the month as an additional factor in an existing trading model or as an input in a discretionary swing trading approach.

As an example I have constructed a very simplistic trading strategy based on the S&P 500: go long if RSI(3) is below 5. I then filter the trades based on whether they are in one of the top 6 days (once again, the top 6 days are determined using walk-forward optimization so there is no look-ahead bias here).
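A rough sketch of the filtered rule is shown below. It uses a plain-average RSI for brevity, which is an assumption; the classic RSI uses Wilder's smoothing and will give somewhat different values. The filter is read here as "the day the trade is held (the next day) is one of the top 6 days", and the function names are illustrative.

```python
def simple_rsi(closes, period=3):
    """RSI from plain sums of gains and losses over the last `period` changes."""
    out = [None] * len(closes)
    for i in range(period, len(closes)):
        changes = [closes[k] - closes[k - 1] for k in range(i + 1 - period, i + 1)]
        gains = sum(c for c in changes if c > 0)
        losses = sum(-c for c in changes if c < 0)
        out[i] = 50.0 if gains + losses == 0 else 100.0 * gains / (gains + losses)
    return out


def filtered_entries(closes, std_days, top6, threshold=5.0, period=3):
    """Indices where RSI < threshold and the NEXT day's standardized day is in `top6`."""
    rsi = simple_rsi(closes, period)
    return [
        i
        for i in range(len(closes) - 1)
        if rsi[i] is not None and rsi[i] < threshold and std_days[i + 1] in top6
    ]
```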

The “RSI(3) < 5” rule by itself has an average daily return of 0.046% (0.259% since 2002); after using the filter, the rule returns 0.143% on average (0.458% since 2002). Overall, using the filter, the strategy achieves roughly the same returns with less than a third of the trades.

RSI(3) Rules S&P500 Equity Curves

 

In the next parts I will investigate how these effects hold up in European and Asian Markets, and perhaps even non-equity markets.