I have gotten a couple of emails asking me about the topic of analyzing performance, so I decided to detail the tools I use. Measuring performance and attributing success or failure to the right factors is an extremely important part of the trading process. Actually trading a strategy will often reveal aspects that don’t come up in the research stage. Unexpected things happen, revealing previously hidden strengths or weaknesses. Strategies improve or deteriorate through time. Execution issues eat into returns. Patterns emerge that can be exploited to enhance returns or limit risk.
These situations, and performance evaluation in general, are a crucial part of the research/trading/performance loop:
A lack of attention to performance, and the underlying factors that drive it, will have a deleterious effect both on your long-term trading results and the things that you will discover in the research stage.
I’ll demonstrate the tools using two strategies, one of which has been going well, and the other not: 1) a rather generic GTAA momentum/trend-following strategy that has been running for a bit over a year, and 2) an AAPL swing trading strategy that’s been in “trial” mode for the last 6 months or so.
My performance analysis system, the QUSMA Portfolio and Trade Analytics Suite, is primarily based around the concept of a “trade”. A trade is a unit that can contain any number of related orders and cash transactions (dividends, taxes, etc.). A pair trade, for example, would include both legs in a single trade. The underlying data is imported using IB’s Flex Queries, which have a very simple, easy-to-handle XML structure.
Trades are assigned to a “strategy” and can also be assigned any number of tags. Some of the things that I use tags for are: trade direction (long/short/both), trade length, developed/developing country, asset class, etc. Notes with images can also be attached to trades, which is incredibly useful for reviews. Finally, the trades can be filtered on any number of criteria to produce reports, and compared against custom benchmarks.
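To make the “trade” concept concrete, here is a minimal Python sketch of how such a record could be modeled. It is only an illustration: the class names, fields, and the `filter_trades` helper are assumptions made for this example, not the actual schema of the analytics suite or of IB’s Flex Query XML.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Order:
    symbol: str
    quantity: float      # signed: positive = buy, negative = sell
    price: float
    commission: float    # cost paid, as a positive number
    timestamp: datetime


@dataclass
class CashTransaction:
    kind: str            # e.g. "dividend" or "tax" (hypothetical labels)
    amount: float        # signed: positive = cash received
    timestamp: datetime


@dataclass
class Trade:
    strategy: str
    orders: list[Order] = field(default_factory=list)
    cash_transactions: list[CashTransaction] = field(default_factory=list)
    tags: set[str] = field(default_factory=set)

    def realized_pnl(self) -> float:
        """Dollar PnL of a fully closed trade, net of commissions."""
        order_cash = sum(-o.quantity * o.price for o in self.orders)
        commissions = sum(o.commission for o in self.orders)
        other_cash = sum(c.amount for c in self.cash_transactions)
        return order_cash - commissions + other_cash


def filter_trades(trades, strategy=None, required_tags=()):
    """Keep trades matching a strategy (if given) and carrying all required tags."""
    required = set(required_tags)
    return [
        t for t in trades
        if (strategy is None or t.strategy == strategy) and required <= t.tags
    ]
```

Keeping both legs of a pair trade in one `Trade` means the PnL, commissions and tags are reported for the position as a whole rather than per order.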
There are some general principles that summarize my approach to performance measurement:
- Execution and commissions are extremely important.
- Separate timing from sizing.
- Statistics on trades in both dollar terms and % terms.
- Separate capital allocations to strategies from total capital.
- Statistics on returns both on capital allocated to a strategy (ROAC) and on total capital (ROTC).
- Always think probabilistically and in terms of expectations.
- The more ways you can find to look at the data, the better.
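The ROAC/ROTC distinction in the list above comes down to which capital base the same dollar PnL is divided by. A minimal sketch follows, assuming daily returns are computed on start-of-day capital; the function names and that convention are assumptions for illustration, not necessarily how the suite computes them.

```python
def daily_returns(pnl, capital):
    """Daily return = that day's dollar PnL / capital at the start of the day."""
    return [p / c for p, c in zip(pnl, capital)]


def cumulative_return(returns):
    """Compound a series of simple returns into one total return."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


# Hypothetical numbers: a strategy allocated 10,000 out of a 100,000 account.
strategy_pnl = [100.0, -50.0, 200.0]
allocated_capital = [10_000.0, 10_000.0, 10_000.0]
total_capital = [100_000.0, 100_000.0, 100_000.0]

roac = daily_returns(strategy_pnl, allocated_capital)  # return on allocated capital
rotc = daily_returns(strategy_pnl, total_capital)      # return on total capital
```

The same PnL looks ten times larger in ROAC terms here, which is why the two are tracked separately: ROAC says how well the strategy uses its allocation, ROTC says how much it matters to the account.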
Simple visual inspection is my starting point, and I think it’s very important. The simple act of staring at charts often leads to new research ideas.
So let’s get started with the graphs and stats. At the top, the standard dollar PnL (daily and close-to-close) and equity curves (both in terms of ROAC and ROTC), which are also plotted against a benchmark:
Next up are the trade statistics. Commissions are right up there: it’s very important to keep in mind how much you are losing to those costs. A few basis points may not seem like much, but they can quickly eat up a significant portion of your profits. Note that all the stats are given both in dollar and percentage terms, in order to separate timing effects from sizing effects.
Results by calendar month:
Probably the most important bit: statistics on daily returns, the standard ratios, and so forth. The MAR ratio (compound annual return divided by maximum drawdown) is probably the most important number for me. The reason is simple: it determines my leverage constraints, and thus my returns. A high Sharpe ratio is meaningless if you can’t lever up. Note how the simple, static benchmark portfolio has destroyed the GTAA approach.
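For reference, here is a minimal sketch of the two ratios mentioned above, computed from a list of daily returns. It assumes 252 trading days per year and a zero risk-free rate for the Sharpe ratio; both are simplifying assumptions of this example. One way to read the leverage argument: if the largest drawdown you can tolerate is fixed, the leverage you can apply is roughly that tolerance divided by the strategy’s drawdown, so the achievable return scales with the MAR ratio.

```python
import math
import statistics


def equity_curve(returns):
    """Compound simple returns into an equity curve starting from 1.0."""
    equity, curve = 1.0, []
    for r in returns:
        equity *= 1.0 + r
        curve.append(equity)
    return curve


def max_drawdown(returns):
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak, worst = 1.0, 0.0
    for equity in equity_curve(returns):
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def cagr(returns, periods_per_year=252):
    """Compound annual growth rate implied by the return series."""
    years = len(returns) / periods_per_year
    return equity_curve(returns)[-1] ** (1.0 / years) - 1.0


def mar_ratio(returns, periods_per_year=252):
    """CAGR divided by maximum drawdown (undefined if there is no drawdown)."""
    return cagr(returns, periods_per_year) / max_drawdown(returns)


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized mean/stdev of returns, with the risk-free rate assumed zero."""
    mean = statistics.mean(returns)
    stdev = statistics.stdev(returns)
    return mean / stdev * math.sqrt(periods_per_year)
```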
Some simple benchmarking stuff:
Histograms of daily returns, and returns per trade. Again, it’s important to look at both dollar and percentage results:
Also, holding period histogram:
Position sizing vs trade returns. Naive risk parity seems to be doing alright:
Trade length vs returns chart; the relationship here is pretty clear.
The movement capture stats measure how good the strategy is at capturing returns. GU is gross upside, or the gross positive returns during the period. UC% is the percentage of that movement that was captured by being long, UM% is the percentage of the movement that was missed by being flat, while UL% is the percentage of the movement that was lost due to being short. The calculations are repeated for downside movement.
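The definitions above can be sketched in a few lines. This is one possible reading, not the suite’s actual calculation: it assumes simple (summed) daily returns of the instrument, positions encoded only by sign (+1 long, 0 flat, -1 short), and, for the downside case, that movement is “captured” by being short and “lost” by being long.

```python
def movement_capture(returns, positions, upside=True):
    """Split gross upside (or downside) movement by how the strategy was positioned.

    returns:   daily returns of the instrument
    positions: +1 long, 0 flat, -1 short for each day
    """
    direction = 1 if upside else -1
    gross = captured = missed = lost = 0.0
    for r, p in zip(returns, positions):
        move = direction * r
        if move <= 0:
            continue  # not a move in the direction being measured
        gross += move
        aligned = direction * p
        if aligned > 0:
            captured += move   # positioned with the move
        elif aligned < 0:
            lost += move       # positioned against the move
        else:
            missed += move     # flat
    if gross == 0:
        raise ValueError("no movement in the requested direction")
    return {
        "gross": gross,
        "captured_pct": 100.0 * captured / gross,
        "missed_pct": 100.0 * missed / gross,
        "lost_pct": 100.0 * lost / gross,
    }
```

With `upside=True` the three percentages correspond to UC%, UM% and UL%, and `gross` to GU; they always sum to 100.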
Cumulative percent returns, by instrument. A similar chart with dollar PnL by instrument also exists.
Standard value at risk calculations, based on resampled historical data. I’ll be adding the option to use parametric methods in the future.
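A minimal sketch of value at risk from resampled historical returns is shown below. The function name, parameters, and the simple bootstrap (independent draws with replacement, compounded over the horizon) are assumptions for illustration; the actual implementation may differ.

```python
import random


def resampled_var(daily_returns, horizon_days=1, confidence=0.95,
                  n_samples=10_000, seed=None):
    """Estimate VaR as a positive loss fraction by bootstrapping daily returns."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_samples):
        growth = 1.0
        # Draw `horizon_days` historical returns with replacement and compound them.
        for r in rng.choices(daily_returns, k=horizon_days):
            growth *= 1.0 + r
        outcomes.append(growth - 1.0)
    outcomes.sort()
    # The (1 - confidence) quantile of simulated outcomes, reported as a loss.
    index = int((1.0 - confidence) * n_samples)
    return -outcomes[index]
```

Because the draws are independent, this version ignores volatility clustering; sampling blocks of consecutive days is one way to keep some of that structure.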
Monte Carlo simulation. It simply uses historical data, either trades or daily returns (either ROAC or ROTC). Sampling can be done with replacement or without (the latter simply re-orders the existing equity curve). There is also an option to use N consecutive days/trades, which can capture volatility clustering and autocorrelation effects. The analysis returns confidence intervals for the equity curve, as well as the cumulative and point distributions of maximum drawdowns.
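The sampling options described above can be sketched as follows. This is an illustrative version with hypothetical function names: `block_size` plays the role of the “N consecutive days/trades” option, and sampling without replacement simply shuffles the realized returns.

```python
import random


def resample(returns, block_size=1, replace=True, rng=random):
    """Build one simulated return path from a list of historical returns."""
    n = len(returns)
    if not replace:
        # A re-ordering of the realized returns: same final equity, different path.
        return rng.sample(returns, n)
    path = []
    while len(path) < n:
        # Pick a random block of consecutive observations.
        start = rng.randrange(n - block_size + 1)
        path.extend(returns[start:start + block_size])
    return path[:n]


def max_drawdown(returns):
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def simulate_max_drawdowns(returns, n_paths=1000, block_size=1,
                           replace=True, seed=None):
    """Sorted maximum drawdowns across simulated paths."""
    rng = random.Random(seed)
    return sorted(
        max_drawdown(resample(returns, block_size, replace, rng))
        for _ in range(n_paths)
    )
```

Reading off a percentile from the sorted list (for example the value at index `int(0.95 * n_paths)`) gives a point on the cumulative distribution of maximum drawdowns.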
Finally, some simple stats and charts on execution. All of my trades are either at the close or the open, so those are the prices I benchmark against. Below are stats from the AAPL strategy’s buy orders around the close.
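Benchmarking a fill against the close (or open) reduces to a signed difference expressed in basis points. The sketch below uses hypothetical function names and a sign convention chosen for this example: positive slippage means the fill was worse than the benchmark price.

```python
def slippage_bps(fill_price, benchmark_price, side):
    """Slippage in basis points; positive = worse than the benchmark.

    side: "buy" or "sell". Paying more on a buy, or receiving less
    on a sell, counts as positive (adverse) slippage.
    """
    sign = 1 if side == "buy" else -1
    return sign * (fill_price - benchmark_price) / benchmark_price * 10_000


def average_slippage_bps(fills):
    """Quantity-weighted average slippage over (fill, benchmark, side, quantity) tuples."""
    total_quantity = sum(quantity for _, _, _, quantity in fills)
    weighted = sum(
        slippage_bps(fill, benchmark, side) * quantity
        for fill, benchmark, side, quantity in fills
    )
    return weighted / total_quantity
```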
I think that the biggest weakness in my toolset is the lack of interaction with backtesting results. These can be used in two main ways: 1) comparing theoretical results to real trading results, and 2) as an extended dataset for the risk management functions. Also, I don’t do any stock picking, but if I did, that would entail several additions, mainly performance attribution by country, sector, etc., as well as analysis of value/size/momentum factor exposures.
Leave a comment and tell us what you like to use: is the standard stuff enough for you, or do you use any obscure ratios or unique charts?