My Performance Analysis Tools

I have received a couple of emails asking about performance analysis, so I decided to detail the tools I use. Measuring performance and attributing success or failure to the right factors is an extremely important part of the trading process. Actually trading a strategy will often reveal aspects that don’t come up in the research stage. Unexpected things happen, revealing previously hidden strengths or weaknesses. Strategies improve or deteriorate through time. Execution issues eat into returns. Patterns emerge that can be exploited to enhance returns or limit risk.

These situations, and performance evaluation in general, are a crucial part of the research/trading/performance loop:

[Figure: the research/trading/performance loop]

A lack of attention to performance, and to the underlying factors that drive it, will hurt both your long-term trading results and what you discover in the research stage.

I’ll demonstrate the tools using two strategies, one of which has been going well, and the other not: 1) a rather generic GTAA momentum/trend-following strategy that has been running for a bit over a year, and 2) an AAPL swing trading strategy that’s been in “trial” mode for the last 6 months or so.

My performance analysis system, the QUSMA Portfolio and Trade Analytics Suite, is primarily based around the concept of a “trade”. A trade is a unit that can contain any number of related orders and cash transactions (dividends, taxes, etc.). A pair trade would include both legs in a single trade, for example. The underlying data is imported using IB’s flex queries, which have a simple, easy-to-handle XML structure.

Trades are assigned to a “strategy” and can also be assigned any number of tags. Some of the things that I use tags for are: trade direction (long/short/both), trade length, developed/developing country, asset class, etc. Notes with images can also be attached to trades, which is incredibly useful for reviews. Finally, the trades can be filtered on any number of criteria to produce reports, and compared against custom benchmarks.

A trade and its two associated orders.
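
To make the trade concept concrete, here is a minimal Python sketch of how such a unit could be modeled. The class names, fields and the `filter_trades` helper are hypothetical illustrations, not the suite's actual schema, and nothing here reflects the real flex query XML layout.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Order:
    symbol: str
    quantity: int        # signed: positive = buy, negative = sell
    price: float
    commission: float    # positive number = cost paid
    time: datetime


@dataclass
class CashTransaction:
    kind: str            # e.g. "dividend" or "tax"
    amount: float        # signed cash flow


@dataclass
class Trade:
    strategy: str
    orders: list[Order] = field(default_factory=list)
    cash_transactions: list[CashTransaction] = field(default_factory=list)
    tags: set[str] = field(default_factory=set)
    notes: list[str] = field(default_factory=list)

    def realized_pnl(self) -> float:
        """Net cash flow of a closed trade (net quantity of zero is assumed)."""
        proceeds = sum(-o.quantity * o.price for o in self.orders)
        commissions = sum(o.commission for o in self.orders)
        cash = sum(c.amount for c in self.cash_transactions)
        return proceeds - commissions + cash


def filter_trades(trades, strategy=None, required_tags=frozenset()):
    """Select trades by strategy and by tags that must all be present."""
    return [
        t for t in trades
        if (strategy is None or t.strategy == strategy)
        and set(required_tags) <= t.tags
    ]
```

Keeping orders and cash transactions inside one object means a pair trade, or a position with dividends attached, is evaluated as a single result rather than as unrelated fills.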

There are some general principles that summarize my approach to performance measurement:

  • Execution and commissions are extremely important.
  • Separate timing from sizing.
    • Statistics on trades in both dollar terms and % terms.
  • Separate capital allocations to strategies from total capital.
    • Statistics on returns both on capital allocated to a strategy (ROAC) and on total capital (ROTC).
  • Always think probabilistically and in terms of expectations.
  • The more ways you can find to look at the data, the better.
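
The distinction between ROAC and ROTC in the list above comes down to which capital base the same dollar PnL is divided by. A minimal sketch, with made-up numbers and hypothetical function names:

```python
def daily_returns(daily_pnl, capital):
    """Daily PnL divided by the capital base in effect on each day."""
    if len(daily_pnl) != len(capital):
        raise ValueError("need one capital figure per day")
    return [pnl / cap for pnl, cap in zip(daily_pnl, capital)]


def cumulative_return(returns):
    """Compound a sequence of simple returns into one total return."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


# Illustrative numbers only: a strategy allocated 25k out of a 100k account.
pnl = [250.0, -125.0, 500.0]
allocated = [25_000.0, 25_000.0, 25_000.0]
total = [100_000.0, 100_000.0, 100_000.0]

roac = daily_returns(pnl, allocated)   # return on allocated capital
rotc = daily_returns(pnl, total)       # return on total capital
```

With these inputs the ROAC series is four times the ROTC series, which is why a strategy can look strong on its own allocation while barely moving the whole account.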

 

Simple visual inspection is my starting point, and I think it’s very important. The simple act of staring at charts often leads to new research ideas.

GTAA strategy: a number of losing trades in TLT.

 

So let’s get started with the graphs and stats. At the top, the standard dollar PnL (daily and close-to-close) and equity curves (both in terms of ROAC and ROTC), which are also plotted against a benchmark:

GTAA strategy cumulative returns on allocated capital. Chart also comes in ROTC flavor.

AAPL strategy cumulative PnL.

 

Next up are the trade statistics. Commissions are right up there: it’s very important to keep in mind how much you are losing to those costs. A few basis points may not seem like much, but they can quickly eat up a significant portion of your profits. Note that all the stats are given in both dollar and percentage terms, in order to separate timing effects from sizing effects.

AAPL strategy.
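
As a rough illustration of looking at the same trades in dollar and percentage terms, the sketch below computes a few common per-trade statistics and the share of gross PnL lost to commissions. The trade tuples and function names are hypothetical, and the particular set of statistics is an assumption rather than the exact list in the report above.

```python
from statistics import mean


def trade_stats(values):
    """Win rate, average win/loss and expectancy for per-trade results.

    `values` can be dollar PnLs or percentage returns; the same function serves both views.
    """
    wins = [v for v in values if v > 0]
    losses = [v for v in values if v < 0]
    return {
        "trades": len(values),
        "win_rate": len(wins) / len(values),
        "avg_win": mean(wins) if wins else 0.0,
        "avg_loss": mean(losses) if losses else 0.0,
        "expectancy": mean(values),
    }


def commission_drag(gross_pnl, commissions):
    """Total commissions as a fraction of total gross (pre-commission) PnL."""
    return sum(commissions) / sum(gross_pnl)


# Hypothetical trades: (dollar PnL before commissions, capital committed, commission paid)
trades = [(400.0, 10_000.0, 2.0), (-150.0, 5_000.0, 2.0), (90.0, 30_000.0, 2.0)]
dollar_view = trade_stats([pnl - c for pnl, _, c in trades])
percent_view = trade_stats([(pnl - c) / size for pnl, size, c in trades])
drag = commission_drag([pnl for pnl, _, _ in trades], [c for _, _, c in trades])
```

The third trade shows the point of the two views: it is a respectable dollar winner only because of its size, while its percentage return is close to zero.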

 

Results by calendar month:

AAPL strategy. Also comes in ROTC flavor.
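
A calendar-month table like this can be built by compounding daily returns within each month. A minimal sketch, assuming daily simple returns keyed by date (the sample data is made up):

```python
from collections import defaultdict
from datetime import date


def monthly_returns(daily):
    """Compound (date, simple return) pairs into one return per calendar month."""
    growth = defaultdict(lambda: 1.0)
    for day, r in daily:
        growth[(day.year, day.month)] *= 1.0 + r
    return {month: g - 1.0 for month, g in sorted(growth.items())}


# Made-up daily returns spanning two months.
sample = [
    (date(2013, 1, 30), 0.01),
    (date(2013, 1, 31), -0.02),
    (date(2013, 2, 1), 0.005),
]
by_month = monthly_returns(sample)
```

Feeding in ROAC or ROTC daily returns gives the corresponding flavor of the table.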

 

Probably the most important bit: statistics on daily returns, the standard ratios, and so forth. The MAR ratio (annualized return divided by maximum drawdown) is the key number for me. The reason is simple: it determines my leverage constraints, and thus my returns. A high Sharpe ratio is meaningless if you can’t lever up. Note how the simple, static benchmark portfolio has destroyed the GTAA approach:

GTAA strategy. Benchmark is a 20/15/15/20/10/10/10% mix of SPY/EFA/EEM/IEF/LQD/VNQ/DBC respectively. Stats are also available for ROTC.
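
For reference, the MAR and Sharpe ratios can be computed from daily returns roughly as follows. This is a sketch under stated assumptions: 252 trading days per year, simple daily returns, and a sample standard deviation in the Sharpe ratio. The exact conventions behind the table above may differ.

```python
import math
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed annualization factor


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve, as a positive fraction."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def cagr(returns):
    """Compound annual growth rate implied by a series of daily returns."""
    growth = math.prod(1.0 + r for r in returns)
    return growth ** (TRADING_DAYS / len(returns)) - 1.0


def mar_ratio(returns):
    """CAGR divided by maximum drawdown."""
    return cagr(returns) / max_drawdown(returns)


def sharpe_ratio(returns, daily_risk_free=0.0):
    """Annualized Sharpe ratio from daily returns."""
    excess = [r - daily_risk_free for r in returns]
    return mean(excess) / stdev(excess) * math.sqrt(TRADING_DAYS)
```

Because the denominator of MAR is the worst drawdown rather than volatility, it ties directly to how much leverage a given drawdown tolerance allows.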

 

Some simple benchmarking stuff:

GTAA strategy vs diversified benchmark.

 

Histograms of daily returns, and returns per trade. Again, it’s important to look at both dollar and percentage results:

AAPL strategy.

GTAA strategy.

 

Also, a holding period histogram:

AAPL strategy.

 

Position sizing vs trade returns. Naive risk parity seems to be doing alright:

GTAA strategy.

 

Trade length vs returns. The relationship here is pretty clear:

GTAA strategy.

 

The movement capture stats measure how good the strategy is at capturing returns. GU is gross upside, or the gross positive returns during the period. UC% is the percentage of that movement that was captured by being long, UM% is the percentage of the movement that was missed by being flat, while UL% is the percentage of the movement that was lost due to being short. The calculations are repeated for downside movement.

GTAA strategy. Being long-only, only upside movement has been captured.
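
The movement capture definitions can be sketched in a few lines. The version below assumes gross movement is the plain sum of the instrument's daily returns and that the position is simply long, flat or short each day. The downside labels (GD, DC%, DM%, DL%) are hypothetical names chosen to mirror the upside ones.

```python
def movement_capture(returns, positions):
    """Split gross upside and downside movement by the position held each day.

    `returns` are the instrument's daily returns; `positions` are +1 (long),
    0 (flat) or -1 (short). Percentages are on a 0-100 scale.
    """
    up = {1: 0.0, 0: 0.0, -1: 0.0}
    down = {1: 0.0, 0: 0.0, -1: 0.0}
    for r, pos in zip(returns, positions):
        if r > 0:
            up[pos] += r
        elif r < 0:
            down[pos] -= r   # store downside as a positive magnitude
    gu, gd = sum(up.values()), sum(down.values())

    def pct(part, whole):
        return 100.0 * part / whole if whole else 0.0

    return {
        "GU": gu,
        "UC%": pct(up[1], gu),     # upside captured while long
        "UM%": pct(up[0], gu),     # upside missed while flat
        "UL%": pct(up[-1], gu),    # upside lost while short
        "GD": gd,
        "DC%": pct(down[-1], gd),  # downside captured while short
        "DM%": pct(down[0], gd),   # downside sidestepped while flat
        "DL%": pct(down[1], gd),   # downside lost while long
    }
```

By construction the three upside percentages sum to 100, as do the three downside ones, so the table reads as a full breakdown of where each move went.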

 

Cumulative percent returns, by instrument. A similar chart with dollar PnL by instrument also exists.

GTAA strategy.

 

Autocorrelation and partial autocorrelation stats based on daily returns:

GTAA strategy. High autocorrelation values can be exploited both to enhance returns and for risk management.
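
For readers who want to reproduce these numbers, here is one way to compute sample autocorrelations and partial autocorrelations from a list of daily returns in plain Python. The partial autocorrelations use the Durbin-Levinson recursion, which is a standard choice but an assumption about method; the function names are hypothetical.

```python
from statistics import mean


def autocorrelation(returns, lag):
    """Sample autocorrelation of a return series at the given lag."""
    mu = mean(returns)
    dev = [r - mu for r in returns]
    denom = sum(d * d for d in dev)
    num = sum(dev[i] * dev[i - lag] for i in range(lag, len(dev)))
    return num / denom


def partial_autocorrelations(returns, max_lag):
    """Partial autocorrelations for lags 1..max_lag via the Durbin-Levinson recursion."""
    rho = [autocorrelation(returns, k) for k in range(max_lag + 1)]
    pacf = []
    phi_prev = []  # AR coefficients of the order k-1 fit
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf
```

The lag-1 partial autocorrelation always equals the lag-1 autocorrelation; the two only diverge from lag 2 onward.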

 

Standard value at risk calculations, based on resampled historical data. I’ll be adding the option to use parametric methods in the future.

GTAA strategy: 10-day value at risk.
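
A resampling-based VaR of this kind can be sketched as follows. This version assumes daily returns are drawn independently with replacement and compounded over the horizon. The function name, defaults and quantile convention are illustrative choices, not the suite's.

```python
import random


def bootstrap_var(daily_returns, horizon=10, confidence=0.95, n_samples=10_000, seed=None):
    """Value at risk over `horizon` days, from daily returns resampled with replacement.

    Returns the loss (as a positive fraction of capital) that is exceeded in
    roughly (1 - confidence) of the simulated horizons; 0.0 if that outcome is a gain.
    """
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_samples):
        growth = 1.0
        for r in rng.choices(daily_returns, k=horizon):
            growth *= 1.0 + r
        outcomes.append(growth - 1.0)
    outcomes.sort()
    cutoff = outcomes[int((1.0 - confidence) * n_samples)]
    return max(0.0, -cutoff)
```

Independent resampling discards any autocorrelation or volatility clustering in the history, so it may understate multi-day risk for strategies whose returns are serially dependent.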

 

Monte Carlo simulation. It simply uses historical data, either trades or daily returns (either ROAC or ROTC). Sampling can be done with replacement or without (the latter simply re-orders the existing equity curve). There is also an option to use N consecutive days/trades, which can capture volatility clustering and autocorrelation effects. The analysis returns confidence intervals for the equity curve, as well as the cumulative and point distributions of maximum drawdowns.

GTAA strategy: there is a 10% chance of a drawdown worse than 18% in the next 500 trading days.
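
The sampling options described above can be illustrated with a short sketch: shuffling without replacement, or drawing blocks of N consecutive returns with replacement, then reading a quantile off the simulated maximum drawdowns. All names and defaults here are hypothetical, and the block scheme is one simple variant among several.

```python
import random


def resample(returns, length, block=1, replace=True, rng=random):
    """Draw a new return path from history.

    replace=False shuffles the existing returns (a re-ordered equity curve, so
    `length` is ignored); replace=True draws blocks of `block` consecutive
    returns until `length` is reached.
    """
    if not replace:
        path = list(returns)
        rng.shuffle(path)
        return path
    path = []
    while len(path) < length:
        start = rng.randrange(len(returns) - block + 1)
        path.extend(returns[start:start + block])
    return path[:length]


def max_drawdown(path):
    """Largest peak-to-trough decline of the compounded path, as a positive fraction."""
    equity = peak = 1.0
    worst = 0.0
    for r in path:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def drawdown_quantile(returns, length=500, q=0.9, n_paths=1000, block=1, seed=None):
    """q-quantile of simulated maximum drawdowns over `length` periods."""
    rng = random.Random(seed)
    draws = sorted(
        max_drawdown(resample(returns, length, block, True, rng))
        for _ in range(n_paths)
    )
    return draws[min(n_paths - 1, int(q * n_paths))]
```

A statement like the one in the caption corresponds to `q=0.9`: 10% of simulated paths have a maximum drawdown worse than the returned value. Setting `block` above 1 keeps short runs of consecutive days together, which preserves some of the clustering that single-day sampling destroys.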

 

Finally, some simple stats and charts on execution. All of my trades are either at the close or the open, so those are the prices I benchmark against. Below are stats from the AAPL strategy’s buy orders around the close.

Top: slippage vs time difference in seconds from benchmark. Middle: slippage by order type. Bottom: slippage histogram.
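
Slippage against a benchmark price is straightforward to compute. The sketch below uses basis points and a sign convention where positive means worse than the benchmark; both are assumptions, as are the order-type labels and prices.

```python
def slippage_bps(side, fill_price, benchmark_price):
    """Signed slippage in basis points; positive means worse than the benchmark.

    For buys, paying above the benchmark is a cost; for sells, receiving less is.
    """
    direction = 1.0 if side == "buy" else -1.0
    return direction * (fill_price - benchmark_price) / benchmark_price * 1e4


# Hypothetical buy fills around the close: (order type, fill price, closing price)
fills = [("LMT", 500.10, 500.00), ("MKT", 500.25, 500.00), ("LMT", 499.95, 500.00)]

by_type = {}
for order_type, fill, close in fills:
    by_type.setdefault(order_type, []).append(slippage_bps("buy", fill, close))
average_by_type = {k: sum(v) / len(v) for k, v in by_type.items()}
```

Grouping by order type, as in the middle panel, makes it easy to see whether the price improvement from limit orders outweighs the cost of the market orders used when they do not fill.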

 

I think that the biggest weakness in my toolset is the lack of interaction with backtesting results. These can be used in two main ways: 1) comparing theoretical results to real trading results, and 2) as an extended dataset for the risk management functions. Also, I don’t do any stock picking, but if I did, that would entail several additions, mainly performance attribution by country, sector, etc., as well as analysis of value/size/momentum factor exposures.

Leave a comment and tell us what you like to use: is the standard stuff enough for you, or do you use any obscure ratios or unique charts?