
Blueprint for a Backtesting and Trading Software Suite

Posting has been slow lately because I’ve been busy with a bunch of other stuff, including the CFA Level 3 exam last weekend. I’ve also begun work on a very ambitious project: a fully-featured all-in-one backtesting and live trading suite, which is what prompted this post.

Over the last half year or so I’ve been moving toward more complex tools (away from Excel, R, and MATLAB), and generally just writing standalone backtesters in C# for every concept I wanted to try out, only using Multicharts for the simplest ideas. This approach is, of course, incredibly inefficient, but the software packages available to “retail” traders are notoriously horrible, and I have nowhere near the capital I’d need to afford “real” tools like QuantFACTORY or Deltix.

The good thing about knowing how to code is that if a tool doesn’t exist, you can just write it, and that’s exactly what I’m doing: proper portfolio-level backtesting and live trading that can easily handle everything from intraday pairs trading to long-term asset allocation and everything in between, all under the same roof. On the other hand it’s tailored to my own needs, and as such there are no plans for things like handling fundamental data. Most importantly, it’s my dream research platform: one that’ll let me go from idea, to robust testing & optimization, to implementation very quickly. Here’s what the basic design looks like:

[Image: QBS design chart]

What’s the point of posting about it? I know there are many other people out there facing the same issues I am, so hopefully I can provide some inspiration and ideas on how to solve them. Maybe it’ll prompt some discussion and idea-bouncing, or perhaps even collaboration.

Most of the essential infrastructure has already been laid down, so basic testing is possible. A simple example based on my previous post can showcase some of the essential features. Below you’ll find the code behind the PatternFinder indicator, which uses the Accord.NET library’s k-d tree and k-nearest-neighbor implementations to do candlestick pattern searches as discussed here. Many elements are specific to my system, but the core functionality is trivially portable if you want to borrow it.

Note the use of attributes to mark properties as inputs and set their default values. Options can be serialized and deserialized for easy storage in files or a database. Priority settings let the user specify the order of execution, which can be very important in some cases. Indexer access works with [0] being the current bar, [1] the previous bar, and so on. Separate methods for historical and real-time bars allow for a ton of optimization to speed up processing when time is scarce, though in this case there isn’t much that can be done.

[Image: showcase_classes]
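
As an aside, here’s a rough sketch of how that sort of input attribute could be built. The names and the reflection-based helper below are hypothetical illustrations, not the actual platform code.

using System;
using System.Linq;

// Hypothetical sketch: an attribute that marks a property as a user-settable input
// and carries its default value, plus a helper that applies the defaults via
// reflection and dumps the current settings to a string for storage.
[AttributeUsage(AttributeTargets.Property)]
public class InputAttribute : Attribute
{
    public object DefaultValue { get; private set; }

    public InputAttribute(object defaultValue)
    {
        DefaultValue = defaultValue;
    }
}

public static class InputHelper
{
    // Set every [Input]-decorated property to its declared default value
    public static void ApplyDefaults(object target)
    {
        foreach (var prop in target.GetType().GetProperties())
        {
            var attr = (InputAttribute)Attribute.GetCustomAttribute(prop, typeof(InputAttribute));
            if (attr != null)
                prop.SetValue(target, Convert.ChangeType(attr.DefaultValue, prop.PropertyType), null);
        }
    }

    // Serialize the current input values, e.g. "PatternLength=3;MatchCount=75;..."
    public static string SerializeInputs(object target)
    {
        var parts = target.GetType().GetProperties()
            .Where(p => Attribute.IsDefined(p, typeof(InputAttribute)))
            .Select(p => p.Name + "=" + p.GetValue(target, null));
        return string.Join(";", parts);
    }
}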

The VariableSeries class is designed to hold time series, synchronize them across the entire parent object, prevent data snooping, and so on. The Indicator and Signal classes are both derived from VariableSeries, which is the basis for the system’s modularity. For example, in the PatternFinder indicator the OHLC inputs can be modified by the user through the UI, e.g. to make use of the values of another indicator rather than the instrument data.

[Image: showcase_options]
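
To give a rough idea of what such a series looks like, here’s a bare-bones sketch of a bar-indexed series with the [0]-is-the-current-bar indexer and a simple guard against look-ahead. It’s a simplified illustration under my own naming (SimpleSeries), not the actual VariableSeries implementation.

using System;
using System.Collections.Generic;

// Simplified illustration of a bar-synchronized series: [0] is the current bar,
// [1] the previous one, and reaching past the available history throws instead
// of quietly returning missing or future data.
public class SimpleSeries<T>
{
    private readonly List<T> _data = new List<T>();

    public int CurrentBar
    {
        get { return _data.Count - 1; }
    }

    // Called by the parent object once per new bar
    public void AddBar(T value)
    {
        _data.Add(value);
    }

    // The current bar's value; setting it overwrites the latest bar
    public T Value
    {
        get { return _data[_data.Count - 1]; }
        set { _data[_data.Count - 1] = value; }
    }

    // [0] = current bar, [1] = previous bar, and so on
    public T this[int barsAgo]
    {
        get
        {
            if (barsAgo < 0)
                throw new ArgumentOutOfRangeException("barsAgo", "Negative offsets would mean peeking into the future.");
            if (barsAgo > CurrentBar)
                throw new ArgumentOutOfRangeException("barsAgo", "Not enough history.");
            return _data[_data.Count - 1 - barsAgo];
        }
    }
}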

The backtest analysis side is still in its early stages, but again the foundations have been laid. Here are some stats using a two-day PatternFinder combined with IBS, applied to SPY:

[Image: showcase_patternfinderresults]

Here’s the first iteration of the signal analysis interface. I’ve added three more signals to the backtest: going long for one day at every 15-day low close, the set-up Rob Hanna posted yesterday over at Quantifiable Edges (staying in for 5 days after the set-up appears), and UDIDSRI. The idea is to be able to easily spot redundant set-ups, find synergies or anti-synergies between signals, and get a quick sense of the marginal value added by any one particular signal.

[Image: showcase_signals]
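
The first of those rules (going long at a 15-day low close) is simple enough to sketch outside the framework. The function below is only an illustration of the entry condition over an array of closes, not how it’s actually wired into the signal classes.

public static class SignalSketch
{
    // Illustration only: flag the bars where the close is the lowest close of the
    // last `lookback` bars (inclusive); the signal then goes long for the next day.
    public static bool[] LowestCloseSignal(double[] closes, int lookback = 15)
    {
        var signal = new bool[closes.Length];
        for (int i = lookback - 1; i < closes.Length; i++)
        {
            bool isLowest = true;
            for (int j = i - lookback + 1; j < i; j++)
            {
                if (closes[j] <= closes[i])
                {
                    isLowest = false;
                    break;
                }
            }
            signal[i] = isLowest;
        }
        return signal;
    }
}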

And here’s some basic Monte Carlo simulation stuff, with confidence intervals for cumulative returns and PDF/CDF of the maximum drawdown distribution:

[Image: showcase_mc]
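
For reference, the core of that kind of simulation fits in a few lines: resample the realized daily returns with replacement, build equity paths, and collect each path’s maximum drawdown to get its empirical distribution. This is just an illustrative bootstrap under my own naming, not the platform’s simulation code.

using System;
using System.Linq;

public static class MonteCarloSketch
{
    // Bootstrap daily returns: builds one simulated equity path per run and
    // records the maximum drawdown observed on that path.
    public static double[] SimulateMaxDrawdowns(double[] dailyReturns, int paths, int length, Random rng)
    {
        var maxDrawdowns = new double[paths];
        for (int p = 0; p < paths; p++)
        {
            double equity = 1.0, peak = 1.0, maxDd = 0.0;
            for (int t = 0; t < length; t++)
            {
                // Sample a historical daily return with replacement
                equity *= 1.0 + dailyReturns[rng.Next(dailyReturns.Length)];
                if (equity > peak) peak = equity;
                double dd = 1.0 - equity / peak;
                if (dd > maxDd) maxDd = dd;
            }
            maxDrawdowns[p] = maxDd;
        }
        return maxDrawdowns;
    }

    // Empirical CDF value at x, e.g. P(max drawdown <= 20%)
    public static double EmpiricalCdf(double[] samples, double x)
    {
        return (double)samples.Count(s => s <= x) / samples.Length;
    }
}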

Here’s the code for the PatternFinder indicator. Obviously it’s written for my platform, but it should be easily portable. The “meat” is all in CalcHistorical() and GetExpectancy().

using System;
using System.Linq;
// Assumes Accord.NET's spatial trees; KDTree<T> lives in Accord.MachineLearning.Structures
// in the 2.x releases (Accord.Collections in later versions)
using Accord.MachineLearning.Structures;

/// <summary>
/// K-nearest-neighbor search for candlestick patterns
/// </summary>
public class PatternFinder : Indicator
{
    [Input(3)]
    public int PatternLength { get; set; }

    [Input(75)]
    public int MatchCount { get; set; }

    [Input(2000)]
    public int MinimumWindowSize { get; set; }

    [Input(false)]
    public bool VolatilityAdjusted { get; set; }

    [Input(false)]
    public bool Overnight { get; set; }

    [Input(false)]
    public bool WeighExpectancyByDistance { get; set; }

    [Input(false)]
    public bool Classification { get; set; }

    [Input(0.002)]
    public double ClassificationLimit { get; set; }

    [Input("Euclidean")]
    public string DistanceType { get; set; }

    [SeriesInput("Instrument.Open")]
    public VariableSeries<decimal> Open { get; set; }

    [SeriesInput("Instrument.High")]
    public VariableSeries<decimal> High { get; set; }

    [SeriesInput("Instrument.Low")]
    public VariableSeries<decimal> Low { get; set; }

    [SeriesInput("Instrument.Close")]
    public VariableSeries<decimal> Close { get; set; }

    [SeriesInput("Instrument.AdjClose")]
    public VariableSeries<decimal> AdjClose { get; set; }

    private VariableSeries<double> returns;
    private VariableSeries<double> stDev;
    private KDTree<double> _tree;

    public PatternFinder(QSwing parent, string name = "PatternFinder", int BarsCount = 1000)
        : base(parent, name, BarsCount)
    {
        Priority = 1;
        returns = new VariableSeries<double>(parent, BarsCount);
        stDev = new VariableSeries<double>(parent, BarsCount) { DefaultValue = 1 };
    }

    internal override void Startup()
    {
        _tree = new KDTree<double>(PatternLength * 4 - 1);
        switch (DistanceType)
        {
            case "Euclidean":
                _tree.Distance = Accord.Math.Distance.Euclidean;
                break;
            case "Absolute":
                _tree.Distance = AbsDistance;
                break;
            case "Chebyshev":
                _tree.Distance = Accord.Math.Distance.Chebyshev;
                break;
            default:
                _tree.Distance = Accord.Math.Distance.Euclidean;
                break;
        }
    }

    public override void CalcHistorical()
    {
        // Track close-to-close returns and their trailing 10-day standard deviation
        // for the optional volatility adjustment
        if (VolatilityAdjusted && CurrentBar > 0)
            returns.Value = (double)(AdjClose[0] / AdjClose[1] - 1);

        if (VolatilityAdjusted && CurrentBar > 11)
            stDev.Value = returns.StandardDeviation(10);

        // Not enough bars yet to form a complete pattern
        if (CurrentBar < PatternLength + 1) return;

        // Once the minimum window has been accumulated, produce the expectancy forecast for this bar
        if (CurrentBar > MinimumWindowSize)
            Value = GetExpectancy(GetCoords());

        // Add the pattern ending at the previous bar to the tree, keyed to the
        // (optionally vol-adjusted) return that followed it
        double ret = Overnight ? (double)(Open[0] / Close[1] - 1) : (double)(AdjClose[0] / AdjClose[1] - 1);
        double adjret = ret / stDev[0];

        if (Classification)
            _tree.Add(GetCoords(1), adjret > ClassificationLimit ? 1 : 0);
        else
            _tree.Add(GetCoords(1), adjret);
    }

    public override void CalcRealTime()
    {
        // Real-time bars only produce a forecast; nothing is added to the tree here
        if (VolatilityAdjusted && CurrentBar > 0)
            returns.Value = (double)(AdjClose[0] / AdjClose[1] - 1);

        if (VolatilityAdjusted && CurrentBar > 11)
            stDev.Value = returns.StandardDeviation(10);

        if (CurrentBar > MinimumWindowSize)
            Value = GetExpectancy(GetCoords());
    }

    private double GetExpectancy(double[] coords)
    {
        // Average the returns that followed the MatchCount nearest historical patterns,
        // scaled back up by the current volatility estimate
        if (!WeighExpectancyByDistance)
            return _tree.Nearest(coords, MatchCount).Average(x => x.Node.Value) * stDev[0];
        else
        {
            // Inverse-square-distance weighting: closer matches count for more
            var nodes = _tree.Nearest(coords, MatchCount);
            double totweight = nodes.Sum(x => 1 / Math.Pow(x.Distance, 2));
            return nodes.Sum(x => x.Node.Value * ((1 / Math.Pow(x.Distance, 2)) / totweight)) * stDev[0];
        }
    }

    // Manhattan (L1) distance
    private static double AbsDistance(double[] x, double[] y)
    {
        return x.Select((t, i) => Math.Abs(t - y[i])).Sum();
    }

    private double[] GetCoords(int offset = 0)
    {
        // Each bar contributes its open, high, and low normalized by its own close,
        // plus the close-to-close ratio linking it to the preceding bar
        double[] coords = new double[PatternLength * 4 - 1];
        for (int i = 0; i < PatternLength; i++)
        {
            coords[4 * i] = (double)(Open[i + offset] / Close[i + offset]);
            coords[4 * i + 1] = (double)(High[i + offset] / Close[i + offset]);
            coords[4 * i + 2] = (double)(Low[i + offset] / Close[i + offset]);

            if (i < PatternLength - 1)
                coords[4 * i + 3] = (double)(Close[i + offset] / Close[i + 1 + offset]);
        }
        return coords;
    }
}

Coming up Soon™: a series of posts on cross-validation, an in-depth paper on IBS, and possibly a theory-heavy paper on the low volatility effect.
