Portfolio Backtesting

3. In-sample versus out-of-sample backtesting



3.1          Short-cutting the future by referring merely to the past introduces look-back bias. Exactly how this works out in practice depends on how the backtesting is carried out.


3.2          One way of carrying out a backtest would be to take a single model of how the future might evolve and then to apply the same model to every prior period. This is called in-sample backtesting. The key issue with such an approach is that the model will typically have been formulated by reference to past history including the past that we are then testing the model against. Thus, unless we have been particularly inept at fitting the past when constructing the risk model in the first place, we should find that it is a reasonable fit in an in-sample, i.e. ex-post, comparison. We cannot then conclude much from its apparent goodness of fit.
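The circularity described above can be made concrete with a minimal sketch in Python. The data (synthetic normally distributed daily returns) and the risk model (a simple 99% historical-simulation Value-at-Risk) are both assumptions chosen purely for illustration; the point is only that the model is estimated from, and then tested against, the same history.

```python
import numpy as np

# Synthetic daily returns standing in for "past history" (an assumption
# for illustration, not real market data).
rng = np.random.default_rng(0)
returns = rng.normal(0.0005, 0.01, 1000)

# In-sample: the risk model (here a 99% historical-simulation VaR)
# is estimated from the FULL history...
var_99 = -np.quantile(returns, 0.01)

# ...and then backtested against that very same history.
breaches = int(np.sum(returns < -var_99))
breach_rate = breaches / len(returns)

# By construction the breach rate lands close to the nominal 1%,
# so the apparent goodness of fit tells us little about the model.
print(f"in-sample breach rate: {breach_rate:.3%}")
```

Because the VaR threshold is the 1% quantile of the data it is tested on, the breach frequency is essentially guaranteed to match its nominal level, illustrating why an in-sample fit is weak evidence.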


3.3          Backtesters attempt to mitigate this problem by using so-called out-of-sample testing. This involves specifying how to construct a model using only data available up to a particular point in time. We then apply the model construction algorithm only to observations that occurred after the end of the sample period used to estimate the model, i.e. out of the sample in question. The model might be estimated once, using a particular earlier period of time, with the same model then applied in each time period thereafter. Alternatively, the model might be re-estimated at the start of each time period using only the data that would have been available at that point, so that the time period about to occur is still (just) out of the in-sample period.
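The second, rolling variant can be sketched as a walk-forward backtest. Again the synthetic returns, the 500-day estimation window and the 99% VaR model are illustrative assumptions only; the essential feature is that each period's model is built solely from data preceding that period.

```python
import numpy as np

# Synthetic daily returns (an assumption for illustration).
rng = np.random.default_rng(1)
returns = rng.normal(0.0005, 0.01, 1500)

window = 500  # estimation window: the data "available up to now"
breaches = 0
n_tests = 0

for t in range(window, len(returns)):
    # Re-estimate the model at the start of each period using only
    # observations strictly before time t...
    past = returns[t - window:t]
    var_99 = -np.quantile(past, 0.01)

    # ...then test it on the next return, which is (just) out of sample.
    if returns[t] < -var_99:
        breaches += 1
    n_tests += 1

print(f"out-of-sample breach rate: {breaches / n_tests:.3%}")
```

Here a breach rate well away from the nominal 1% would be genuine evidence against the model, because no observation is ever used to estimate the model that judges it.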


3.4          Whilst out-of-sample modelling does reduce look-back bias, it does not eliminate it. Risk models ultimately involve lots of different assumptions about how the future might evolve, not least the format of the risk model itself. In the background there are lots of competing risk models that we might have considered suitable for the problem. Not too surprisingly, the only ones that actually see the light of day, and therefore get formally assessed in an out-of-sample context, are those likely to be tolerably good at fitting the past even in an out-of-sample context. Risk modellers are clever enough to winnow out those that would obviously fail such a test before the test is actually carried out. This point is perhaps even more relevant to backtesting of return generating algorithms, given the human tendency to rationalise explanations for success or failure, perhaps even when there is no such explanation, see e.g. Taleb (2004).

