
Pairs trading with Ornstein-Uhlenbeck process (Part 1)

In this article I will implement and backtest a strategy based on the paper ‘Statistical Arbitrage in the U.S. Equities Market’ (Avellaneda and Lee, 2008). The paper describes the application of the Ornstein-Uhlenbeck process for modelling the cointegration residual (spread) in statistical arbitrage strategies.


Statistical arbitrage is basically a generalized version of pairs trading: instead of trading one stock against another stock, we trade one portfolio of stocks against another portfolio of stocks. The authors emphasize three main features of statistical arbitrage strategies:

  • ‘trading signals are systematic, or rule-based, as opposed to driven by fundamentals’
  • ‘the trading book is market-neutral, in the sense that it has zero beta with the market’
  • ‘the mechanism for generating excess returns is statistical’

In the case of pairs trading, if we have two stocks P and Q with similar characteristics, we can model their returns as follows:

dP_t / P_t = α dt + β (dQ_t / Q_t) + dX_t

Here P and Q denote the stock prices, α is a drift term (which is usually very small and can be neglected), and X_t is a stationary mean-reverting process. If X_t is small, we buy 1 dollar of stock P and sell β dollars of stock Q. If X_t is large, the positions are reversed.

Now assume that we want to trade a stock against a portfolio of its peers within a sector. We can decompose the stock returns into idiosyncratic and systematic components:

dS_t / S_t = α dt + Σ_j β_j F_t^(j) + dX_t

where the F-terms represent the returns of the other stocks in the sector (or, more generally, of other risk factors), and dX_t is the idiosyncratic component.

Our next assumption is that the stochastic process X_t follows an Ornstein-Uhlenbeck model:

dX_t = κ (m − X_t) dt + σ dW_t,   κ > 0

The parameters of this process can be easily estimated. The parameter kappa is the speed of mean reversion, and 1/kappa is the characteristic time the process takes to revert to its mean m. We are interested in stocks with a large enough kappa, so that they are expected to revert quickly.

Now we need to determine which risk factors (or assets) to use for decomposing stock returns. Three different approaches are tested in the paper:

  • using synthetic ETFs
  • using actual ETFs
  • using PCA-based portfolios

The main purpose of using synthetic ETFs was to test the strategy on data going back to 1996, when the number of available ETFs was very small. So I will skip this approach and start with the one using actual ETFs.


To start, I will use just one ETF (BBH, a biotech ETF) and its constituents. The time period used is from 11.10.2019 to 26.11.2021. Here is what I will do:

  • Use a 60-day rolling window to regress the returns of each stock against the returns of the ETF.
  • Save the regression coefficient beta for each stock on each day.
  • Calculate the residuals and estimate the parameters of the Ornstein-Uhlenbeck process.
  • For stocks with a sufficiently high speed of mean reversion (kappa > 252/30, i.e. an expected reversion time of less than 30 trading days), calculate s-scores, which will be used as trading signals.
  • Backtest the strategy using the entry and exit thresholds presented in the paper.

The process for estimating the parameters of the OU process is described in the last section of the paper. The code for it is provided below.
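Here is a minimal sketch of that estimation (the full version is in the notebook linked at the end). The function name and interface are my own, and I assume daily data with 252 trading days per year:

```python
import numpy as np

def estimate_ou(residuals, dt=1.0/252):
    """Estimate OU parameters from the residuals of the returns
    regression, following the AR(1) recipe in the appendix of
    Avellaneda & Lee (2008)."""
    # Auxiliary process: cumulative sum of the regression residuals
    X = np.cumsum(residuals)
    # AR(1) regression: X[k+1] = a + b*X[k] + zeta[k+1]
    b, a = np.polyfit(X[:-1], X[1:], 1)
    if not 0 < b < 1:
        # No mean reversion detected in this window
        return np.nan, np.nan, np.nan, np.nan
    zeta = X[1:] - (a + b * X[:-1])
    # Map the AR(1) coefficients to the OU parameters
    kappa = -np.log(b) / dt                          # speed of mean reversion
    m = a / (1.0 - b)                                # long-run mean of X
    sigma_eq = np.sqrt(np.var(zeta) / (1.0 - b**2))  # equilibrium std. dev.
    # s-score: distance of the current value of X from its mean,
    # in units of the equilibrium standard deviation
    s = (X[-1] - m) / sigma_eq
    return kappa, m, sigma_eq, s
```

The s-score measures how far the spread has wandered from its equilibrium level, which is exactly what the entry and exit rules below are based on.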

This process is repeated for each trading day and each stock, using the last 60 data points. In the end we have two dataframes: one with the values of the beta coefficients and the other with the s-scores. A sketch of the rolling loop is shown below, followed by a plot of the evolution of the s-score of the MRNA stock.
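This is a minimal sketch of the rolling loop, reusing estimate_ou from above and assuming a dataframe returns of daily stock returns with the ETF in a column named 'BBH' (the names and layout are my assumptions):

```python
import numpy as np
import pandas as pd

WINDOW = 60
KAPPA_MIN = 252 / 30  # require mean reversion within ~30 trading days

def rolling_signals(returns, etf_col='BBH'):
    stocks = [c for c in returns.columns if c != etf_col]
    betas = pd.DataFrame(index=returns.index, columns=stocks, dtype=float)
    s_scores = pd.DataFrame(index=returns.index, columns=stocks, dtype=float)
    for t in range(WINDOW, len(returns)):
        window = returns.iloc[t - WINDOW:t]
        day = returns.index[t]
        for stock in stocks:
            # Regress stock returns on ETF returns over the window
            beta, alpha = np.polyfit(window[etf_col], window[stock], 1)
            resid = window[stock] - (alpha + beta * window[etf_col])
            kappa, m, sigma_eq, s = estimate_ou(resid.values)
            betas.loc[day, stock] = beta
            if kappa > KAPPA_MIN:  # keep only fast mean-reverters
                s_scores.loc[day, stock] = s
    return betas, s_scores
```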

Evolution of the s-score of MRNA vs. BBH

After that we calculate our positions according to the following rules:

  • open short position if s_score > 1.25
  • close short position if s_score < 0.75
  • open long position if s_score < -1.25
  • close long position if s_score > -0.5

A long position means buying 1 dollar of the stock and selling (shorting) beta dollars of the ETF. A short position means selling 1 dollar of the stock and buying beta dollars of the ETF.
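A sketch of how these rules can be turned into daily stock positions (+1 for long, -1 for short, 0 for flat); the thresholds are from the paper, while the state-machine loop is my own scaffolding:

```python
import numpy as np
import pandas as pd

def positions_from_scores(s_scores):
    positions = pd.DataFrame(0.0, index=s_scores.index, columns=s_scores.columns)
    state = {stock: 0 for stock in s_scores.columns}
    for day, row in s_scores.iterrows():
        for stock, s in row.items():
            if np.isnan(s):
                state[stock] = 0  # no valid signal: go flat (my convention)
            elif state[stock] == 0:
                if s > 1.25:
                    state[stock] = -1   # open short
                elif s < -1.25:
                    state[stock] = 1    # open long
            elif state[stock] == -1 and s < 0.75:
                state[stock] = 0        # close short
            elif state[stock] == 1 and s > -0.5:
                state[stock] = 0        # close long
            positions.loc[day, stock] = state[stock]
    return positions
```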

Now we need to determine how much capital to allocate to each position. Note that we can use the capital from short positions to enter long positions. I am going to assume that the broker requires 50% collateral, so we can trade with twice the amount of capital we have. I will allocate the same amount of capital to long and short positions and divide that capital equally among the positions, so that the weights of all current short positions sum to -1 and the weights of all current long positions sum to +1. For example, if I have 2 short positions and 3 long positions, I will assign a weight of -0.5 to each short position and of 0.333 to each long position.

Here I am only taking into account the positions in stocks. To calculate the size of the position in the ETF, we multiply the beta of each stock by its weight and sum the results.
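A sketch of the weighting scheme and the ETF hedge, following the conventions above:

```python
def portfolio_weights(positions, betas):
    longs = positions.clip(lower=0)
    shorts = positions.clip(upper=0)
    # Divide capital equally within each side: longs sum to +1, shorts to -1
    weights = (longs.div(longs.sum(axis=1), axis=0).fillna(0)
               + shorts.div(shorts.abs().sum(axis=1), axis=0).fillna(0))
    # Each 1-dollar stock position is hedged with -beta dollars of the ETF
    etf_weight = -(weights * betas).sum(axis=1)
    return weights, etf_weight
```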

In the paper the authors claim that the fraction of capital allocated to the ETF should be small, because the betas of the long and short positions will cancel each other. I checked it, and indeed in our case the fraction of capital allocated to the ETF is less than 0.1 on 80% of the trading days.

Now we can calculate the returns of the strategy. Note that I divide the daily weighted returns by two, to account for the fact that we are trading with twice the amount of capital we actually have.
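A sketch of this calculation, assuming returns and etf_returns hold the daily returns of the stocks and of the ETF (the one-day shift, so that positions earn the next day's return, is a common backtesting convention I am assuming here):

```python
def strategy_returns(weights, etf_weight, returns, etf_returns):
    # Positions decided at day t earn the returns of day t+1
    stock_pnl = (weights.shift(1) * returns[weights.columns]).sum(axis=1)
    etf_pnl = etf_weight.shift(1) * etf_returns
    # Divide by two: we deploy twice the capital we actually have
    return (stock_pnl + etf_pnl) / 2.0
```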

Plot of cumulative returns

Above you can see the plot of the strategy returns compared with the returns of the SPY and BBH ETFs. Although the returns of our strategy are more volatile, it outperforms both SPY and BBH. But we need to account for transaction costs. In the paper the round-trip transaction cost is set to 0.1%, which means a one-way cost of 0.05%. I will use the same value.
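One simple way to apply this cost is to charge it on each day's turnover; a minimal sketch under that assumption:

```python
COST = 0.0005  # 0.05% one-way transaction cost

def net_returns(gross_returns, weights, etf_weight):
    # Turnover: total change in the position weights (stocks + ETF hedge)
    turnover = (weights.diff().abs().sum(axis=1)
                + etf_weight.diff().abs()).fillna(0)
    # Same halving as the returns, since weights are per deployed capital
    return gross_returns - COST * turnover / 2.0
```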

Plot of cumulative returns (with transaction costs)

Even with transaction costs our algorithm is still more profitable than SPY or BBH ETFs. Let’s look at some of the performance metrics.

Performance metrics

The Sharpe ratio of the strategy (with transaction costs) is 1.16, which is higher than the Sharpe ratios of SPY and BBH over the same period, but I believe it is still not very good for an algorithmic trading strategy.

Now I will use Monte Carlo simulations to check how many random strategies achieve similar results. I’ve calculated that on average we hold 4.9 long positions and 4.3 short positions every day, so I will generate a random list of 5 long and 4 short positions for each trading day and calculate the return of such a strategy.
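A sketch of one such simulation; the per-day position counts come from the averages above, while the rest (function name, equal weighting, the halving convention) is my own scaffolding:

```python
import numpy as np

rng = np.random.default_rng()

def random_strategy_return(returns, etf_col='BBH', n_long=5, n_short=4):
    stocks = [c for c in returns.columns if c != etf_col]
    daily = []
    for t in range(1, len(returns)):
        picks = rng.choice(stocks, size=n_long + n_short, replace=False)
        r = returns.iloc[t]
        # Equal-weighted random longs and shorts, same halving as before
        daily.append((r[picks[:n_long]].mean() - r[picks[n_long:]].mean()) / 2.0)
    return np.prod(1.0 + np.array(daily)) - 1.0
```

Repeating this many times and comparing each total return with the strategy’s gives the percentage quoted below.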

After running 10,000 simulations we see that only 1.62% of them have a total return bigger than ours, so we can be somewhat confident that the returns we get with the algorithm are not accidental.


Now I’m going to perform the same tests on a different ETF. I will use XLF, a financial sector ETF, and its top 15 holdings. The time period will be the same.

Plot of cumulative returns

We can see above that the strategy outperforms SPY and XLF ETFs, but we need to account for transaction costs.

Plot of cumulative returns (with transaction costs)

With transaction costs our strategy still outperforms XLF, but not SPY. One possible explanation is that this market is a lot more efficient than the biotech sector, since here we used stocks of some of the largest companies with huge market caps. Notice also how well the strategy performed during the recession of 2020, when the market was probably not at its most rational and efficient. Now let’s look at the performance metrics.

Performance metrics

The Sharpe ratio of our strategy is slightly bigger than that of the SPY ETF, and its maximum drawdown is a little smaller, but overall it doesn’t look very good. Let’s look at the results achieved by Monte Carlo simulated trades. Our strategy holds on average 3 long and 3 short positions each day, so I will use the same numbers for the simulations.

Here we have similar results as before: only 1.52% of simulations achieved a larger total return than our strategy. This means that it is unlikely that the returns achieved by the strategy were generated by chance alone.


Some possible further improvements:

  • Use more risk factors for decomposing returns (e.g., add other ETFs from the same sector or some market indices)
  • Use several industries/sectors for diversification
  • Use PCA-based portfolios instead of ETFs
  • Estimate signals in ‘trading time’, taking daily transaction volume into account
  • Adjust threshold values of s-scores for opening and closing positions

I will implement and test some of these improvements in the following article.


Jupyter notebook with source code is available here.

If you have any questions, suggestions or corrections please post them in the comments. Thanks for reading.


References

[1] M. Avellaneda and J.-H. Lee, ‘Statistical Arbitrage in the U.S. Equities Market’ (2008)
