financialnoob.me

Blog about quantitative finance

Pairs trading with wavelet transform

This article is based on the paper ‘Pairs Trading with Wavelet Transform’ (Eroglu et al. 2022). The authors claim that using a wavelet transform to remove noise from stock price series significantly improves the profitability of several pairs trading strategies. I am going to test several strategies and see how the wavelet transform affects their performance.

I am not going to delve into the theory behind the wavelet transform. All we need to know is that it is a data analysis tool widely used in signal processing. We are going to use it to remove noise from price time series. Luckily, the PyWavelets library provides all the necessary functionality, so we don’t have to implement it from scratch.


As my stock universe I am going to use the 200 stocks with the largest market cap in the S&P 500 index. The training period is 30 months, from 2017-01-01 to 2019-06-30. The trading period is 6 months, from 2019-07-01 to 2019-12-31.

First let’s prepare our data.

In the last two rows I remove one datapoint to make the length of each time series even. This is necessary for applying the wavelet transform.
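The data preparation code is not reproduced here, so here is a minimal sketch of the split and the even-length trim. It assumes a pandas DataFrame of daily closing prices indexed by date. The helper name `split_and_trim`, the synthetic prices and the choice to drop the first row (rather than the last) are all my assumptions for illustration.

```python
import numpy as np
import pandas as pd

def split_and_trim(prices, train_start, train_end, test_start, test_end):
    """Split prices into training and trading periods with an even number of rows each."""
    train = prices.loc[train_start:train_end]
    test = prices.loc[test_start:test_end]
    # Assumption: when a period has an odd number of rows, drop its first row.
    if len(train) % 2 == 1:
        train = train.iloc[1:]
    if len(test) % 2 == 1:
        test = test.iloc[1:]
    return train, test

# Synthetic random-walk prices standing in for the real downloaded data.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2017-01-01", "2019-12-31")
log_returns = rng.normal(0, 0.01, (len(dates), 3))
prices = pd.DataFrame(
    100 * np.exp(np.cumsum(log_returns, axis=0)),
    index=dates,
    columns=["AAA", "BBB", "CCC"],  # hypothetical tickers
)

train, test = split_and_trim(
    prices, "2017-01-01", "2019-06-30", "2019-07-01", "2019-12-31"
)
```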

I will test 3 different strategies:

  • Strategy based on distance
  • Strategy based on cointegration
  • Strategy based on Granger-causality

Trading rules are the same for all strategies: open a position when the spread is more than 2 standard deviations away from zero, and close it when the spread returns to zero. The only differences are the pair selection process and the spread construction.
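The trading rule can be sketched as a small state machine. This is only an illustration under my own assumptions: the function name `positions_from_spread` is hypothetical, +1 means long the spread, -1 means short the spread, and a position opened or closed at time t is based on the spread observed at time t.

```python
import numpy as np

def positions_from_spread(spread, sd, entry=2.0):
    """Return -1 (short spread), +1 (long spread) or 0 (flat) for each observation."""
    spread = np.asarray(spread, dtype=float)
    positions = np.zeros(len(spread))
    current = 0.0
    for t, s in enumerate(spread):
        if current == 0.0:
            # Open when the spread is more than `entry` SDs away from zero.
            if s > entry * sd:
                current = -1.0
            elif s < -entry * sd:
                current = 1.0
        elif (current == -1.0 and s <= 0) or (current == 1.0 and s >= 0):
            # Close when the spread has returned to (or crossed) zero.
            current = 0.0
        positions[t] = current
    return positions

spread = np.array([0.0, 1.0, 2.5, 1.5, 0.5, -0.2, -2.3, -1.0, 0.1, 0.0])
positions = positions_from_spread(spread, sd=1.0)
```

Because the position at time t uses the spread at time t, a backtest should apply it to the return of the following period to avoid look-ahead.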


Let’s start with the distance-based strategy. We need to use cumulative returns instead of prices here, so that the calculated distances are comparable across pairs.

To select pairs for trading we calculate the sum of squared differences (SSD) between the cumulative returns of each pair of stocks. At the same time I also calculate the standard deviation of the spread, which will be used later in trading.
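A minimal sketch of this calculation, assuming the spread of a pair is simply the difference between the two cumulative return series (each price series divided by its first value). The function name `ssd_table` and the column names are hypothetical.

```python
from itertools import combinations

import pandas as pd

def ssd_table(prices):
    """Compute SSD and spread standard deviation for every pair of columns."""
    cumret = prices / prices.iloc[0]  # cumulative return index, starts at 1
    rows = []
    for a, b in combinations(cumret.columns, 2):
        spread = cumret[a] - cumret[b]
        rows.append({
            "stock1": a,
            "stock2": b,
            "ssd": float((spread ** 2).sum()),
            "spread_sd": float(spread.std()),
        })
    # Smallest distance first.
    return pd.DataFrame(rows).sort_values("ssd").reset_index(drop=True)

toy_prices = pd.DataFrame({
    "A": [1.0, 2.0, 3.0],
    "B": [2.0, 4.0, 6.0],
    "C": [1.0, 3.0, 2.0],
})
table = ssd_table(toy_prices)
```

In the toy example A and B have identical cumulative returns, so that pair comes first with an SSD of zero.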

Pairs sorted by SSD

Now we need to select some of the pairs with the smallest SSD. I’m going to do it based on the in-sample Sharpe ratio: I take the 200 pairs with the smallest SSD from the dataframe above and backtest them on the training data.

For each pair I calculate the in-sample Sharpe ratio. The results are shown below. I’m showing only pairs with a Sharpe ratio bigger than one.

In-sample Sharpe ratios

We have 103 potential pairs for trading. First let’s assume that we trade all of them. At the beginning of the trading period we divide our capital equally among the 103 pairs. Code for backtesting is shown below.

Let’s look at some statistics for individual pairs.

Statistics for individual pairs

We can see above that about 63% of pairs have positive returns. About 6% of pairs have no positions during the trading period, which means that the capital allocated to them was not used.

Now we can calculate the total return and performance metrics for this strategy.
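The exact metric definitions are not shown in the text, so the following is only a sketch under common conventions: equal weight per pair (approximated here by averaging daily pair returns), a zero risk-free rate and 252 trading days per year. The function name `performance_metrics` and the dictionary keys are hypothetical.

```python
import numpy as np
import pandas as pd

def performance_metrics(pair_returns, periods_per_year=252):
    """pair_returns: DataFrame of daily returns, one column per traded pair."""
    portfolio = pair_returns.mean(axis=1)  # equal capital per pair
    cumulative = (1 + portfolio).cumprod()
    total_return = cumulative.iloc[-1] - 1
    years = len(portfolio) / periods_per_year
    annual_return = (1 + total_return) ** (1 / years) - 1
    # Annualized Sharpe ratio with a zero risk-free rate (assumption).
    sharpe = np.sqrt(periods_per_year) * portfolio.mean() / portfolio.std()
    max_drawdown = (cumulative / cumulative.cummax() - 1).min()
    return {
        "total_return": total_return,
        "annual_return": annual_return,
        "sharpe": sharpe,
        "max_drawdown": max_drawdown,
    }

toy_returns = pd.DataFrame({
    "pair1": [0.01, -0.01, 0.02],
    "pair2": [0.03, 0.01, 0.00],
})
metrics = performance_metrics(toy_returns)
```

The same Sharpe formula applied to a single pair's training-period returns gives an in-sample Sharpe ratio of the kind used for pair selection.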

Cumulative returns (simple distance)
Performance metrics

You can see that this simple strategy has surprisingly good performance. I think that we are just lucky with the selected stock universe and time period.

Now let’s modify this strategy a little bit and trade only 10 pairs. I am going to select the first 10 pairs that have open signals. That way we are going to use all the capital (equally divided among 10 pairs).

Pair selection is done as follows.
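The selection code is not reproduced here. One possible reading of the rule, sketched below, is to walk through the trading period day by day and pick pairs in the order in which they first open a position, until 10 pairs are selected. The function name `first_pairs_with_signal` and the layout of the `positions` DataFrame (dates in rows, pairs in columns, values -1/0/+1) are my assumptions.

```python
import pandas as pd

def first_pairs_with_signal(positions, n_pairs=10):
    """Return the first n_pairs pairs that open a position, in chronological order.

    Ties on the same day are broken by column order.
    """
    selected = []
    for _, row in positions.iterrows():
        for pair in row[row != 0].index:
            if pair not in selected:
                selected.append(pair)
                if len(selected) == n_pairs:
                    return selected
    return selected

toy_positions = pd.DataFrame(
    {"p1": [0, 0, 1], "p2": [1, 1, 0], "p3": [0, -1, -1], "p4": [0, 0, 0]},
    index=pd.to_datetime(["2019-07-01", "2019-07-02", "2019-07-03"]),
)
top2 = first_pairs_with_signal(toy_positions, n_pairs=2)
```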

Now we calculate total return and performance metrics.

Cumulative returns (simple distance, top10 pairs)
Performance metrics

You can see above that this strategy is a little riskier (has a smaller Sharpe ratio), but it gives larger returns. This is to be expected: less diversification (fewer traded pairs) leads to higher risk.


Now let’s test the same two strategies with the wavelet transform. First we use the wavelet transform to remove noise from the training period prices. Then we use the filtered prices to calculate cumulative returns, which are used for pair selection and spread construction.

The PyWavelets library does not have the sym22 wavelet used in the paper, but the authors claim that the sym12 wavelet gives similar results, so that’s what I’m using. We separate each time series into two components: a long-run component (which we are going to use) and a short-run component (noise).
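A minimal sketch of such a filter with PyWavelets, assuming a single-level discrete wavelet transform in which the detail coefficients (short-run component) are discarded and the series is rebuilt from the approximation coefficients (long-run component) only. The decomposition level and the function name `wavelet_filter` are my assumptions; the original code may differ.

```python
import numpy as np
import pywt

def wavelet_filter(series, wavelet="sym12"):
    """Return the long-run component of a series with an even number of points."""
    x = np.asarray(series, dtype=float)
    if len(x) % 2 != 0:
        raise ValueError("series length must be even")
    # Single-level decomposition into approximation and detail coefficients.
    approx, detail = pywt.dwt(x, wavelet)
    # Passing None for the detail coefficients treats them as zeros.
    return pywt.idwt(approx, None, wavelet)

# Synthetic random-walk prices for illustration.
rng = np.random.default_rng(1)
raw = 100 + np.cumsum(rng.normal(0, 1, 200))
filtered = wavelet_filter(raw)
```

With an even-length input the reconstructed series has the same length as the original, which is why the data were trimmed to an even number of points.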

Code for performing a backtest is very similar, so I’m not going to repost it here. Pairs selected for trading are shown below.

Potential pairs

Now we have only 80 pairs with in-sample Sharpe ratios bigger than one.

Statistics for individual pairs

We get 61% of pairs with positive returns and 10% of pairs with no positions. This is similar to the previous results without wavelet transform. Now let’s look at cumulative returns and performance metrics of this strategy.

Cumulative returns (distance with WT)
Cumulative returns (distance with WT, top10 pairs)
Performance metrics

Using the wavelet transform has little effect on the performance of the distance-based strategy. Its return and Sharpe ratio decrease a little bit, but all the metrics are close to those of the simple strategy without the wavelet transform.


Now let’s test the cointegration-based strategy. I am going to use two cointegration tests: the Augmented Dickey-Fuller (ADF) test on regression residuals and the Johansen cointegration test. Code for performing those tests is shown below. I consider a test passed if the null hypothesis is rejected at the 5% significance level (95% confidence). At the same time I also save the regression coefficients alpha and beta and the historical standard deviation of the spread, which will be used later.

Results are provided below. Only pairs that pass both tests are selected for trading. I am also removing pairs with negative coefficient beta (because we want to have opposite positions in each stock of a pair).

Potential pairs

We have 70 potential pairs. As before, let’s calculate their in-sample performance. Code for doing it is shown below. Note that I am using dollar-neutral positions: an equal amount of capital is allocated to both stocks (long and short position) in a pair. Coefficient beta is used only for calculating the spread.

Now I am going to select only pairs with an in-sample Sharpe ratio bigger than 1, which leaves 42 potential pairs.
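To make the distinction concrete, here is a small sketch in which beta enters only the spread, while the position itself is dollar-neutral. The function names, the sign convention (+1 means long the first stock and short the second) and the convention of measuring returns on the total capital of both legs are my assumptions.

```python
import numpy as np

def regression_spread(price_y, price_x, alpha, beta):
    """Spread built from the regression coefficients: y - alpha - beta * x."""
    return np.asarray(price_y, dtype=float) - alpha - beta * np.asarray(price_x, dtype=float)

def dollar_neutral_returns(price_y, price_x, positions):
    """Daily returns of a pair with half of the capital in each leg."""
    price_y = np.asarray(price_y, dtype=float)
    price_x = np.asarray(price_x, dtype=float)
    ret_y = price_y[1:] / price_y[:-1] - 1
    ret_x = price_x[1:] / price_x[:-1] - 1
    # The position decided at the close of day t earns the return of day t+1.
    held = np.asarray(positions, dtype=float)[:-1]
    return held * 0.5 * (ret_y - ret_x)

toy_spread = regression_spread([10.0, 12.0], [4.0, 5.0], alpha=1.0, beta=2.0)
toy_returns = dollar_neutral_returns(
    [100.0, 110.0, 110.0], [100.0, 100.0, 90.0], positions=[1, -1, 0]
)
```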

Selecting pairs with Sharpe>1

First let’s try to use them all for trading. Then I’m going to choose only 10 pairs, as we’ve done before. Code for performing a backtest is shown below.

Statistics for individual pairs

Only 30% of pairs have positive return and 19% of pairs have no positions during trading period. This doesn’t look very good. Cumulative returns and performance metrics are shown below. You can see that simple cointegration-based strategy doesn’t perform well here.

Cumulative returns (simple cointegration)
Cumulative returns (simple cointegration, top10 pairs)
Performance metrics

Now let’s see how filtering time series with the wavelet transform affects the performance of the cointegration-based strategy. We already have a dataframe with filtered prices, and now we use them to test each pair for cointegration.

Applying the same conditions as before, we get the following results.

Potential pairs

Now we have 3055 potential pairs for trading (compared to 70 we had before). So using wavelet transform seems to help us find more cointegrating pairs.

Now we calculate in-sample Sharpe ratios and select pairs with Sharpe bigger than 1.

Selecting pairs with Sharpe>1

We are left with 1569 pairs. As usual, we first try trading in all pairs and then limit the number of pairs to 10.

Statistics for individual pairs

51% of pairs have positive returns. This is significantly better than the 30% we got without the wavelet transform. 13% of pairs are not traded. Cumulative returns and performance metrics are provided below.

Cumulative returns (cointegration with WT)
Cumulative returns (cointegration with WT, top10 pairs)
Performance metrics

Here we get a significant improvement in performance. Strategy trading in top 10 pairs has the biggest total return of all strategies tested so far. So in this case removing the noise with wavelet transform turns out to be very useful.


The last strategy I’m going to test is based on Granger causality. You can read more about Granger causality in pairs trading in my previous article.

Basically it is the same as the cointegration-based strategy, but I’m going to further limit tradable pairs and use only pairs where each stock Granger-causes its partner. Code for performing Granger-causality tests is provided below. Note that I am testing only those pairs that passed both cointegration tests.

Then we select only pairs in which the stocks Granger-cause each other, which leaves 25 potential pairs.

Potential pairs

After applying the limit on in-sample Sharpe ratio we are left with only 14 pairs.

Selecting pairs with Sharpe>1

Code for backtest is exactly the same as for cointegration-based strategy, so I’m not posting it again. Let’s look at some statistics.

Statistics for individual pairs

Only 28% of pairs have positive returns, which is not good. Since we have only 14 tradable pairs here I’m not going to test this strategy with 10 pairs, as I did before.

Performance metrics

This strategy has the worst performance among all the strategies tested so far.


Now we perform the same tests using filtered data. Recall that we got significantly more cointegrated pairs with filtered data, therefore we expect more pairs with stocks that Granger-cause each other. You can see below that we have 2370 such pairs.

After performing the in-sample backtest and selecting pairs with a Sharpe ratio bigger than 1, we are left with 1232 pairs. Again I’m going to test two approaches: trading all those pairs and trading only 10 pairs.

Statistics for individual pairs

Almost 53% of pairs have positive returns, which suggests that we might have an edge here.

Cumulative returns (Granger-causality with WT)
Cumulative returns (Granger-causality with WT, top10 pairs)
Performance metrics

In the screenshot above you can see the performance metrics of all the tested strategies. Strategies based on distance (even without the wavelet transform) have the biggest Sharpe ratios, which is probably the most surprising finding in this article. For strategies based on cointegration and Granger causality, using the wavelet transform seems to be very useful. It allows us to find more cointegrating relationships and significantly improves the performance of such strategies. The strategy based on Granger causality (trading in top 10 pairs) has the biggest annual return: 17%.


Ideas for further research:

  • Try different wavelet functions.
  • Optimize z-scores for opening and closing positions.
  • Allow non dollar-neutral positions (different amount of capital allocated to long and short legs of the trade).
  • Test how other strategies perform on raw vs. filtered data.

Jupyter notebook with source code is available here.

If you have any questions, suggestions or corrections please post them in the comments. Thanks for reading.


References

[1] Pairs Trading with Wavelet Transform (Eroglu et al. 2022)

[2] PyWavelets — Wavelet Transforms in Python

[3] Pairs Trading: Performance of a Relative Value Arbitrage Rule (Gatev et al. 2006)
