financialnoob.me


Portfolio Optimization. Random Matrix Approach to Estimating Covariance Matrix

Estimating the covariance matrix is one of the biggest challenges in portfolio optimization. Sample covariance matrices tend to be very noisy, and the problem gets substantially worse as the number of stocks in the dataset increases. In this article I will describe and implement a method for estimating the covariance matrix based on Random Matrix Theory (RMT), following the paper ‘A Random Matrix Approach to Cross-Correlations in Financial Data’ (Plerou et al. 2001).


Assume we want to estimate the covariance matrix of 500 stocks. To do this we need to estimate 500*499/2 = 124750 individual covariances. A dataset that contains one year of daily prices for 500 stocks has about 500*250 = 125000 datapoints, which means we have little more than one datapoint per parameter (the covariance between a pair of stocks). As a result, the sample covariance matrix is very noisy and contains a lot of random contributions. One possible solution is simply using more data — twenty or thirty years instead of one. The problem is that covariances are not stable and vary over time (as demonstrated in my previous article), so older data is not really helpful. Another approach is using higher-frequency data to get more datapoints, but even then the amount of data we can obtain is limited. In the paper the authors propose a solution based on random matrix theory. Instead of the covariance matrix the paper focuses on the correlation matrix, but correlation is just rescaled covariance and we can easily convert one to the other.

The main idea is simple. We compare the properties of the sample correlation matrix to those of a random correlation matrix constructed from uncorrelated time series (such a matrix is called a Wishart matrix). Any deviations between the properties of the two matrices carry “true” information about correlations. We can then split the contents of the sample correlation matrix into two parts:

  • Part which matches the properties of random correlation matrix — “noise”.
  • Part that deviates from the properties of random correlation matrix — “information”.

To implement this idea in practice we will use Random Matrix Theory (RMT). RMT was developed for analyzing complex quantum systems (modeling the nuclei of heavy atoms). It is now used in many different fields — mathematical statistics, physics, number theory, neuroscience and others. And in this article we will try to apply it to financial data.


I am going to use prices of 490 stocks (constituents of SP500 at the time of writing) from 2018–01–01 to 2022–12–31 at daily resolution. Code for downloading and preparing the data is shown below.
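The preparation step can be sketched as follows. The actual article downloads real prices with a data provider; here a small synthetic price frame stands in so the sketch is self-contained, and the ticker names are placeholders.

```python
import numpy as np
import pandas as pd

# The article uses ~5 years of daily prices for S&P 500 constituents,
# downloaded from a data provider (e.g. yfinance).  A synthetic price
# frame stands in here; the tickers are hypothetical.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2018-01-01", "2022-12-31")
tickers = [f"STOCK{i}" for i in range(5)]
prices = pd.DataFrame(
    100 * np.exp(np.cumsum(rng.normal(0, 0.01, (len(dates), len(tickers))), axis=0)),
    index=dates,
    columns=tickers,
)

# simple (arithmetic) returns; the first row is NaN and is dropped
returns = prices.pct_change().dropna()
```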

Note that I’m using simple (arithmetic) returns instead of the log returns used in the paper; I will explain the reason for this later. I also don’t standardize returns: we are working with the correlation matrix, and it doesn’t change after standardization. Also remember that we are always working with correlation matrices. Sometimes I will say things like ‘eigenvalues of the random sample’, but what I really mean is ‘eigenvalues of the correlation matrix of the random sample’.

After preparing the data we can perform some of the tests from the paper. Let’s start with analyzing eigenvalues of correlation matrix. I assume that the reader is familiar with the notion of eigenvalues and eigenvectors. The main idea here is to compare the distribution of eigenvalues of the correlation matrix (of returns) with that of the random correlation matrix (of uncorrelated random variables). So let’s generate some uncorrelated random samples and save the eigenvalues of their correlation matrices. Code for doing it is shown below. I’m using the same mean and variance as real returns, but set the off-diagonal elements of covariance matrix to zero (to make them uncorrelated).
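A sketch of generating the uncorrelated samples and collecting their eigenvalues. The dimensions mirror the real dataset; the number of samples is an illustrative choice, and since the correlation matrix is unaffected by means and variances, standard normal draws suffice here.

```python
import numpy as np

rng = np.random.default_rng(42)
T, N = 1259, 490      # observations x stocks, matching the real dataset
n_samples = 10        # number of random samples (an illustrative choice)

random_eigenvalues = []
for _ in range(n_samples):
    # uncorrelated series: diagonal covariance; the actual means and
    # variances do not affect the correlation matrix
    X = rng.normal(size=(T, N))
    C = np.corrcoef(X, rowvar=False)          # N x N correlation matrix
    random_eigenvalues.extend(np.linalg.eigvalsh(C))
random_eigenvalues = np.array(random_eigenvalues)
```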

From the Random Matrix Theory we know the analytical distribution of eigenvalues of random correlation matrix. Its definition is shown below.

Analytical distribution of eigenvalues

Bounds for lambda are calculated as follows.

Bounds for lambda
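Written out, this is the Marchenko–Pastur result used in the paper, with Q = L/N the ratio of the number of observations to the number of stocks, and σ² = 1 for standardized returns:

```latex
P_{\mathrm{rm}}(\lambda) = \frac{Q}{2\pi\sigma^{2}}\,
\frac{\sqrt{(\lambda_{+}-\lambda)(\lambda-\lambda_{-})}}{\lambda},
\qquad \lambda_{-} \le \lambda \le \lambda_{+},

\lambda_{\pm} = \sigma^{2}\left(1 + \frac{1}{Q} \pm 2\sqrt{\frac{1}{Q}}\right).
```

Outside the interval [λ−, λ+] the density is zero, so any eigenvalue of the sample correlation matrix falling outside these bounds is a deviation from pure noise.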

Now let’s compare the distribution of eigenvalues of our random samples with analytical distribution defined above. First we need to calculate the bounds lambda+ and lambda-.

Calculating lambda+ and lambda-

Now we can calculate analytical distribution and plot it along with the distribution of random sample.
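A sketch of computing the bounds and evaluating the analytical density on a grid (the matplotlib plotting calls are omitted; the dimensions match the dataset above):

```python
import numpy as np

T, N = 1259, 490
Q = T / N                              # ratio of observations to stocks
lmbda_min = (1 - np.sqrt(1 / Q)) ** 2  # lambda-
lmbda_max = (1 + np.sqrt(1 / Q)) ** 2  # lambda+

def mp_density(lmbda, Q, sigma2=1.0):
    """Marchenko-Pastur density for eigenvalues of a random correlation matrix."""
    lo = sigma2 * (1 + 1 / Q - 2 * np.sqrt(1 / Q))
    hi = sigma2 * (1 + 1 / Q + 2 * np.sqrt(1 / Q))
    rho = np.zeros_like(lmbda, dtype=float)
    inside = (lmbda > lo) & (lmbda < hi)
    rho[inside] = (Q / (2 * np.pi * sigma2)
                   * np.sqrt((hi - lmbda[inside]) * (lmbda[inside] - lo))
                   / lmbda[inside])
    return rho

grid = np.linspace(0.01, 3.0, 500)
density = mp_density(grid, Q)   # plot this curve over the eigenvalue histogram
```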

Eigenvalues of the random sample vs. analytical distribution

As you can see in the plot above, we get an almost perfect fit.

Now we repeat the same process with real returns. The lambda bounds stay the same, since the dimensions of the random samples and the real returns are equal. One difference is that I now create two plots — one for eigenvalues less than 3 and another for eigenvalues larger than 3.

Eigenvalues of real returns vs. analytical distribution (eigenvalues≤3)
Eigenvalues of real returns vs. analytical distribution (eigenvalues>3)

We can also count the number of deviating eigenvalues.

Number of deviating eigenvalues
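Counting is a direct comparison against the bounds. A random stand-in matrix is used below so the snippet runs on its own; in the notebook `eigenvalues` would come from the correlation matrix of real returns.

```python
import numpy as np

# stand-in for the eigenvalues of the real correlation matrix
rng = np.random.default_rng(0)
T, N = 1259, 490
X = rng.normal(size=(T, N))
eigenvalues = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))

Q = T / N
lmbda_min = (1 - np.sqrt(1 / Q)) ** 2
lmbda_max = (1 + np.sqrt(1 / Q)) ** 2

n_below = np.sum(eigenvalues < lmbda_min)   # deviating below lambda-
n_above = np.sum(eigenvalues > lmbda_max)   # deviating above lambda+
n_inside = N - n_below - n_above            # within the RMT bounds
```

For a purely random matrix nearly all eigenvalues fall inside the bounds, up to small finite-size fluctuations at the edges.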

As you can see above, more than half of the eigenvalues (490 - 15 - 173 = 302) are within the RMT bounds. We are mostly interested in the largest eigenvalues. In the paper 2% of the eigenvalues are larger than the upper bound lambda+; here 15/490 ≈ 3% of the eigenvalues are larger than lambda+.


Next I will check the distribution of eigenvector components. Below I plot histograms of vector components for three eigenvectors corresponding to three largest eigenvalues (top row) and three eigenvectors corresponding to the eigenvalues within the RMT bounds (bottom row). According to RMT eigenvector components of random correlation matrix should be normally distributed (after rescaling). This analytical distribution is shown in orange.

Distribution of eigenvector components

Components of the top eigenvectors are clearly different from normal distribution. Components in the bottom row (corresponding to eigenvalues within the RMT bounds) are closer to their analytical (normal) distribution.

Below I plot component distribution for all eigenvectors within the RMT bounds. Even though it is not strictly normal (we have more components around zero), it is quite close to it.

Distribution of all components within RMT bounds
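The pooling can be sketched as follows. A random matrix stands in for the real correlation matrix; after rescaling each eigenvector so that the sum of squared components equals N, RMT predicts standard normal components.

```python
import numpy as np

rng = np.random.default_rng(1)
T, N = 1259, 490
X = rng.normal(size=(T, N))             # stand-in for the returns matrix
C = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)    # eigenvalues in ascending order

# rescale so that sum(u^2) = N; RMT then predicts components ~ N(0, 1)
u = eigvecs * np.sqrt(N)

# pool the components of all eigenvectors inside the RMT bounds
Q = T / N
lo, hi = (1 - np.sqrt(1 / Q)) ** 2, (1 + np.sqrt(1 / Q)) ** 2
inside = (eigvals > lo) & (eigvals < hi)
pooled = u[:, inside].ravel()   # histogram this against the N(0, 1) density
```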


In the next test we examine the interpretation of the largest eigenvalue and the corresponding eigenvector. The authors claim in Section VI B that it represents the influence of the market, which is common to all stocks. To confirm this, we use the (normalized) eigenvector components as weights for a portfolio of stocks. We then compare the returns of that eigenportfolio with the returns of the market (represented by the SPY ETF). For comparison I also calculate the returns of a portfolio corresponding to an eigenvalue within the RMT bounds.
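The construction can be sketched on synthetic data with an explicit common factor. The one-factor setup, the universe size, and the weight normalization below are illustrative assumptions; in the notebook the weights come from the real correlation matrix and the benchmark is SPY.

```python
import numpy as np

rng = np.random.default_rng(2)
T, N = 1259, 20                     # small synthetic universe for illustration
market = rng.normal(0, 0.01, T)     # common "market" factor
returns = market[:, None] + rng.normal(0, 0.01, size=(T, N))

C = np.corrcoef(returns, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)
u1 = eigvecs[:, -1]                 # eigenvector of the largest eigenvalue
weights = u1 / u1.sum()             # normalize so the weights sum to 1
eigenportfolio = returns @ weights  # time series of eigenportfolio returns

corr = np.corrcoef(eigenportfolio, market)[0, 1]
```

With a genuine common factor, the correlation between the eigenportfolio and the factor should come out close to 1.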

First we make a scatterplot of standardized returns.

Returns of eigenportfolios vs. returns of SPY ETF

We can clearly see that the returns of eigenportfolio 1 (corresponding to the largest eigenvalue) are highly correlated with the returns of SPY, whereas the returns of eigenportfolio 100 (corresponding to the 100th largest eigenvalue) are not. Correlation coefficients are calculated below.

Correlations of returns

Let’s also plot cumulative returns of SPY and eigenportfolio 1.

Cumulative returns

As you can see above, the cumulative returns almost exactly match each other. There are two possible reasons why they differ:

  • My stock universe does not exactly match SP500 index. It contains SP500 components at the time of writing, but there were many changes in SP500 index during 5-year period — some stocks were removed, other stocks were added.
  • We are using static weights for the whole period, but they are likely not stable and change over time.

Now let’s estimate the number of significant participants in eigenvectors using Inverse Participation Ratio (IPR). It is calculated as follows.

Inverse Participation Ratio
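In symbols, the IPR of eigenvector u^k is I_k = Σ_l (u_l^k)^4, as defined in the paper. A direct implementation, with sanity checks on the two extreme cases:

```python
import numpy as np

def ipr(eigvecs):
    """Inverse participation ratio of each eigenvector (columns of eigvecs).

    For unit-norm eigenvectors, a vector spread evenly over N components
    gives IPR = 1/N; a vector concentrated on one component gives IPR = 1.
    """
    return np.sum(eigvecs ** 4, axis=0)

# sanity checks with N = 4
uniform = np.full((4, 1), 0.5)                    # evenly spread unit vector
localized = np.array([[1.0], [0.0], [0.0], [0.0]])  # single significant component
```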

Below I plot IPR of random correlation matrix and correlation matrix of returns on a logarithmic scale.

IPR of random sample vs real returns

The plot on the left shows the IPR of a random correlation matrix. All eigenvalues are within the RMT bounds and all IPRs are around the same value. In a random matrix all eigenvector components have similar contributions, which leads to a small IPR. A larger IPR means that an eigenvector has fewer significant participants.

The plot on the right shows the IPR of the real correlation matrix. Note that the vector corresponding to the largest eigenvalue has an IPR similar to that of the random matrix. This makes sense — since it represents the market, all components are significant. This part is similar to the plot in Fig. 11 of the paper (shown below). But we also have some differences. In the paper the IPRs of the eigenvectors within the RMT bounds are all very close to each other and to the IPRs of the random matrix, whereas on our plot we can see many large IPRs within the RMT bounds. Another difference is that the IPRs below the lower RMT bound are all relatively large in the paper, but on our plot most of them are close to the IPRs of the random matrix. I don’t know why we have these differences, and for now I will ignore them.

IPR as a function of eigenvalue (fig. 11)


Next we can try to apply all that to portfolio optimization. The idea here is to filter the correlation matrix by keeping only information corresponding to the largest eigenvalues above the RMT limit. Section VIII and Fig. 16 in the paper describe predicted and realized efficient frontiers constructed using simple (non-filtered) and filtered covariance matrices.

First let’s try to work with generated data. I’m using generated data to make sure that its mean and covariance matrix are stable and don’t change over time. Below I generate normally distributed random sample with mean and variance of real returns. Then I split the resulting data into train and test sets. First 4 years are used for training (finding optimal portfolios) and last year is used for testing (calculating realized returns of optimal portfolios).
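A sketch of the generation and split (the universe size is reduced for illustration; the construction of the covariance matrix below is just one way to get a positive definite matrix and is my assumption, not the notebook's exact recipe):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
N, T = 10, 1260                    # small universe for illustration
dates = pd.bdate_range("2018-01-01", periods=T)

# fixed mean vector and covariance matrix, so the "true" moments
# are stable over the whole period
mean = rng.normal(0.0005, 0.0002, N)
A = rng.normal(0, 0.01, (N, N))
cov = A @ A.T + np.diag(np.full(N, 1e-4))   # positive definite by construction
returns = pd.DataFrame(rng.multivariate_normal(mean, cov, T), index=dates)

train = returns.loc[:"2021-12-31"]   # first ~4 years: find optimal portfolios
test = returns.loc["2022-01-01":]    # last year: realized returns
```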

At the beginning of the article I promised to explain why I’m using arithmetic returns instead of the log returns used in the paper. The problem is that portfolio optimization requires arithmetic returns: equations (23) and (24) in the paper are correct only for arithmetic returns, not for log returns. Of course we could easily transform log returns into arithmetic returns, but then we would also need to transform the filtered correlation matrix of log returns into the correlation matrix of arithmetic returns, which is not very straightforward. There is a formula for doing this, but for simplicity I decided to use arithmetic returns.

I am not going to describe the whole portfolio optimization procedure here. You can find a detailed explanation in my article ‘Introduction to Portfolio Optimization and Modern Portfolio Theory’.

Below you can see the plot of predicted and realized portfolio returns as a function of risk, calculated using sample covariance matrix.

Predicted vs. realized efficient frontier

Now let’s use what we learned to compute a filtered covariance matrix. First step is calculating RMT bounds lambda+ and lambda-.

Calculating RMT bounds

Next we perform eigendecomposition of the correlation matrix. Recall that any real symmetric matrix A can be factored as:

Eigendecomposition
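In symbols, with V the orthogonal matrix whose columns are the eigenvectors of A and Λ the diagonal matrix of eigenvalues:

```latex
A = V \Lambda V^{\mathsf{T}}.
```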

Then we set all eigenvalues smaller than lambda+ to zero and assemble a new (filtered) matrix. To make it a valid correlation matrix we set all diagonal elements to 1. Finally, we transform the correlation matrix into a covariance matrix, which is used for portfolio optimization.
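The filtering step can be sketched as below. The helper names are mine, and the demonstration runs on a random correlation matrix so the snippet is standalone; in the notebook `C` is the correlation matrix of training returns and `stds` are the per-stock standard deviations.

```python
import numpy as np

def filter_correlation_matrix(C, Q):
    """RMT filter: keep only eigenvalues above lambda+, zero out the rest,
    then rebuild a valid correlation matrix with a unit diagonal."""
    lmbda_max = (1 + np.sqrt(1 / Q)) ** 2
    eigvals, eigvecs = np.linalg.eigh(C)
    eigvals_f = np.where(eigvals > lmbda_max, eigvals, 0.0)
    C_f = eigvecs @ np.diag(eigvals_f) @ eigvecs.T
    np.fill_diagonal(C_f, 1.0)   # restore unit diagonal
    return C_f

def corr_to_cov(C, stds):
    """Rescale a correlation matrix into a covariance matrix."""
    return C * np.outer(stds, stds)

# quick demonstration on a random correlation matrix
rng = np.random.default_rng(4)
T, N = 1259, 50
X = rng.normal(size=(T, N))
C = np.corrcoef(X, rowvar=False)
C_f = filter_correlation_matrix(C, T / N)
```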

Results are plotted below. Our plot looks similar to Fig. 16 in the paper. We can see that realized returns are slightly closer to predicted returns when we use the filtered covariance matrix.

Predicted vs. realized efficient frontier

We can also quantify the results using one of the metrics described in the paper, which estimates how close two curves are to each other. A smaller value means that the curves are closer.

Comparing predicted vs. realized efficient frontiers

Recall that I used generated data for this test. Now let’s try to repeat the same process with real returns. All the code is the same, so I’m just going to show the results.

Predicted vs. realized efficient frontier (real data)

Here we can clearly see that the curve obtained using filtered covariance matrix is closer to predicted curve than the one obtained using full sample covariance matrix. Using the same metric as above we get the following results.

Comparing predicted vs. realized efficient frontiers


Finally we are ready to create and backtest a trading strategy. I’m going to use the same data. The portfolio will be rebalanced weekly, and I will use a 252-day rolling window for estimating the mean and the covariance matrix.
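The rolling-window loop has this general shape (the optimizer call is omitted; the synthetic returns and the 5-trading-day approximation of weekly rebalancing are illustrative assumptions):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(6)
T, N, window = 600, 5, 252
dates = pd.bdate_range("2018-01-01", periods=T)
returns = pd.DataFrame(rng.normal(0.0005, 0.01, (T, N)), index=dates)

rebalance_dates = returns.index[window::5]   # every 5 trading days ~ weekly
for date in rebalance_dates[:3]:             # first few rebalances, for illustration
    # trailing 252-day window up to and including the rebalance date;
    # the resulting weights are applied from the next day onward
    hist = returns.loc[:date].iloc[-window:]
    mu = hist.mean()     # estimated mean returns
    cov = hist.cov()     # estimated covariance matrix
    # ... run mean-variance optimization with (mu, cov) and hold the
    # resulting weights until the next rebalance date
```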

First let’s calculate the performance metrics of equally weighted portfolio.

Performance metrics

Recall that we need to specify target volatility for performing portfolio optimization with MPT. I’m going to use volatility of equally weighted portfolio for this. It is calculated below.

Volatility of equally weighted portfolio
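For example (synthetic daily returns stand in here; the real calculation uses the backtest’s portfolio returns):

```python
import numpy as np

rng = np.random.default_rng(5)
T, N = 252, 10
returns = rng.normal(0.0005, 0.01, size=(T, N))   # synthetic daily returns

# equally weighted portfolio and its annualized volatility
w = np.full(N, 1 / N)
port_returns = returns @ w
target_vol = port_returns.std() * np.sqrt(252)    # annualize daily volatility
```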

Next we backtest a strategy that uses the simple (not filtered) covariance matrix. Its performance metrics are shown below. We can see that this strategy outperforms the equally weighted portfolio in terms of returns, but has a slightly lower Sharpe ratio and a larger drawdown.

Performance metrics

And now we implement and backtest a strategy using filtered covariance matrix. Everything is the same, we just need to perform eigendecomposition and calculate filtered covariance matrix, which is then used for mean-variance optimization.

Performance metrics

The last strategy has the highest return, and its drawdown duration is significantly smaller (40 weeks instead of 52 and 59 weeks for the other strategies). Its Sharpe ratio is a little lower, but recall that we were optimizing for maximum return, not Sharpe ratio. Another benefit is that mean-variance optimization is faster with the filtered covariance matrix — on my server, backtesting the strategy with the filtered covariance matrix was about 2.5 times faster than the strategy with the full sample covariance.


We have demonstrated that using RMT to estimate the covariance matrix of returns allows us to improve the performance of optimal portfolios. It helps us distinguish real correlations from the random correlations that are present even in correlation matrices of uncorrelated samples.

Results of some of the tests I performed don’t match the results presented in the paper. Further research is needed to explain those differences and maybe introduce some changes to improve the accuracy of filtered covariance matrix.


Jupyter notebook with source code is available here.

If you have any questions, suggestions or corrections please post them in the comments. Thanks for reading.


Yuan Di prepared a Chinese adaptation of this article, which is available here.


References

[1] A Random Matrix Approach to Cross-Correlations in Financial Data (Plerou et al. 2001)

[2] Random matrix

[3] Introduction to Portfolio Optimization and Modern Portfolio Theory

[4] Portfolio Optimization with Weighted Mean and Covariance Estimators

[5] https://mathoverflow.net/questions/386493/distribution-of-eigenvectors-of-random-matrices-and-link-with-the-components-of

[6] https://quant.stackexchange.com/questions/70997/how-to-calculate-the-log-return-of-portfolio

[7] https://quant.stackexchange.com/questions/76688/calculating-portfolios-covariance-via-bilinearity-with-log-or-simple-returns
