financialnoob.me

Blog about quantitative finance

Modeling marginal returns

Sometimes in finance it is necessary to model marginal returns of a given asset. Marginal in this case means that there are no serial correlations between returns in different time periods (returns are independent and identically distributed). A couple of examples:

  • We might believe that returns of a certain asset do not have any serial correlations, in that case we can increase the statistical significance of our backtests by running many simulations to assess strategy performance (each time drawing asset returns from the modeled distribution).
  • Even if we find that returns are serially correlated, we might want to test a scenario in which the correlations we found are spurious (again run many simulations and look at maximum drawdown).

In this article I will demonstrate several techniques which can be used for modeling marginal returns.


The simplest approach is just to sample returns from historical data, but in this case the sample size would be too small, especially for longer time periods. Another approach is to model probability distribution of the returns. We can then get simulated returns just by sampling from that distribution. Now the problem is to determine which distribution we should use. There are several ways to do it:

  • Looking at the data and trying to guess which one of well-known parametric distributions (or time series models) it follows, then applying goodness-of-fit tests to determine if our guess was correct.
  • Using Pearson system, which takes sample mean, variance, skewness and kurtosis as an input and models the data as one of the seven probability distribution families.
  • Using quantile-parameterized distribution which is parameterized by data.
  • Using metalog distribution which seems to be the most flexible of all techniques.

I was not able to find an implementation of quantile-parameterized distributions in Python, so I will only demonstrate three other methods. Let’s get started.


I will use daily returns of SPY ETF from 1993 to 2022 as my sample (7318 observations in total). Let’s have a look at the data. First we will look at some summary statistics.

Summary statistics

The mean return is very close to zero and the range of returns is approximately from -0.11 to 0.15. Now let’s look at the histogram and empirical probability density function.

Histogram and empirical PDF

At first glance the distribution looks symmetric and kind of similar to Gaussian distribution. Next we take a look at the autocorrelation and partial autocorrelation plots. It will help us understand if there are any serial correlations in the data.

Autocorrelation plot
Partial autocorrelation plot

Even though autocorrelation coefficients for some lags exceed confidence interval, they are still very low (less than 0.1 in absolute value), so the assumption that there are no serial correlations might not be wrong.

Now I will demonstrate several techniques to model marginal returns. I will assess the performance of each technique using QQ-plots and Kolmogorov-Smirnov goodness-of-fit test.


Parametric distributions

First I will try fitting several well-known probability distributions: normal distribution, lognormal distribution, and Student’s t-distribution.

The code above is for normal distribution. First we fit the given distribution on our data using SciPy fit function. It provides us with parameters that define a given distribution. Then I create two plots: a plot of empirical probability density function (PDF) along with model PDF and a QQ-plot to compare empirical and modeled distributions. These plots are demonstrated below.

PDFs and QQ-plot

Just by looking at the plots above we can see that normal distribution is a bad choice. The last step is to perform Kolmogorov-Smirnov goodness-of-fit test. I am drawing samples from a modeled distribution and then using 5000 samples to compare modeled and empirical distributions. Results of the test are provided below.

Kolmogorov-Smirnov test

P-value is almost zero, which means that there is zero percent chance that two distributions are equal.

Let’s try lognormal distribution. The code I used is very similar, so I won’t post it here. Below you can see generated plots and results of Kolmogorov-Smirnov test.

PDFs and QQ-plot
Kolmogorov-Smirnov test

The results we get here are almost identical to the previous ones, so lognormal distribution is also not a good choice.

How about Student’s t-distribution?

PDFs and QQ-plot
Kolmogorov-Smirnov test

Here we are getting very close. The plots of PDFs match each over very closely, but at QQ-plot we can see that the tails of the modeled distribution do not fit very well. Kolmogorov-Smirnov test gives a p-value of 0.12, which means that there is a 12% chance that tested samples come from the same distribution, which is not that bad. Let’s move on to other methods.


Pearson system

Pearson system allows us to model probability distribution based on several empirical moments (mean, variance, skewness and kurtosis). There are seven families of distributions that can be modeled using Pearson system and Student’s t-distribution used above is actually one of it. Unfortunately only Pearson type III distributions are implemented in SciPy, but it is a very bad fit for our data. You can see it on the plots above.

PDFs and QQ-plot
Kolmogorov-Smirnov test


Metalog distribution

Metalog distribution is one of the newest and most flexible distribution families. It was introduced in 2016 and provides many advatages compared to other methods. You can read about it here and here. It’s implementation for Python can be found here.

When using metalog distributions we need to specify the number of terms used to describe the shape of the distribution (it can range from 2 to 30). The larger the number of terms the more flexible the shape of the distribution can be. To determine the number of terms to use I will test a range of values from 2 to 13 and select the one which gives the largest p-value in Kolmogorov-Smirnov test. Code for doing that is provided below.

Running this code gives the following output:

The best results is achieved with 11 terms, giving a p-value of 0.85, which means that there is an 85% chance that the samples are drawn from the same distribution (empirical and modeled distributions are the same). This is the best result achieved so far. Let’s look at the plots.

PDFs and QQ-plot

As we can see above, PDFs of the distributions match each other very closely. QQ-plot also looks nice, but there are still some small discrepancies in the tails (which is to be expected considering that such returns are extremely rare and our sample size is not that big). Probably to be able to better model probabilities of extremely rare returns we would need to use Extreme Value Theory.


In this article I have demonstrated several methods of modeling marginal returns distributions. We have seen that out of all tested distributions only two have provided more or less acceptable results: Student’s t-distribution and metalog distribution. It seems that using metalog family of distributions is the best of all tested options.


Jupyter notebook with source code is available here.

If you have any questions, suggestions or corrections please post them in the comments. Thanks for reading.

Leave a Reply

Your email address will not be published. Required fields are marked *