Parametric Portfolio Policies
You are reading Tidy Finance with Python. You can find the equivalent chapter for the sibling Tidy Finance with R here.
In this chapter, we apply different portfolio performance measures to evaluate and compare portfolio allocation strategies. For this purpose, we introduce a direct way to estimate optimal portfolio weights for large-scale cross-sectional applications. More precisely, Brandt, Santa-Clara, and Valkanov (2009) propose to parametrize the optimal portfolio weights as a function of stock characteristics instead of estimating each stock's expected return, variance, and covariances with other stocks in a prior step. We choose weights as a function of characteristics that maximize the expected utility of the investor. This approach remains feasible for large portfolio dimensions (such as the entire CRSP universe). See the review paper by Brandt (2010) for an excellent treatment of related portfolio choice methods.
The current chapter relies on the following set of Python packages:
import pandas as pd
import numpy as np
import sqlite3
import statsmodels.formula.api as smf
from itertools import product, starmap
from scipy.optimize import minimize
Compared to previous chapters, we introduce the scipy.optimize module from scipy (Virtanen et al. 2020) for solving optimization problems.
Data Preparation
To get started, we load the monthly CRSP file, which forms our investment universe. We load the data from our SQLite database introduced in Accessing and Managing Financial Data and WRDS, CRSP, and Compustat.
tidy_finance = sqlite3.connect(database="data/tidy_finance_python.sqlite")

crsp_monthly = (pd.read_sql_query(
    sql=("SELECT permno, date, ret_excess, mktcap, mktcap_lag "
         "FROM crsp_monthly"),
    con=tidy_finance,
    parse_dates={"date"})
  .dropna()
)
To evaluate the performance of portfolios, we further use monthly market returns as a benchmark to compute CAPM alphas.
factors_ff_monthly = pd.read_sql_query(
  sql="SELECT date, mkt_excess FROM factors_ff3_monthly",
  con=tidy_finance,
  parse_dates={"date"}
)
Next, we retrieve some stock characteristics that have been shown to have an effect on the expected returns or expected variances (or even higher moments) of the return distribution. In particular, we record the lagged one-year return momentum (momentum_lag), defined as the compounded return between months \(t-13\) and \(t-2\) for each firm. In finance, momentum is the empirically observed tendency for rising asset prices to rise further and falling prices to keep falling (Jegadeesh and Titman 1993). The second characteristic is the firm’s market equity (size_lag), defined as the log of the price per share times the number of shares outstanding (Banz 1981). To construct the correct lagged values, we use the approach introduced in WRDS, CRSP, and Compustat.
crsp_monthly_lags = (crsp_monthly
  .assign(date=lambda x: x["date"]+pd.DateOffset(months=13))
  .get(["permno", "date", "mktcap"])
)

crsp_monthly = (crsp_monthly
  .merge(crsp_monthly_lags,
         how="inner", on=["permno", "date"], suffixes=["", "_13"])
)

data_portfolios = (crsp_monthly
  .assign(
    momentum_lag=lambda x: x["mktcap_lag"]/x["mktcap_13"],
    size_lag=lambda x: np.log(x["mktcap_lag"])
  )
  .dropna(subset=["momentum_lag", "size_lag"])
)
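To see how shifting the dates forward by 13 months lines up each observation with its own value from 13 months earlier, the following minimal sketch applies the same assign-and-merge pattern to a small made-up panel. The data frame toy and its values are hypothetical and only serve as an illustration.

```python
import pandas as pd

# Hypothetical toy panel: one stock with 14 monthly observations
toy = pd.DataFrame({
    "permno": 1,
    "date": pd.date_range("2020-01-01", periods=14, freq="MS"),
    "mktcap": [100.0 + 10 * i for i in range(14)],
})

# Move every date forward by 13 months, so that a value observed in month t
# is keyed to month t + 13
toy_lags = (toy
  .assign(date=lambda x: x["date"] + pd.DateOffset(months=13))
  .get(["permno", "date", "mktcap"])
)

# The inner merge keeps only months for which a 13-month-old value exists
toy_merged = toy.merge(
  toy_lags, how="inner", on=["permno", "date"], suffixes=["", "_13"]
)
print(toy_merged)
```

Only the last month of the toy panel survives the inner merge, because it is the only one with an observation exactly 13 months earlier.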
Parametric Portfolio Policies
The basic idea of parametric portfolio weights is as follows. Suppose that at each date \(t\), we have \(N_t\) stocks in the investment universe, where each stock \(i\) has a return of \(r_{i, t+1}\) and is associated with a vector of firm characteristics \(x_{i, t}\) such as time-series momentum or the market capitalization. The investor’s problem is to choose portfolio weights \(\omega_{i,t}\) to maximize the expected utility of the portfolio return: \[\begin{aligned} \max_{\omega} E_t\left(u(r_{p, t+1})\right) = E_t\left[u\left(\sum\limits_{i=1}^{N_t}\omega_{i,t}\cdot r_{i,t+1}\right)\right] \end{aligned} \tag{1}\] where \(u(\cdot)\) denotes the utility function.
Where do the stock characteristics show up? We parameterize the optimal portfolio weights as a function of the stock characteristic \(x_{i,t}\) with the following linear specification for the portfolio weights: \[\omega_{i,t} = \bar{\omega}_{i,t} + \frac{1}{N_t}\theta'\hat{x}_{i,t}, \tag{2}\] where \(\bar{\omega}_{i,t}\) is a stock’s weight in a benchmark portfolio (we use the value-weighted or naive portfolio in the application below), \(\theta\) is a vector of coefficients which we are going to estimate, and \(\hat{x}_{i,t}\) are the characteristics of stock \(i\), cross-sectionally standardized to have zero mean and unit standard deviation.
Intuitively, the portfolio strategy is a form of active portfolio management relative to a performance benchmark. Deviations from the benchmark portfolio are derived from the individual stock characteristics. Note that by construction, the weights sum up to one as \(\sum_{i=1}^{N_t}\hat{x}_{i,t} = 0\) due to the standardization. Moreover, the coefficients are constant across assets and over time. The implicit assumption is that the characteristics fully capture all aspects of the joint distribution of returns that are relevant for forming optimal portfolios.
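The claim that the tilted weights still sum up to one can be checked numerically. The sketch below uses a hypothetical cross-section of five stocks with two randomly drawn characteristics, an arbitrary coefficient vector, and the naive benchmark; none of these numbers come from the CRSP data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical cross-section: 5 stocks, 2 characteristics
n_stocks = 5
x = rng.normal(size=(n_stocks, 2))

# Cross-sectional standardization (sample standard deviation)
x_hat = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)

theta = np.array([1.5, 1.5])                 # arbitrary coefficients
w_bench = np.full(n_stocks, 1 / n_stocks)    # naive benchmark weights
w = w_bench + x_hat @ theta / n_stocks       # Equation (2)

print(x_hat.sum(axis=0))  # numerically zero for each characteristic
print(w.sum())            # numerically one
```

Because each standardized characteristic sums to zero across stocks, the tilt redistributes weight between stocks without changing the total.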
We first implement cross-sectional standardization for the entire CRSP universe. We also keep track of (lagged) relative market capitalization relative_mktcap, which will represent the value-weighted benchmark portfolio, while n denotes the number of traded assets \(N_t\), which we use to construct the naive portfolio benchmark.
data_portfolios = (data_portfolios
  .groupby("date")
  .apply(lambda x: x.assign(
      relative_mktcap=x["mktcap_lag"]/x["mktcap_lag"].sum()
    )
  )
  .reset_index(drop=True)
  .set_index("date")
  .groupby(level="date")
  .transform(
    lambda x: (x-x.mean())/x.std() if x.name.endswith("lag") else x
  )
  .reset_index()
  .drop(["mktcap_lag"], axis=1)
)
Computing Portfolio Weights
Next, we move on to identify optimal choices of \(\theta\). We rewrite the optimization problem together with the weight parametrization and can then estimate \(\theta\) to maximize the objective function based on our sample \[\begin{aligned} E_t\left(u(r_{p, t+1})\right) = \frac{1}{T}\sum\limits_{t=0}^{T-1}u\left(\sum\limits_{i=1}^{N_t}\left(\bar{\omega}_{i,t} + \frac{1}{N_t}\theta'\hat{x}_{i,t}\right)r_{i,t+1}\right). \end{aligned} \tag{3}\] The allocation strategy is straightforward because the number of parameters to estimate is small. Instead of a tedious specification of the \(N_t\) dimensional vector of expected returns and the \(N_t(N_t+1)/2\) free elements of the covariance matrix, all we need to focus on in our application is the vector \(\theta\). \(\theta\) contains only two elements in our application: the relative deviation from the benchmark due to size and momentum.
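The sample objective in Equation (3) can be sketched in a few lines of NumPy. The example below is only an illustration under stated assumptions: the panel is randomly generated, the helper sample_objective() is hypothetical, and log utility serves as a placeholder for \(u(\cdot)\).

```python
import numpy as np

def sample_objective(theta, x_hat, w_bench, ret, u=np.log1p):
    """Average utility of the tilted portfolio return across dates.

    x_hat, w_bench, and ret are lists with one array per date.
    u is a placeholder utility applied to the portfolio return.
    """
    utilities = []
    for x_t, w_t, r_t in zip(x_hat, w_bench, ret):
        n_t = x_t.shape[0]
        weights = w_t + x_t @ theta / n_t    # benchmark plus tilt
        utilities.append(u(weights @ r_t))   # utility of portfolio return
    return np.mean(utilities)

# Hypothetical panel: two dates with 3 and 4 stocks, 2 characteristics each
rng = np.random.default_rng(1)
x_hat, w_bench, ret = [], [], []
for n_t in (3, 4):
    x = rng.normal(size=(n_t, 2))
    x_hat.append((x - x.mean(axis=0)) / x.std(axis=0, ddof=1))
    w_bench.append(np.full(n_t, 1 / n_t))
    ret.append(rng.normal(0.01, 0.05, size=n_t))

no_tilt = sample_objective(np.zeros(2), x_hat, w_bench, ret)
print(no_tilt)
```

With \(\theta = 0\) the tilt vanishes and the objective reduces to the average utility of the benchmark returns, which is a convenient sanity check.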
To get a feeling for the performance of such an allocation strategy, we start with an arbitrary initial vector \(\theta_0\). The next step is to choose \(\theta\) optimally to maximize the objective function. We automatically detect the number of parameters by counting the number of columns with lagged values. Note that the value for \(\theta\) of 1.5 is an arbitrary choice.
lag_columns = [i for i in data_portfolios.columns if "lag" in i]
n_parameters = len(lag_columns)
theta = pd.DataFrame({"theta": [1.5]*n_parameters}, index=lag_columns)
The function compute_portfolio_weights() below computes the portfolio weights \(\bar{\omega}_{i,t} + \frac{1}{N_t}\theta'\hat{x}_{i,t}\) according to our parametrization for a given value \(\theta_0\). Everything happens within a single pipeline. Hence, we provide a short walk-through.
We first compute characteristic_tilt, the tilting values \(\frac{1}{N_t}\theta'\hat{x}_{i, t}\) which represent the deviation from the benchmark portfolio. Next, we compute the benchmark portfolio weight_benchmark, which can be any reasonable set of portfolio weights. In our case, we choose either the value-weighted or the equal-weighted allocation. weight_tilt completes the picture and contains the final portfolio weights weight_tilt = weight_benchmark + characteristic_tilt, which deviate from the benchmark portfolio depending on the stock characteristics.
The final few lines go a bit further and implement a simple version of a no-short-sale constraint. While it is generally not straightforward to enforce portfolio weight constraints through the parametrization, we simply truncate negative portfolio weights at zero and then rescale the remaining weights so that they sum up to one again: \[\omega_{i,t}^+ = \frac{\max(0, \omega_{i,t})}{\sum_{j=1}^{N_t}\max(0, \omega_{j,t})}. \tag{4}\]
The following function computes the portfolio weights in the way just described.
def compute_portfolio_weights(theta,
                              data,
                              value_weighting=True,
                              allow_short_selling=True):
    """Compute portfolio weights for different strategies."""

    lag_columns = [i for i in data.columns if "lag" in i]
    theta = pd.DataFrame(theta, index=lag_columns)

    # Tilt per date, then benchmark and tilted weights
    data = (data
      .groupby("date")
      .apply(lambda x: x.assign(
          characteristic_tilt=x[theta.index] @ theta / x.shape[0]
        )
      )
      .reset_index(drop=True)
      .assign(
        # Note: x is the full panel here, so x.shape[0] is the total number
        # of rows rather than the per-date count N_t
        weight_benchmark=lambda x:
          x["relative_mktcap"] if value_weighting else 1/x.shape[0],
        weight_tilt=lambda x:
          x["weight_benchmark"] + x["characteristic_tilt"]
      )
      .drop(columns=["characteristic_tilt"])
    )

    # Truncate negative weights at zero if short selling is not allowed
    if not allow_short_selling:
        data = (data
          .assign(weight_tilt=lambda x: np.maximum(0, x["weight_tilt"]))
        )

    # Rescale the weights so that they sum up to one on each date
    data = (data
      .groupby("date")
      .apply(lambda x: x.assign(
        weight_tilt=lambda x: x["weight_tilt"]/x["weight_tilt"].sum()))
      .reset_index(drop=True)
    )

    return data
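The truncation and rescaling in Equation (4) can be illustrated on a single hypothetical weight vector. The numbers below are made up and chosen so that the arithmetic is easy to follow.

```python
import numpy as np

# Hypothetical tilted weights that sum up to one and include a short position
w = np.array([0.5, 0.4, 0.3, -0.2])

# Truncate negative weights at zero, then rescale to sum up to one
w_plus = np.maximum(0, w)
w_plus = w_plus / w_plus.sum()

print(w_plus)
print(w_plus.sum())
```

The short position is set to zero and the three long positions are scaled down proportionally, here by the factor 1/1.2.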
In the next step, we compute the portfolio weights for the arbitrary vector \(\theta_0\). In the example below, we use the value-weighted portfolio as a benchmark and allow negative portfolio weights.
weights_crsp = compute_portfolio_weights(
  theta,
  data_portfolios,
  value_weighting=True,
  allow_short_selling=True
)
Portfolio Performance
Are the computed weights optimal in any way? Most likely not, as we picked \(\theta_0\) arbitrarily. To evaluate the performance of an allocation strategy, one can think of many different approaches. In their original paper, Brandt, Santa-Clara, and Valkanov (2009) focus on a simple evaluation of the hypothetical utility of an agent equipped with a power utility function \[u_\gamma(r) = \frac{(1 + r)^{(1-\gamma)}}{1-\gamma}, \tag{5}\] where \(\gamma\) is the risk aversion factor.
def power_utility(r, gamma=5):
    """Calculate power utility for given risk aversion."""

    utility = ((1+r)**(1-gamma))/(1-gamma)

    return utility
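A quick numerical check helps to build intuition for Equation (5). The sketch below restates the function so that it runs on its own and compares the utility change from a hypothetical ten percent gain with that from a ten percent loss.

```python
def power_utility(r, gamma=5):
    """Power utility of a net return r for risk aversion gamma."""
    return ((1 + r) ** (1 - gamma)) / (1 - gamma)

u_zero = power_utility(0.0)                  # utility of a zero return
gain = power_utility(0.10) - u_zero          # utility gained from +10%
loss = u_zero - power_utility(-0.10)         # utility lost from -10%

print(u_zero, gain, loss)
```

With \(\gamma = 5\), a zero return has utility -0.25, and the loss in utility from a ten percent decline exceeds the gain from a ten percent increase, which reflects the concavity of the utility function.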
We want to note that Gehrig, Sögner, and Westerkamp (2020) warn that, in the leading case of constant relative risk aversion (CRRA), strong assumptions on the properties of the returns, the variables used to implement the parametric portfolio policy, and the parameter space are necessary to obtain a well-defined optimization problem.
No doubt, there are many other ways to evaluate a portfolio. The function below provides a summary of all kinds of interesting measures that can be considered relevant. Do we need all these evaluation measures? It depends: The original paper by Brandt, Santa-Clara, and Valkanov (2009) only cares about the expected utility to choose \(\theta\). However, if you want to choose optimal values that achieve the highest performance while putting some constraints on your portfolio weights, it is helpful to have everything in one function.
def evaluate_portfolio(weights_data,
                       full_evaluation=True,
                       capm_evaluation=True,
                       length_year=12):
    """Calculate portfolio evaluation measures."""

    # Monthly portfolio returns for the tilted and the benchmark weights
    evaluation = (weights_data
      .groupby("date")
      .apply(lambda x: pd.Series(
        np.average(x[["ret_excess", "ret_excess"]],
                   weights=x[["weight_tilt", "weight_benchmark"]],
                   axis=0),
        ["return_tilt", "return_benchmark"])
      )
      .reset_index()
      .melt(id_vars="date", var_name="model",
            value_vars=["return_tilt", "return_benchmark"],
            value_name="portfolio_return")
      .assign(model=lambda x: x["model"].str.replace("return_", ""))
    )

    # Utility and annualized return statistics
    evaluation_stats = (evaluation
      .groupby("model")["portfolio_return"]
      .aggregate([
        ("Expected utility", lambda x: np.mean(power_utility(x))),
        ("Average return", lambda x: np.mean(length_year*x)*100),
        ("SD return", lambda x: np.std(x)*np.sqrt(length_year)*100),
        ("Sharpe ratio", lambda x: (np.mean(x)/np.std(x)*
                                      np.sqrt(length_year)))
      ])
    )

    if capm_evaluation:
        # The formula interface labels the intercept "Intercept", which is
        # the CAPM alpha; the "const" key below leaves that label unchanged
        evaluation_capm = (evaluation
          .merge(factors_ff_monthly, how="left", on="date")
          .groupby("model")
          .apply(lambda x:
            smf.ols(formula="portfolio_return ~ 1 + mkt_excess", data=x)
            .fit().params
          )
          .rename(columns={"const": "CAPM alpha",
                           "mkt_excess": "Market beta"})
        )
        evaluation_stats = (evaluation_stats
          .merge(evaluation_capm, how="left", on="model")
        )

    if full_evaluation:
        # Weight statistics per date, averaged over time (in percent)
        evaluation_weights = (weights_data
          .melt(id_vars="date", var_name="model",
                value_vars=["weight_benchmark", "weight_tilt"],
                value_name="weight")
          .groupby(["model", "date"])["weight"]
          .aggregate([
            ("Mean abs. weight", lambda x: np.mean(abs(x))),
            ("Max. weight", lambda x: max(x)),
            ("Min. weight", lambda x: min(x)),
            ("Avg. sum of neg. weights", lambda x: -np.sum(x[x < 0])),
            ("Avg. share of neg. weights", lambda x: np.mean(x < 0))
          ])
          .reset_index()
          .drop(columns=["date"])
          .groupby(["model"])
          .aggregate(lambda x: np.average(x)*100)
          .reset_index()
          .assign(model=lambda x: x["model"].str.replace("weight_", ""))
        )

        evaluation_stats = (evaluation_stats
          .merge(evaluation_weights, how="left", on="model")
          .set_index("model")
        )

    evaluation_stats = (evaluation_stats
      .transpose()
      .rename_axis(columns=None)
    )

    return evaluation_stats
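The annualization used for the return statistics can be reproduced by hand on a short series. The monthly returns below are hypothetical; the formulas mirror those in the function above, with np.std applied to a plain array using its default (population) definition.

```python
import numpy as np

# Hypothetical monthly excess returns
monthly = np.array([0.02, -0.01, 0.03, 0.00, 0.01, -0.02])
length_year = 12

# Annualized average return and standard deviation in percent
average_return = np.mean(length_year * monthly) * 100
sd_return = np.std(monthly) * np.sqrt(length_year) * 100

# Annualized Sharpe ratio
sharpe_ratio = np.mean(monthly) / np.std(monthly) * np.sqrt(length_year)

print(average_return, sd_return, sharpe_ratio)
```

The mean scales with the number of periods per year and the standard deviation with its square root, so the annualized Sharpe ratio equals the ratio of the two annualized figures.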
Let us take a look at the different portfolio strategies and evaluation measures.
evaluate_portfolio(weights_crsp).round(2)
| | benchmark | tilt |
---|---|---|
Expected utility | -0.25 | -0.26 |
Average return | 6.87 | 0.54 |
SD return | 15.46 | 21.18 |
Sharpe ratio | 0.44 | 0.03 |
Intercept | 0.00 | -0.00 |
Market beta | 0.99 | 0.94 |
Mean abs. weight | 0.03 | 0.08 |
Max. weight | 4.09 | 4.25 |
Min. weight | 0.00 | -0.17 |
Avg. sum of neg. weights | 0.00 | 78.13 |
Avg. share of neg. weights | 0.00 | 49.06 |
The value-weighted portfolio delivers an annualized return of more than six percent and clearly outperforms the tilted portfolio, irrespective of whether we evaluate expected utility, the Sharpe ratio, or the CAPM alpha. We can conclude the market beta is close to one for both strategies (naturally almost identically one for the value-weighted benchmark portfolio). When it comes to the distribution of the portfolio weights, we see that the benchmark portfolio weight takes less extreme positions (lower average absolute weights and lower maximum weight). By definition, the value-weighted benchmark does not take any negative positions, while the tilted portfolio also takes short positions.
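The CAPM alpha and market beta come from a regression of portfolio excess returns on market excess returns. As a minimal sketch of that regression, the example below simulates hypothetical returns with known coefficients and recovers them by least squares with NumPy instead of statsmodels; all numbers are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical monthly market excess returns and a portfolio with known
# alpha and beta plus idiosyncratic noise
n_months = 240
mkt_excess = rng.normal(0.005, 0.04, size=n_months)
alpha_true, beta_true = 0.001, 0.9
portfolio_return = (alpha_true + beta_true * mkt_excess
                    + rng.normal(0, 0.01, size=n_months))

# Least-squares fit of portfolio_return on a constant and mkt_excess
design = np.column_stack([np.ones(n_months), mkt_excess])
alpha_hat, beta_hat = np.linalg.lstsq(design, portfolio_return, rcond=None)[0]

print(alpha_hat, beta_hat)
```

The estimated intercept corresponds to the monthly CAPM alpha and the slope to the market beta.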
Optimal Parameter Choice
Next, we move to a choice of \(\theta\) that actually aims to improve some (or all) of the performance measures. We first define the helper function objective_function(), which we then pass to an optimizer.
def objective_function(theta,
                       data,
                       objective_measure="Expected utility",
                       value_weighting=True,
                       allow_short_selling=True):
    """Define portfolio objective function."""

    processed_data = compute_portfolio_weights(
      theta, data, value_weighting, allow_short_selling
    )

    objective_function = evaluate_portfolio(
      processed_data,
      capm_evaluation=False,
      full_evaluation=False
    )

    objective_function = -objective_function.loc[objective_measure, "tilt"]

    return objective_function
You may wonder why we return the negative value of the objective function. This is simply due to the common convention for optimization procedures to search for minima by default. By minimizing the negative value of the objective function, we obtain the maximum as a result. In its most basic form, optimization with scipy.optimize uses the function minimize(). As main inputs, the function requires an initial guess of the parameters and the objective function to minimize. Now, we are fully equipped to compute the optimal values of \(\hat\theta\), which maximize the hypothetical expected utility of the investor.
optimal_theta = minimize(
  fun=objective_function,
  x0=[1.5]*n_parameters,
  args=(data_portfolios, "Expected utility", True, True),
  method="Nelder-Mead",
  tol=1e-2
)

(pd.DataFrame(
  optimal_theta.x,
  columns=["Optimal theta"],
  index=["momentum_lag", "size_lag"]).T.round(3)
)
| | momentum_lag | size_lag |
---|---|---|
Optimal theta | 0.301 | -1.705 |
The resulting values of \(\hat\theta\) are easy to interpret: intuitively, expected utility increases by tilting weights from the value-weighted portfolio toward smaller stocks (negative coefficient for size) and toward past winners (positive value for momentum). Both findings are in line with the well-documented size effect (Banz 1981) and the momentum anomaly (Jegadeesh and Titman 1993).
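The sign convention for the optimizer can be illustrated on a toy problem that is unrelated to the portfolio data. The function toy_utility() and its maximizer are hypothetical; the point is only that minimizing the negative of a concave function recovers its maximum.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical concave objective with its maximum at (0.5, -1.0)
target = np.array([0.5, -1.0])

def toy_utility(theta):
    return -np.sum((np.asarray(theta) - target) ** 2)

# Maximize toy_utility by minimizing its negative
result = minimize(
    fun=lambda theta: -toy_utility(theta),
    x0=[1.5, 1.5],
    method="Nelder-Mead",
    tol=1e-6,
)

print(result.x.round(3))
```

A tighter tolerance makes the simplex search continue for more iterations, so the reported optimum can depend on the chosen value of tol.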
More Model Specifications
How does the portfolio perform for different model specifications? For this purpose, we compute the performance of a number of different modeling choices based on the entire CRSP sample. The next code chunk performs all the heavy lifting.
def evaluate_optimal_performance(data,
                                 objective_measure="Expected utility",
                                 value_weighting=True,
                                 allow_short_selling=True):
    """Calculate optimal portfolio performance."""

    optimal_theta = minimize(
      fun=objective_function,
      x0=[1.5]*n_parameters,
      args=(data, objective_measure, value_weighting, allow_short_selling),
      method="Nelder-Mead",
      tol=10e-2
    ).x

    processed_data = compute_portfolio_weights(
      optimal_theta, data,
      value_weighting, allow_short_selling
    )

    portfolio_evaluation = evaluate_portfolio(processed_data)

    weight_text = "VW" if value_weighting else "EW"
    short_text = "" if allow_short_selling else " (no s.)"

    strategy_name_dict = {
      "benchmark": weight_text,
      "tilt": f"{weight_text} Optimal{short_text}"
    }

    portfolio_evaluation.columns = [
      strategy_name_dict[i] for i in portfolio_evaluation.columns
    ]

    return(portfolio_evaluation)
Finally, we can compare the results. The table below shows summary statistics for all possible combinations: equal- or value-weighted benchmark portfolio, with or without short-selling constraints, and tilted toward maximizing expected utility.
data = [data_portfolios]
value_weighting = [True, False]
allow_short_selling = [True, False]
objective_measure = ["Expected utility"]

permutations = product(
  data, objective_measure,
  value_weighting, allow_short_selling
)

results = list(starmap(
  evaluate_optimal_performance,
  permutations
))

performance_table = (pd.concat(results, axis=1)
  .T.drop_duplicates().T.round(3)
)

performance_table.get(["EW", "VW"])
| | EW | VW |
---|---|---|
Expected utility | -0.251 | -0.250 |
Average return | 10.011 | 6.867 |
SD return | 20.472 | 15.461 |
Sharpe ratio | 0.489 | 0.444 |
Intercept | 0.002 | 0.000 |
Market beta | 1.130 | 0.994 |
Mean abs. weight | 0.000 | 0.030 |
Max. weight | 0.000 | 4.091 |
Min. weight | 0.000 | 0.000 |
Avg. sum of neg. weights | 0.000 | 0.000 |
Avg. share of neg. weights | 0.000 | 0.000 |
performance_table.get(["EW Optimal", "VW Optimal"])
| | EW Optimal | VW Optimal |
---|---|---|
Expected utility | -4.626 | -0.261 |
Average return | -4887.150 | 0.537 |
SD return | 14403.879 | 21.176 |
Sharpe ratio | -0.339 | 0.025 |
Intercept | -3.613 | -0.005 |
Market beta | -81.770 | 0.943 |
Mean abs. weight | 60.116 | 0.077 |
Max. weight | 1009.255 | 4.254 |
Min. weight | -213.240 | -0.173 |
Avg. sum of neg. weights | 75809.033 | 78.130 |
Avg. share of neg. weights | 51.745 | 49.064 |
performance_table.get(["EW Optimal (no s.)", "VW Optimal (no s.)"])
| | EW Optimal (no s.) | VW Optimal (no s.) |
---|---|---|
Expected utility | -0.252 | -0.250 |
Average return | 7.973 | 7.415 |
SD return | 19.149 | 16.706 |
Sharpe ratio | 0.416 | 0.444 |
Intercept | 0.000 | 0.000 |
Market beta | 1.136 | 1.055 |
Mean abs. weight | 0.030 | 0.030 |
Max. weight | 1.306 | 2.351 |
Min. weight | 0.000 | 0.000 |
Avg. sum of neg. weights | 0.000 | 0.000 |
Avg. share of neg. weights | 0.000 | 0.000 |
The results indicate that the annualized Sharpe ratio of the equal-weighted benchmark (0.489) exceeds the Sharpe ratio of the value-weighted benchmark (0.444). In the tables above, however, tilting does not improve on either benchmark. The unconstrained tilted portfolios take large short positions and deliver much lower Sharpe ratios (0.025 with the value-weighted and -0.339 with the equal-weighted benchmark), and the equal-weighted tilt ends up with extreme positions. Imposing the no-short-sale constraint brings the tilted portfolios back close to their benchmarks (Sharpe ratios of 0.444 and 0.416), but does not lift them above the benchmarks in our application.
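The grid of specifications above is built with product() and evaluated with starmap() from itertools. The following sketch shows the same pattern with a hypothetical helper label() that only constructs the column names, so that it runs without any data.

```python
from itertools import product, starmap

def label(value_weighting, allow_short_selling):
    """Build the strategy name for one specification (illustrative helper)."""
    weight_text = "VW" if value_weighting else "EW"
    short_text = "" if allow_short_selling else " (no s.)"
    return f"{weight_text} Optimal{short_text}"

# All combinations of the two boolean settings, in the order product() yields
combinations = product([True, False], [True, False])

# starmap() unpacks each tuple into the positional arguments of label()
labels = list(starmap(label, combinations))
print(labels)
# ['VW Optimal', 'VW Optimal (no s.)', 'EW Optimal', 'EW Optimal (no s.)']
```

Because starmap() passes each tuple positionally, the order of the iterables in product() has to match the order of the function's parameters.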
Exercises
- How do the estimated parameters \(\hat\theta\) and the portfolio performance change if your objective is to maximize the Sharpe ratio instead of the hypothetical expected utility?
- The code above is very flexible in the sense that you can easily add new firm characteristics. Construct a new characteristic of your choice and evaluate the corresponding coefficient \(\hat\theta_i\).
- Tweak the function optimal_theta() such that you can impose additional performance constraints in order to determine \(\hat\theta\), which maximizes expected utility under the constraint that the market beta is below 1.
- Does the portfolio performance resemble a realistic out-of-sample backtesting procedure? Verify the robustness of the results by first estimating \(\hat\theta\) based on past data only. Then, use more recent periods to evaluate the actual portfolio performance.
- By formulating the portfolio problem as a statistical estimation problem, you can easily obtain standard errors for the coefficients of the weight function. Brandt, Santa-Clara, and Valkanov (2009) provide the relevant derivations in their paper in Equation (10). Implement a small function that computes standard errors for \(\hat\theta\).