Suppose you've written an algorithm that's developed a signal (for example, a Pipeline Factor) that's predictive of forward returns over some time horizon. You might think that the hard part is over, but you're still left with the daunting task of translating your signal into an algorithm that can turn a profit while also managing risk exposures.

Traditionally in quantitative finance, the solution to the problem of maximizing returns while constraining risk has been to employ some form of Portfolio Optimization, but performing sophisticated optimizations is challenging on today's Quantopian.

Python libraries like scipy.optimize, CVXOPT, and CVXPY (all available on Quantopian today) provide generic tools for solving optimization problems. These libraries are powerful and flexible, but it takes significant expertise to convert the data structures available on Quantopian into the specific formats understood by these libraries.

Algorithm authors who want to perform even simple optimizations spend much of their time figuring out how to encode conceptually simple ideas like "constrain gross leverage" into complicated matrices that are hard to interpret and hard to debug when something goes wrong.

The open-source Python ecosystem already provides excellent implementations of the algorithms needed to implement a quality portfolio optimization library. What's missing is an interface that conveniently maps domain concepts from finance to the low-level primitives exposed by existing libraries.

Today we're announcing **quantopian.experimental.optimize**, a suite of new tools designed to make portfolio optimization on Quantopian more accessible to algorithm authors. As the name suggests, all the functions and classes currently available are experimental.

**NOTE: The optimization API is only available in notebooks for this release.** Work is currently ongoing to make optimization available in the backtester. See the **Next Steps** section at the bottom of this notebook for more details.

Many algorithms written on Quantopian reach a point where they need to solve an **optimization problem** of the following form:

Given a list of stocks, choose a vector $w$ of weights in each stock that maximizes (or minimizes) an **objective** function, subject to a set of inequality **constraints**.

An algorithm builds a model that predicts expected returns for a list of stocks. The algorithm wants to allocate a limited amount of capital to those stocks in a way that gives it the greatest possible expected return without placing too big a bet on any single stock.

We can frame this as an optimization problem as follows:

Given an expected returns vector $r = [r_0, r_1, \dots, r_n]$, a maximum single-stock weight $w_{max}$, and a maximum total weight $W_{max}$, find a portfolio weight vector $w = [w_0, w_1, \dots, w_n]$ that solves the following system:

\begin{align}
\text{maximize}&& \\
&&r \cdot w \\
\text{subject to}&& \\
&&w_i &\leq w_{max} && 0 \leq i \leq n\\
&& \sum_{i=0}^{n}{|w_i|} &\leq W_{max}
\end{align}

A more sophisticated algorithm might introduce a more complex objective (e.g. "maximize return with a penalty applied for expected volatility"), or it might introduce more complex constraints (e.g. limit exposure to particular market sectors).
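To make the simple formulation above concrete, here is a minimal pure-Python sketch for a toy instance. It is not how `optimize` solves the problem; it relies on the fact that, for this particular linear program, spending the leverage budget on the largest $|r_i|$ first is optimal. The function name `greedy_max_return`, the tickers, and the numbers are all hypothetical.

```python
def greedy_max_return(expected_returns, w_max, total_max):
    """Maximize r . w subject to w_i <= w_max and sum(|w_i|) <= total_max.

    Illustrative sketch only. Assumes w_max >= 0 and total_max >= 0.
    """
    weights = {name: 0.0 for name in expected_returns}
    remaining = total_max
    # Visit assets from the strongest signal to the weakest.
    ranked = sorted(
        expected_returns,
        key=lambda name: abs(expected_returns[name]),
        reverse=True,
    )
    for name in ranked:
        r = expected_returns[name]
        if r == 0 or remaining <= 0:
            break
        # Longs are capped at w_max. The formulation puts no per-stock
        # floor on shorts, so only the total budget limits them.
        cap = w_max if r > 0 else total_max
        size = min(cap, remaining)
        weights[name] = size if r > 0 else -size
        remaining -= size
    return weights


# Hypothetical expected returns for four made-up tickers.
toy_returns = {'AAA': 0.03, 'BBB': -0.01, 'CCC': 0.02, 'DDD': 0.005}
toy_weights = greedy_max_return(toy_returns, w_max=0.4, total_max=1.0)
```

With these inputs the two strongest longs are each capped at 40%, the remaining 20% of the budget goes to a short in `BBB`, and `DDD` receives nothing.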

The `optimize` module has three major components in this release:

- `calculate_optimal_portfolio`, a top-level entrypoint.
- `Objective` classes, representing functions to be minimized or maximized by the optimizer.
- `Constraint` classes, representing constraints to be enforced by the optimizer.

Lists of the currently available objectives and constraints can be found under `optimize.objectives` and `optimize.constraints`, respectively.

To run a portfolio optimization, you call `calculate_optimal_portfolio` and provide three values:

- An `Objective` to optimize.
- A list of `Constraints` to enforce.
- A pandas Series containing weights for the current portfolio. The index of the current portfolio series defines the assets that are allowed in the target portfolio.

In [1]:

```
import quantopian.experimental.optimize as opt
print opt.calculate_optimal_portfolio.__doc__
```

In [2]:

```
opt.objectives
```

Out[2]:

In [3]:

```
opt.constraints
```

Out[3]:

Let's look at how we can use the `optimize` API to solve the example outlined above.

In [4]:

```
import numpy as np
import pandas as pd

def draw_asset_barplot(weights, title, plot_kwargs=None):
    """Draw a bar plot from a Series with Asset labels."""
    return draw_barplot(
        weights,
        title,
        xticks=[asset.symbol for asset in weights.index],
        plot_kwargs=plot_kwargs,
    )

def draw_barplot(weights, title, xticks=None, plot_kwargs=None):
    """Draw a bar plot from a pd.Series."""
    # Draw the plot. Forward plot_kwargs (if provided) as keywords to ``plot``.
    axes = weights.plot(kind='bar', rot=0, fontsize=12, **(plot_kwargs or {}))
    axes.grid(False, axis='x')
    # Set a title.
    axes.set_title(title, {'fontsize': 14})
    # Set xtick labels, if provided.
    if xticks is not None:
        axes.set_xticklabels(xticks)
    return axes
```

In [5]:

```
# Choose a small universe to make the graphs manageable.
universe = symbols(
    ['AAPL', 'MSFT', 'TWTR', 'BP', 'XOM', 'MCD', 'QSR'],
    symbol_reference_date='2016-10-07',
)

def fancy_returns_model(assets):
    """Not actually fancy."""
    rng = np.random.RandomState(5)
    return pd.Series(index=assets, data=rng.randn(len(assets)))

empty_portfolio = pd.Series(index=universe, data=0.0)
expected_returns = fancy_returns_model(universe)
```

In [6]:

```
draw_asset_barplot(expected_returns, 'Expected Returns');
```

We'll use **opt.MaximizeAlpha** as our objective function.

In [7]:

```
print opt.MaximizeAlpha.__doc__
```

`MaximizeAlpha` warns us that we need to put a constraint on the gross leverage of our portfolio or else the optimizer will try to allocate an unbounded amount of capital, so we'll add a **MaxGrossLeverage** constraint as well.
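As a rough illustration of why an uncapped linear objective is unbounded, the sketch below computes gross leverage, net leverage, and the linear objective for a toy portfolio, then doubles every weight. The helper names, tickers, and numbers are hypothetical and are not part of the `optimize` API.

```python
def gross_leverage(weights):
    """Sum of absolute position weights (longs and shorts both count)."""
    return sum(abs(w) for w in weights.values())


def net_leverage(weights):
    """Signed sum of position weights (shorts offset longs)."""
    return sum(weights.values())


def expected_return(weights, alphas):
    """The linear objective: sum of weight times expected return."""
    return sum(weights[name] * alphas[name] for name in weights)


# Hypothetical two-stock example: long AAA, short BBB.
alphas = {'AAA': 0.02, 'BBB': -0.01}
base = {'AAA': 0.5, 'BBB': -0.5}
doubled = {name: 2 * w for name, w in base.items()}

# Doubling every weight doubles both the objective and the gross leverage,
# so without a leverage cap the objective can always be improved.
base_alpha = expected_return(base, alphas)
doubled_alpha = expected_return(doubled, alphas)
```

Here the base portfolio has gross leverage 1.0 and net leverage 0.0; the doubled portfolio has gross leverage 2.0 and twice the expected return.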

In [8]:

```
print opt.MaxGrossLeverage.__doc__
```

In [9]:

```
def optimal_portfolio_constrained_leverage_only():
    """Calculate the optimal portfolio if we're only constrained by gross leverage."""
    # Our objective function.
    objective = opt.MaximizeAlpha(expected_returns)
    # A list containing our constraints.
    constraints = [opt.MaxGrossLeverage(1.0)]
    return opt.calculate_optimal_portfolio(objective, constraints, empty_portfolio)
```

In [10]:

```
constrained_leverage_pf = optimal_portfolio_constrained_leverage_only()
draw_asset_barplot(constrained_leverage_pf, 'Optimal Portfolio (Constrained Leverage Only)');
```

With no constraint other than the gross leverage cap, the optimizer decided to allocate all of our capital to a single long position in `TWTR`. While this might be an acceptable portfolio for someone of a gambling disposition, we'd probably rather not put all our eggs in one basket.
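The all-in result is what a linear objective produces when gross leverage is the only limit: any diversified portfolio earns a weighted average of the individual expected returns, which cannot beat the single best one. A small sketch with hypothetical tickers and numbers (none of these names come from `optimize`):

```python
def expected_return(weights, alphas):
    """Linear objective: sum of weight times expected return."""
    return sum(weights[name] * alphas[name] for name in weights)


# Hypothetical expected returns for three made-up tickers.
alphas = {'AAA': 0.03, 'BBB': 0.01, 'CCC': 0.02}

# Put the whole leverage budget of 1.0 into the strongest signal...
best = max(alphas, key=lambda name: abs(alphas[name]))
concentrated = {name: 0.0 for name in alphas}
concentrated[best] = 1.0 if alphas[best] > 0 else -1.0

# ...and compare against spreading the same budget evenly.
equal_weight = {name: 1.0 / len(alphas) for name in alphas}

concentrated_alpha = expected_return(concentrated, alphas)
equal_weight_alpha = expected_return(equal_weight, alphas)
```

In this toy case the concentrated portfolio scores 0.03 against roughly 0.02 for the equal-weight one, which is why a diversification constraint has to be added explicitly.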

We can force the optimizer to diversify our portfolio by adding a ** PositionConcentration** constraint:

In [11]:

```
print opt.PositionConcentration.__doc__
```

In [12]:

```
# Allow 15% of our liquidation value to be in any single short position.
# Allow 30% of our liquidation value to be in any single long position.
# Try changing these parameters to see how they change the results!
MIN_POSITION_WEIGHT = -0.15
MAX_POSITION_WEIGHT = 0.30
def optimal_portfolio_constrained_leverage_and_concentration():
    """Calculate the optimal portfolio if we're constrained by leverage and concentration."""
    objective = opt.MaximizeAlpha(expected_returns)
    constraints = [
        opt.MaxGrossLeverage(1.0),
        opt.PositionConcentration.from_constants(min=MIN_POSITION_WEIGHT, max=MAX_POSITION_WEIGHT)
    ]
    return opt.calculate_optimal_portfolio(objective, constraints, empty_portfolio)
```

In [13]:

```
constrained_concentration_pf = optimal_portfolio_constrained_leverage_and_concentration()
plot = draw_asset_barplot(
    constrained_concentration_pf,
    "Optimal Portfolio (Constrained Leverage and Concentration)",
    {'ylim': (-MAX_POSITION_WEIGHT - 0.1, MAX_POSITION_WEIGHT + 0.1)},
)
plot.axhline(MIN_POSITION_WEIGHT, color='r', linestyle='dashed')
plot.axhline(MAX_POSITION_WEIGHT, color='r', linestyle='dashed');
```

With the concentration constraint in place, the optimizer capped the long allocations to `TWTR` and `GM` at 30% of NLV (Net Liquidation Value), and it was forced to cap the short allocations to `MCD` and `BK` at 15% NLV. The remaining 10% under the gross leverage cap went to `AAPL`.

Eyeballing these distributions a bit, we might notice that we're heavily exposed to the technology and food service industries. We can confirm that concern by using `Pipeline` to grab sector codes for our portfolio.

In [14]:

```
from quantopian.pipeline import Pipeline
from quantopian.pipeline.classifiers.morningstar import Sector
from quantopian.pipeline.filters import SpecificAssets
from quantopian.research import run_pipeline

def get_sector_names(stocks, date):
    """Get sector names for a set of stocks for a particular date."""
    result = run_pipeline(
        Pipeline({'sector': Sector()}, screen=SpecificAssets(stocks)),
        start_date=date,
        end_date=date
    )
    # result will have a pd.MultiIndex whose first level just contains `date`.
    # Drop the first level so that we just get assets as our index.
    sector_codes = result['sector'].reset_index(level=0, drop=True)
    # Convert integral sector codes to human-readable strings by doing a lookup
    # from Sector.SECTOR_NAMES, which is a dict mapping codes to strings.
    #
    # This is purely for readability of the charts below.
    # We could proceed with the rest of the notebook using sector codes and it would work just fine.
    return sector_codes.map(Sector.SECTOR_NAMES)
```

In [15]:

```
sector_names = get_sector_names(universe, '2016-10-07')
sector_names
```

Out[15]:

In [16]:

```
def plot_sector_exposure(pf, sectors):
    return draw_barplot(
        pf.groupby(sectors).sum(),
        'Portfolio Concentration by Sector',
    )

plot_sector_exposure(constrained_concentration_pf, sector_names);
```

As we suspected, we're heavily exposed to the technology sector, and we're moderately exposed to cyclical consumer goods.

We can tell the optimizer to try to build a sector neutral portfolio by adding a `NetPartitionExposure` constraint:

In [17]:

```
print opt.NetPartitionExposure.__doc__
```
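Conceptually, a net-exposure constraint over a partition bounds the signed sum of weights inside each group. The following pure-Python sketch mirrors the `groupby(...).sum()` step used for the sector chart above and then checks each group against a band. It is an illustration of the idea only, not the implementation of `NetPartitionExposure`; the helper names, tickers, labels, and weights are hypothetical.

```python
from collections import defaultdict


def net_exposure_by_label(weights, labels):
    """Sum signed weights within each label (e.g. each sector)."""
    totals = defaultdict(float)
    for name, weight in weights.items():
        totals[labels[name]] += weight
    return dict(totals)


def within_bounds(exposures, min_weight, max_weight):
    """True if every group's net exposure lies in [min_weight, max_weight]."""
    return all(min_weight <= e <= max_weight for e in exposures.values())


# Hypothetical four-stock portfolio split across two made-up sectors.
toy_labels = {'AAA': 'TECH', 'BBB': 'TECH', 'CCC': 'ENERGY', 'DDD': 'ENERGY'}
toy_weights = {'AAA': 0.3, 'BBB': -0.1, 'CCC': 0.2, 'DDD': -0.2}

exposures = net_exposure_by_label(toy_weights, toy_labels)
is_neutral = within_bounds(exposures, -0.0001, 0.0001)
```

In this toy portfolio the energy pair nets to zero, but the technology group is net long by about 20%, so the neutrality check fails.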

In [18]:

```
ALL_SECTOR_NAMES = Sector.SECTOR_NAMES.values()

def optimal_portfolio_sector_neutral():
    """
    Calculate the optimal portfolio if we're constrained by leverage, concentration,
    and sector neutrality.
    """
    objective = opt.MaximizeAlpha(expected_returns)
    constraints = [
        opt.MaxGrossLeverage(1.0),
        opt.PositionConcentration.from_constants(min=MIN_POSITION_WEIGHT, max=MAX_POSITION_WEIGHT),
        opt.NetPartitionExposure(
            labels=sector_names,
            min_weights={sector: -0.0001 for sector in ALL_SECTOR_NAMES},
            max_weights={sector: 0.0001 for sector in ALL_SECTOR_NAMES},
        )
    ]
    return opt.calculate_optimal_portfolio(objective, constraints, empty_portfolio)
```

In [19]:

```
optimal_sector_neutral_pf = optimal_portfolio_sector_neutral()
plot = draw_asset_barplot(
    optimal_sector_neutral_pf,
    'Optimal Portfolio (Sector Neutral)',
    {'ylim': (-MAX_POSITION_WEIGHT - 0.1, MAX_POSITION_WEIGHT + 0.1)},
)
plot.axhline(MIN_POSITION_WEIGHT, color='r', linestyle='dashed')
plot.axhline(MAX_POSITION_WEIGHT, color='r', linestyle='dashed');
```

In [20]:

```
plot_sector_exposure(optimal_sector_neutral_pf, sector_names);
```

With the sector constraint in place, the optimizer hedges our long `TWTR` position with shorts in `AAPL` and `MSFT`. The optimizer also reins in the large long in `MCD` from 30% to 15%, because our 15% short exposure constraint prevents us from getting an equal-sized hedge in `QSR`, which is the only other `CONSUMER_CYCLICAL` stock in our universe. That leaves 10% of our capital unallocated, so the optimizer divides it evenly into a pair on `BP` and `XOM`.

Many algorithms written on Quantopian today perform stock selection via a process other than explicit returns/alpha-modeling. Pair-trading algorithms, for example, work by identifying pairs of cointegrated stocks and entering into a long position in one half of the pair and a short position in the other half of the pair.

For these existing algorithms (or for new algorithms written in a style other than alpha-modeling), it may not be natural to express portfolio construction in terms of an explicit alpha-maximization problem. Nevertheless, it may still be useful to frame the portfolio construction process as an optimization problem using different objectives and constraints.

Suppose we have an algorithm that trades in four pairs and periodically decides which of the pairs it should allocate capital toward. Via its own internal logic, the algorithm arrives at a "target portfolio" that expresses its ideal holdings. The algorithm wants to trade from its current holdings into its ideal holdings, but it doesn't want to turn over its entire portfolio at once because doing so may incur excessive transaction costs. The algorithm also wants to ensure that it maintains a net exposure of 0 in all of its pairs.

We can frame this as an optimization problem as follows:

Minimize the distance between the new portfolio and the ideal portfolio, subject to the constraints that:

- The long components of our pairs have weights $\geq$ 0.
- The short components of our pairs have weights $\leq$ 0.
- The sum of the weights in each of our pairs is 0.
- The sum of the absolute values of the changes in weight across all stocks is less than our max turnover.

In fancy math language, we can say the following:

Given a vector $W_{current}$ of current portfolio weights, a vector $W_{ideal}$ of ideal portfolio weights, a maximum turnover rate $T_{max}$, and a list of pairs of long/short stocks $P = [(l_1, s_1), (l_2, s_2), \dots, (l_n, s_n)]$, choose a vector $w$ of weights that solves the following system:

\begin{align}
\text{minimize}&& \\
&&||w - W_{ideal}||_2 \\
\text{subject to}&& \\
&&w_l &\geq 0&& l \in [l_1, l_2, \dots, l_n] \\
&&w_s &\leq 0&& s \in [s_1, s_2, \dots, s_n] \\
&&w_l + w_s &= 0&& (l, s) \in P \\
&&||w - W_{current}||_1 &\leq T_{max}&& \\
\end{align}

(In the above, $||v||_1$ is the $L^1$ norm of $v$, or the "Taxicab Norm", and $||v||_2$ is the $L^2$ norm of $v$, i.e., the standard Euclidean Norm.)
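The pieces of this system are easy to compute by hand for small cases. The sketch below, with hypothetical helper names and made-up tickers, evaluates the two norms and checks whether a candidate weight vector satisfies the sign, pair, and turnover constraints. It only tests feasibility; it does not search for the optimum.

```python
import math


def l1_norm(values):
    """Taxicab norm: sum of absolute values."""
    return sum(abs(x) for x in values)


def l2_norm(values):
    """Euclidean norm: square root of the sum of squares."""
    return math.sqrt(sum(x * x for x in values))


def is_feasible(w, w_current, pairs, max_turnover, tol=1e-9):
    """Check the sign, pair-neutrality, and turnover constraints."""
    for long_leg, short_leg in pairs:
        if w[long_leg] < -tol or w[short_leg] > tol:
            return False  # A leg is on the wrong side of zero.
        if abs(w[long_leg] + w[short_leg]) > tol:
            return False  # The pair does not net to zero.
    turnover = l1_norm([w[name] - w_current[name] for name in w])
    return turnover <= max_turnover + tol


# Hypothetical example with two pairs of made-up tickers.
toy_pairs = [('L1', 'S1'), ('L2', 'S2')]
toy_current = {'L1': 0.25, 'S1': -0.25, 'L2': 0.0, 'S2': 0.0}
toy_candidate = {'L1': 0.15, 'S1': -0.15, 'L2': 0.10, 'S2': -0.10}

# Each of the four weights moves by about 0.10, so turnover is about 0.40.
ok_loose = is_feasible(toy_candidate, toy_current, toy_pairs, max_turnover=0.5)
ok_tight = is_feasible(toy_candidate, toy_current, toy_pairs, max_turnover=0.3)
```

The candidate passes with a 50% turnover limit and fails with a 30% limit, since moving all four weights by roughly 0.10 costs about 40% turnover.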

We can express this problem using the `TargetPortfolio` objective and the `Pair` and `MaxTurnover` constraints.

In [21]:

```
from itertools import chain

pairs = [
    symbols(['AAPL', 'MSFT']),
    symbols(['GM', 'F']),
    symbols(['MCD', 'QSR']),
    symbols(['BP', 'XOM']),
]
# Flatten the list of pairs into a single list for use as an index.
all_pair_stocks = list(chain.from_iterable(pairs))
# Initially allocate half of our capital in the AAPL/MSFT pair and half in the GM/F pair.
initial_pairs_portfolio = pd.Series({
    pairs[0][0]: 0.25,   # AAPL
    pairs[0][1]: -0.25,  # MSFT
    pairs[1][0]: 0.25,   # GM
    pairs[1][1]: -0.25,  # F
}).reindex(
    all_pair_stocks,
    fill_value=0.0,
)
draw_asset_barplot(initial_pairs_portfolio, 'Initial Pairs Portfolio');
```

Now suppose the algorithm decides to scale back its position in the `AAPL/MSFT` pair and enter the `MCD/QSR` and `BP/XOM` pairs. It might calculate a new target portfolio that's equally-weighted between the three new pairs.

In [22]:

```
target_pairs_portfolio = pd.Series({
    pairs[0][0]: 0.16666,
    pairs[0][1]: -0.16666,
    pairs[2][0]: 0.16666,
    pairs[2][1]: -0.16666,
    pairs[3][0]: 0.16666,
    pairs[3][1]: -0.16666,
}).reindex(
    all_pair_stocks,
    fill_value=0.0,
)
draw_asset_barplot(target_pairs_portfolio, 'Target Pairs Portfolio');
```

If we call `calculate_optimal_portfolio` using an unconstrained `TargetPortfolio` objective, we'll get back our target. But if we supply turnover and pairs constraints, we'll move our capital incrementally out of the current pairs and into the new ones:

In [23]:

```
print opt.TargetPortfolio.__doc__
```

In [24]:

```
print opt.Pair.__doc__
```

In [25]:

```
print opt.MaxTurnover.__doc__
```

In [26]:

```
def migrate_pairs_constrain_turnover(current_pf, target_pf, max_turnover):
    """Calculate a new target portfolio while respecting turnover and pair constraints."""
    objective = opt.TargetPortfolio(target_pf)
    constraints = [opt.Pair(long, short) for long, short in pairs]
    constraints.append(opt.MaxTurnover(max_turnover))
    return opt.calculate_optimal_portfolio(
        objective=objective,
        constraints=constraints,
        current_portfolio=current_pf,
    )
```

In [27]:

```
# Try changing the `MAX_TURNOVER` parameter here to see how it affects
# the resulting portfolio!
MAX_TURNOVER = 0.75
migrated_pairs_portfolio = migrate_pairs_constrain_turnover(
    current_pf=initial_pairs_portfolio,
    target_pf=target_pairs_portfolio,
    max_turnover=MAX_TURNOVER,
)
```

In [28]:

```
before_and_after_frame = pd.DataFrame(
    {
        'Initial Weight': initial_pairs_portfolio,
        'Migrated Weight': migrated_pairs_portfolio,
        'Target Weight': target_pairs_portfolio,
    },
    # Explicitly set a column order so that the plot below is
    # created in the order we want.
    columns=['Initial Weight', 'Migrated Weight', 'Target Weight']
)
ax = draw_asset_barplot(
    before_and_after_frame,
    # The fancy brace syntax here is a format specification.
    # See https://docs.python.org/2/library/string.html#formatspec for more info.
    'Optimal Portfolio: {0:.0f}% Max Turnover '.format(MAX_TURNOVER * 100),
);
```

When the optimizer can't reach the target portfolio because of the `MaxTurnover` constraint, it attempts to equalize the size of the difference between each stock's output weight and its target weight.
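One way to see where that equalizing behavior comes from: if the turnover limit is the only binding constraint, the closest point to the target in the $L^2$ sense is obtained by shrinking every gap between current and target weight by a common threshold (soft thresholding). The sketch below is a pure-Python illustration under that assumption, using plain ticker strings and the weights from the cells above. The function name `step_toward_target` is hypothetical, and this is not necessarily how the optimizer computes its answer.

```python
def step_toward_target(current, target, max_turnover):
    """Closest point (in L2) to `target` within `max_turnover` L1 distance
    of `current`. Assumes both dicts share the same keys and ignores all
    other constraints.
    """
    if max_turnover <= 0:
        return dict(current)
    gaps = {name: target[name] - current[name] for name in target}
    if sum(abs(g) for g in gaps.values()) <= max_turnover:
        return dict(target)  # The full move already fits in the budget.
    # Find the threshold that leaves exactly max_turnover of total movement
    # once every gap has been shrunk by it.
    sizes = sorted((abs(g) for g in gaps.values()), reverse=True)
    running, threshold = 0.0, 0.0
    for count, size in enumerate(sizes, 1):
        running += size
        candidate = (running - max_turnover) / count
        if size > candidate:
            threshold = candidate
    result = {}
    for name, gap in gaps.items():
        move = max(abs(gap) - threshold, 0.0)
        result[name] = current[name] + (move if gap > 0 else -move)
    return result


current = {'AAPL': 0.25, 'MSFT': -0.25, 'GM': 0.25, 'F': -0.25,
           'MCD': 0.0, 'QSR': 0.0, 'BP': 0.0, 'XOM': 0.0}
target = {'AAPL': 0.16666, 'MSFT': -0.16666, 'GM': 0.0, 'F': 0.0,
          'MCD': 0.16666, 'QSR': -0.16666, 'BP': 0.16666, 'XOM': -0.16666}

migrated = step_toward_target(current, target, max_turnover=0.75)
turnover = sum(abs(migrated[n] - current[n]) for n in current)
residuals = {n: abs(migrated[n] - target[n]) for n in target}
```

Moving all the way to the target would take about 1.33 of turnover. With a budget of 0.75, every stock in this sketch ends up roughly 0.0729 away from its target weight, and each pair still nets to zero because the gaps within a pair are mirror images of each other.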

In [29]:

```
print "Total Turnover: %f\n" % (migrated_pairs_portfolio - initial_pairs_portfolio).abs().sum()
print "Absolute Difference from Target Weight:\n"
print (migrated_pairs_portfolio - target_pairs_portfolio).abs()
```

This notebook provided an overview of the APIs currently available in **quantopian.experimental.optimize**. There's still a lot of work to do and many questions to answer before we can move beyond experimental status:

- The `optimize` module is currently only available in the research environment, not in algorithms. What changes are necessary to make `optimize` available for algorithms? In particular, are there additional APIs that would be valuable just for algorithms? For example, the need to pass in your current portfolio is unnecessary in the context of an algorithm, since the algorithm executor already knows what the current portfolio is. The process of ordering based on the results of an optimization could also be streamlined via new built-ins like an `order_target_portfolio` method.
- In the example above, we had to "manually" build a sector neutral constraint by using Pipeline to query for sector codes, formatting the results, and passing them to `NetPartitionExposure`. Common applications like this could be simplified with "data-aware" constraints that could fetch data at optimization time. A built-in `SectorNeutral` constraint, for example, might internally use Pipeline machinery to fetch Morningstar sector codes without requiring any additional input from users. Are there other examples of built-in constraints that could benefit from such a treatment? How should data-aware constraints work in notebooks where there isn't an implicit simulation date?
- How can the existing Constraints and Objectives be simplified or improved?
- What new Constraints should be implemented?
- What new Objectives should be implemented?

With help from the Quantopian community, we hope to answer these questions in the coming weeks and months. I'm excited about the possibilities that new APIs create for algorithm authors, and I look forward to hearing feedback and seeing what the community builds with these tools.