Alphalens is designed to aid in the analysis of "alpha factors," data transformations that are used to predict future price movements of financial instruments. Alpha factors take the form of a single value for each asset on each day. The scale or units of these values are not necessarily important, because we evaluate an alpha factor by considering each day's factor values relative to one another.
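To make "relative to one another" concrete, here is a minimal pure-Python sketch that does not use alphalens. The tickers and values are hypothetical, and `cross_sectional_ranks` is an illustrative helper rather than a library function. It ranks one day's factor values and shows that rescaling them leaves the ranking unchanged.

```python
def cross_sectional_ranks(factor_values):
    """Rank assets by factor value within a single day (1 = lowest value)."""
    ordered = sorted(factor_values, key=factor_values.get)
    return {asset: rank for rank, asset in enumerate(ordered, start=1)}


# Hypothetical factor values for one day, in arbitrary units.
day_values = {"AAA": 0.12, "BBB": -0.03, "CCC": 0.40}

ranks = cross_sectional_ranks(day_values)

# Rescaling every value by the same positive constant keeps the ordering.
rescaled = {asset: value * 100 for asset, value in day_values.items()}
ranks_rescaled = cross_sectional_ranks(rescaled)

print(ranks)
print(ranks == ranks_rescaled)
```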

It is important to note the difference between an alpha factor and a trading algorithm. A trading algorithm uses an alpha factor, or a combination of alpha factors, to generate trades. Trading algorithms cover execution and risk constraints: the business of turning predictions into profits. Alpha factors, on the other hand, are focused solely on making predictions. This difference in scope lends itself to a difference in the methodologies used to evaluate alpha factors and trading algorithms. Alphalens does not contain analyses of things like transaction costs, capacity, or portfolio construction. Those interested in more implementation-specific analyses are encouraged to check out pyfolio (https://github.com/quantopian/pyfolio), a library specifically geared towards the evaluation of trading algorithms.

In [1]:

```
import numpy as np
import pandas as pd
from quantopian.research import run_pipeline
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.factors import CustomFactor, Returns, AverageDollarVolume
from quantopian.pipeline.classifiers.morningstar import Sector
```

In [2]:

```
universe_screen = AverageDollarVolume(window_length=20).top(500)
```

In [3]:

```
pipe = Pipeline(
    columns={
        'Momentum': Returns(window_length=252, mask=universe_screen),
        'Sector': Sector(mask=universe_screen),
    },
    screen=universe_screen
)
```

In [4]:

```
results = run_pipeline(pipe, '2015-06-30', '2016-06-30')
results = results.fillna(value=0.)
```

In [5]:

```
momentum_factor = results["Momentum"]
momentum_factor.head()
```

Out[5]:

The pricing data passed to alphalens should reflect the next available price after a factor value was observed at a given timestamp. The price must not be included in the calculation of the factor for that time. Always double check to ensure you are not introducing lookahead bias to your study.

In our example, before trading starts on 2014-12-2, we observe the factor value from the previous day, 2014-12-1. The price we should pass to alphalens is the next available price after that factor observation: the open price on 2014-12-2.
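The alignment rule can be sketched in plain Python, under the assumption that each price carries the timestamp at which it became available. The helper `next_available_price`, the 09:30 and 16:00 timestamps, and the prices are all hypothetical and are not part of alphalens. The function picks the first price strictly after the factor observation, so the price cannot have fed into the factor.

```python
from bisect import bisect_right
from datetime import datetime


def next_available_price(observed_at, price_times, prices):
    """Return (timestamp, price) for the first price strictly after observed_at.

    price_times must be sorted in ascending order and aligned with prices.
    """
    i = bisect_right(price_times, observed_at)
    if i == len(price_times):
        raise ValueError("no price available after the factor observation")
    return price_times[i], prices[i]


# Hypothetical open prices, stamped at an assumed 09:30 market open.
price_times = [
    datetime(2014, 12, 1, 9, 30),
    datetime(2014, 12, 2, 9, 30),
    datetime(2014, 12, 3, 9, 30),
]
prices = [100.0, 101.0, 99.5]

# Factor value assumed to be computed from data through the 2014-12-1 close.
observed_at = datetime(2014, 12, 1, 16, 0)

entry_time, entry_price = next_available_price(observed_at, price_times, prices)
print(entry_time, entry_price)
```

Using a strict "after" comparison is the design choice that guards against lookahead: a price stamped at exactly the observation time is skipped.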

In [6]:

```
assets = results.index.levels[1].unique()
# We need to get a little more pricing data than the
# length of our factor so we can compare forward returns.
# We'll tack on another month in this example.
pricing = get_pricing(assets, start_date='2015-06-30', end_date='2016-07-31', fields='open_price')
```

In [7]:

```
pricing.head()
```

Out[7]:

Often, we'd want to know how our factor looks across various sectors. To generate sector level breakdowns, you'll need to pass alphalens a sector mapping for each traded name.

This mapping can come in the form of a MultiIndexed Series (with the same date/symbol index as your factor value) if you want to provide a sector mapping for each symbol on each day.

If you'd like to use constant sector mappings, you may pass symbol-to-sector mappings as a dict.

If your sector mappings come in the form of codes (as they do in this tutorial), you may also pass alphalens a dict of sector names to use in place of sector codes.
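As a rough illustration of the dict-based options, the sketch below builds a constant symbol-to-sector-code mapping and translates the codes into names. The tickers, the `label_sectors` helper, and the fallback to `'Misc'` for unknown codes are assumptions made for this example. They do not describe how alphalens handles labels internally.

```python
# Hypothetical excerpt of a code-to-name dict.
SECTOR_NAMES = {
    -1: "Misc",
    101: "Basic Materials",
    311: "Technology",
}

# Hypothetical constant mapping from symbol to sector code.
sector_codes = {"AAA": 311, "BBB": 101, "CCC": 999}


def label_sectors(codes, names, default="Misc"):
    """Translate each symbol's sector code to a name, using default if unknown."""
    return {symbol: names.get(code, default) for symbol, code in codes.items()}


sector_labels = label_sectors(sector_codes, SECTOR_NAMES)
print(sector_labels)
```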

In [8]:

```
MORNINGSTAR_SECTOR_CODES = {
    -1: 'Misc',
    101: 'Basic Materials',
    102: 'Consumer Cyclical',
    103: 'Financial Services',
    104: 'Real Estate',
    205: 'Consumer Defensive',
    206: 'Healthcare',
    207: 'Utilities',
    308: 'Communication Services',
    309: 'Energy',
    310: 'Industrials',
    311: 'Technology',
}
```

In [9]:

```
sectors = results["Sector"]
```

In [10]:

```
import alphalens
```

Alphalens contains a handy data formatting function to transform your factor and pricing data into the exact inputs expected by the rest of the plotting and performance functions. This `get_clean_factor_and_forward_returns` function is the first call in `create_factor_tear_sheet`.

In [13]:

```
factor, forward_returns = alphalens.utils.get_clean_factor_and_forward_returns(
    momentum_factor,
    pricing,
    groupby=sectors,
    groupby_labels=MORNINGSTAR_SECTOR_CODES,
    periods=(1, 5, 10)
)
```

Let's see what that gave us...

In [14]:

```
factor.head()
```

Out[14]:

In [15]:

```
forward_returns.head()
```

Out[15]:

You'll notice that our factor doesn't look much different. The only addition here is an index level describing the sector of each name. That will come in handy as we perform sector level reductions in our performance and plotting functions.

The `forward_returns` DataFrame represents the mean daily price change over the N days after a timestamp. The 1 day forward return for AAPL on 2014-12-2 is the percent change between the AAPL open price on 2014-12-2 and the AAPL open price on 2014-12-3. The 5 day forward return is the percent change from the open on 2014-12-2 to the open on 2014-12-9 (5 trading days), divided by 5.
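The arithmetic can be checked with a small pure-Python sketch. The open prices are invented, and `forward_return` is an illustrative helper that follows the definition in the paragraph above (percent change over N trading days, divided by N). It is not the alphalens implementation.

```python
def forward_return(opens, start, periods):
    """Mean daily forward return: percent change over `periods` days, divided by `periods`."""
    total_change = opens[start + periods] / opens[start] - 1.0
    return total_change / periods


# Hypothetical daily open prices for one asset, one entry per trading day.
opens = [100.0, 102.0, 101.0, 103.0, 104.0, 105.0]

one_day = forward_return(opens, start=0, periods=1)   # (102 / 100 - 1) / 1
five_day = forward_return(opens, start=0, periods=5)  # (105 / 100 - 1) / 5

print(round(one_day, 4), round(five_day, 4))
```

Dividing by the period length puts the 1, 5, and 10 day columns on the same per-day scale, which makes them easier to compare side by side.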

In [16]:

```
quantized_factor = alphalens.performance.quantize_factor(factor)
```

In [17]:

```
quantized_factor.head()
```

Out[17]:

In [21]:

```
mean_return_by_q_daily, std_err = alphalens.performance.mean_return_by_quantile(
    quantized_factor,
    forward_returns,
    by_group=False,
    by_date='D'
)
```

In [22]:

```
mean_return_by_q_daily.head()
```

Out[22]:

In [24]:

```
mean_return_by_q, std_err_by_q = alphalens.performance.mean_return_by_quantile(
    quantized_factor,
    forward_returns,
    by_group=False
)
```

In [25]:

```
mean_return_by_q.head()
```

Out[25]:

In [26]:

```
alphalens.plotting.plot_quantile_returns_bar(mean_return_by_q);
```

In [27]:

```
alphalens.plotting.plot_quantile_returns_violin(mean_return_by_q_daily);
```

In [28]:

```
quant_return_spread, std_err_spread = alphalens.performance.compute_mean_returns_spread(mean_return_by_q_daily, 5, 1, std_err)
```

In [49]:

```
try:
    alphalens.plotting.plot_mean_quantile_returns_spread_time_series(quant_return_spread, std_err_spread, ax=None);
except Exception:
    pass
```

In [50]:

```
alphalens.plotting.plot_cumulative_returns_by_quantile(mean_return_by_q_daily);
```

In [51]:

```
ls_factor_returns = alphalens.performance.factor_returns(factor, forward_returns)
```

In [52]:

```
ls_factor_returns.head()
```

Out[52]:

In [53]:

```
alphalens.plotting.plot_cumulative_returns(ls_factor_returns[1]);
```

In [58]:

```
alpha_beta = alphalens.performance.factor_alpha_beta(
    factor,
    forward_returns,
    factor_returns=ls_factor_returns
)
```

In [59]:

```
alpha_beta
```

Out[59]:

Information Analysis is a way for us to evaluate the predictive value of a factor without the confounding effects of transaction costs. The main way we look at this is through the Information Coefficient (IC).

To learn more about the Information Coefficient and Spearman Rank Correlation check out the Spearman Rank Correlation lecture from the Quantopian Lecture Series.
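For intuition, here is a minimal pure-Python sketch of a one-day IC. It assumes the IC is taken as the Spearman rank correlation between that day's factor values and the subsequent forward returns. The data are invented, and the helper functions are illustrative, not alphalens APIs.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_ic(factor_values, forward_returns):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(factor_values), average_ranks(forward_returns))


# Hypothetical one-day cross-section: four assets.
factor_values = [0.5, 0.1, 0.9, 0.3]
forward_returns = [0.02, -0.01, 0.03, 0.00]

ic = spearman_ic(factor_values, forward_returns)
print(ic)
```

Because only ranks enter the calculation, the result is insensitive to the scale of the factor and to outliers in returns.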

In [60]:

```
ic = alphalens.performance.factor_information_coefficient(factor, forward_returns)
```

In [61]:

```
ic.head()
```

Out[61]:

In [65]:

```
alphalens.plotting.plot_ic_ts(ic);
```