
Quantpedia Series: Reversal during Earnings Announcements

By Nathan Wolfe

This research is published in partnership with Quantpedia, an online resource for discovering new trading ideas.

You can view the full Quantpedia series in the library along with other research and strategies.



Before proceeding: run the setup cell that imports the helper functions used throughout this notebook.


Whitepaper authors: Eric C. So (ESo@mit.edu), Sean Wang

Whitepaper source: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2275982

Abstract

From Quantpedia:

> In general, reversal in price of an asset occurs due to investors' overreaction to asset-related news and the subsequent price correction. In this case, the most probable reason for the phenomenon, according to the authors, is the market makers' aversion to inventory risks that tend to increase dramatically in the pre-announcement period. Consequently, the market makers demand higher compensation for providing liquidity due to higher risk and therefore raise prices, which are expected to reverse after the earnings announcement.

As the paper does, I find evidence of returns reversal during earnings announcements; while the paper tested using data from 1996 to 2011, I used data from 2007 to 2016. The average reversal among all stocks in my data is 0.449%, compared to a result of 1.448% in the paper. I found that we can reasonably increase the reversal to 0.6% by selecting firms based on a minimum average dollar volume percentile, or based on a minimum market cap.

Introduction

>Several market frictions have the potential to significantly impact the efficiency and information content of market prices. This study focuses on the friction that arises from the need to locate a counterparty in order to complete a trade. Market makers typically mitigate this friction by matching would-be sellers with would-be buyers. When there is an imbalance between the quantities sought by buyers and sellers at a given price, market makers may absorb the order imbalance into their own account by serving as the trade counterparty. This practice is commonly known as liquidity provision.

Upon anticipation of a good (bad) earnings report, investors want to go long (short) in the stock; they often don't consider whether the stock is currently over- or under-priced, simply assuming that it will get a positive (negative) bump upon the release of the earnings report. Meanwhile, market makers see an imbalance in demand for long (short) positions; they thus raise (lower) the price to compensate for providing the liquidity necessary to fill the imbalance. Investors usually don't care about this price change, assuming the bump will come from the report.

After the earnings announcement, the demand imbalance should disappear, and with it the market makers' need for the price adjustment. The market makers thus reverse the recent price change, which produces a short-term reversal.

Our strategy will be to assume that new information provided by the release of the earnings report will be neutral on average. All that remains is to take advantage of the market makers' adjustment of the stock price, by taking an opposite position and waiting for them to reverse the change.

>We show that a long (short) position in firms whose returns strongly underperform (outperform) the market in the three days prior to earnings announcements yields an average return of 145 basis points (bps) during the announcement window. By comparison, the average return to a comparable portfolio during non-announcement periods is 22 bps, indicating that return reversals increase more than six-fold during earnings announcements.

While I didn't compare to non-announcement conditions, I did find clear low/high reversals for the best-/worst-performing securities prior to earnings. I also examined how various universe selection conditions affect the returns of this strategy; while the authors selected based on market cap, I explored using average dollar volume as well as our Q500US and Q1500US liquid stock universes, finding that a minimum average dollar volume filter and the Q500US universe both performed well as universe selectors.

Table of Contents

You can navigate the rest of this notebook as follows:

  1. Methodology and Empirical Tests
  2. Robustness Tests
  3. Strategy Creation
  4. Conclusion
In [3]:
# Basic Variables
END = pd.Timestamp.now('US/Eastern').replace(tzinfo=utc) - pd.Timedelta('7d')

# For the free set, comment the above line and uncomment the below:
#END = pd.Timestamp.now('US/Eastern').replace(tzinfo=utc) - pd.Timedelta('731d')

START = pd.Timestamp('2007-01-01', tz=utc)
BENCHMARK = symbols('SPY')
RETURNS_QUANTILES = 5
PRICE_WINDOW = 8
MAX_DAYS_TO_HOLD = 20
PRICE_DAY_OFFSETS = range(-1, MAX_DAYS_TO_HOLD + 1)

# plot colors
color = '#0062AE'


1. Methodology and Empirical Tests

The methodology behind the study is based on the idea that a stock's short-term returns will reverse during an earnings announcement as an imbalance in demand dissipates. The data used in this Research Notebook is sourced from EventVestor's Earnings Calendar Dataset. The sample version covers 2007 up to two years ago, while the premium version extends to the present day.

Data Aggregation & Sample Selection

Below, a Pipeline pulls all earnings announcements whose dates were known in advance, along with each company's average dollar volume, market cap, sector, and short-term returns as of the day before the announcement.

I eliminate firms with prices below $5 to mitigate the influence of bid-ask bounce on our calculation of return reversals as noted in the paper. Our data starts on 2007-01-01 and spans up through the present day (2 years ago for the free set).

Definition: "PAR" stands for "Pre-Announcement Returns," referring to the stocks' short-term returns prior to their earnings announcements.

In [4]:
# Basic universe filters
tradable_filter = IsPrimaryShare() & mstar.valuation.market_cap.latest.notnull()
days_filter = BusinessDaysUntilNextEarnings().eq(1)
price_filter = USEquityPricing.close.latest > 5

# Factors for returns and liquidity
adv = AverageDollarVolume(mask=USEquityPricing.volume.latest > 0, window_length=30)
vlr = variable_lookback_returns(10, mask=adv.notnan())
market_cap = mstar.valuation.market_cap.latest

# Pipeline itself to calculate PAR
pipe = Pipeline(columns={'adv_percentile': adv.quantiles(100),
                         'sector': mstar.asset_classification.morningstar_sector_code.latest,
                         'q500us': Q500US(),
                         'q1500us': Q1500US(),
                         'market_cap': market_cap,
                         'market_cap_percentile': market_cap.quantiles(100),
                         # -3 to -2
                         'par_2': vlr.par_2.quantiles(RETURNS_QUANTILES),
                         # Used by the authors in original study: -4 to -2 
                         'par_3': vlr.par_3.quantiles(RETURNS_QUANTILES),
                         # -5 to -2
                         'par_4': vlr.par_4.quantiles(RETURNS_QUANTILES),
                         # -6 to -2
                         'par_5': vlr.par_5.quantiles(RETURNS_QUANTILES),
                         # -7 to -2
                         'par_6': vlr.par_6.quantiles(RETURNS_QUANTILES),
                         # -8 to -2
                         'par_7': vlr.par_7.quantiles(RETURNS_QUANTILES),
                         # -9 to -2
                         'par_8': vlr.par_8.quantiles(RETURNS_QUANTILES),
                         # -10 to -2
                         'par_9': vlr.par_9.quantiles(RETURNS_QUANTILES),
                         # -11 to -2
                         'par_10': vlr.par_10.quantiles(RETURNS_QUANTILES),
                         },
                screen=days_filter & tradable_filter & price_filter & (vlr.par_10.notnan()))

data = split_run_pipeline(pipe, START, END, 16)

# Get before/after pricing data for all announcement events using `get_pricing`.
cal = tradingcalendar.get_trading_days(START - pd.Timedelta('20d'),
                                       END + pd.Timedelta(days=MAX_DAYS_TO_HOLD * 2))
price_data = data.groupby(level=0).apply(fill_prices)
price_data_matched = price_data.reindex(data.index)
print 'Done fetching data.'
Done fetching data.
In [5]:
print "This sample consists of %s earnings announcements spanning %s to %s." % (len(data),
                                                                                START.year, END.year)
This sample consists of 83005 earnings announcements spanning 2007 to 2016.

Sample Statistics

> The extreme quintiles of PAR (i.e., quintiles Q5 and Q1) consist of firms that are generally smaller, possess lower book-to-market ratios and share prices, and have higher volatility and relative spreads.

Similar to the original whitepaper, I find that the lowest and highest quantiles are generally lower in market cap.

In [6]:
raw_data = data.reset_index()

raw_data.groupby("par_3").mean()[['market_cap', 'adv_percentile', 'market_cap_percentile']].iloc[1:]
Out[6]:
market_cap adv_percentile market_cap_percentile
par_3
0 4.417921e+09 67.095305 58.029011
1 7.338278e+09 71.215104 64.950549
2 8.365436e+09 72.470233 66.930130
3 7.918619e+09 72.070200 66.201380
4 5.741773e+09 68.866222 60.510481


Empirical Tests

This section documents the reversal effect during earnings by examining the spreads between the worst and best performing stocks by PAR returns quantile (0 vs 4 respectively). I find a final spread of 0.449% compared to the authors' final value of 1.448%.

The expected return is the average return on the first (lowest-PAR) quantile minus the average return on the last (highest-PAR) quantile. I will call this difference the "spread," and will use "spread" and "returns" interchangeably.
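
For clarity, here is a minimal sketch of what the imported calc_spread helper presumably computes, expressed in terms of the `returns` helper used in the next cell (the actual implementation lives in the hidden imports cell and may differ):

def calc_spread_sketch(par_lookback, days_held, loc=True):
    # Mean t-1 to t-1+days_held return of the lowest PAR quantile minus
    # that of the highest, optionally restricted by the boolean mask `loc`.
    col = 'par_%d' % par_lookback
    low = returns(days_held, loc & (data[col] == 0)).mean()
    high = returns(days_held, loc & (data[col] == RETURNS_QUANTILES - 1)).mean()
    return low - high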

Return spreads by PAR quantile

Below is a comparison of average t-1 to t+1 returns when all equities are divided into quantiles based on their returns over the 3 days prior (t-4 to t-2), as suggested by the paper. It looks like we do have some spread between the first and last quantiles.

In [7]:
baskets = np.zeros(RETURNS_QUANTILES)
for i in range(RETURNS_QUANTILES):
    baskets[i] = returns(2, data['par_3'] == i).mean()
plt.bar(range(RETURNS_QUANTILES), baskets, color=color, alpha=.8)
plt.xlabel('t-4 to t-2 returns quantile')
plt.ylabel('average t-1 to t+1 returns')
plt.title('Average returns during earnings announcement by PAR quantile')
print 'Spread: %f' % calc_spread(3, 2)
Spread: 0.004516

The strategy that the paper suggests is to go long in the first quantile and short in the last quantile for a market-neutral strategy. Given the spread here, it looks like that approach may be fruitful. Compare our spread of 0.449% with the paper's final value of 1.448%. Perhaps the alpha has decayed since 2011, but there are still optimizations which might improve our returns.

The main point here is that the firms with the highest short-term returns prior to the announcement have the lowest short-term returns following their announcement, and vice-versa. This is illustrated by the following plot; note how the return curves seem to bounce back against the trend they had just before the announcement.

In [8]:
means = pd.DataFrame(columns=range(RETURNS_QUANTILES), dtype=float)

for q in means.columns:
    subset = price_data_matched.loc[data['par_3'] == q]
    means[q] = (subset.transpose() / subset[0]).loc[-1:].mean(axis=1)

means.plot()
plt.xlabel('days since t-1')
plt.ylabel('mean price, normalized at t-1')
plt.title('Average cumulative returns over time by PAR quantile');


2. Robustness Tests

So far, my research supports the existence of reversals in the best- and worst-performing stocks prior to an earnings announcement. This section is dedicated to examining the consistency of these results along a number of dimensions: year, liquidity, sector, and liquidity + sector.

In [9]:
years = range(START.year, END.year + 1)
baskets = pd.Series(index=years)
for y in years:
    baskets[y] = calc_spread(5, 2, data.index.get_level_values(0).year == y)
plt.bar(years, baskets, alpha=.9, color=color)
plt.xlabel('year')
plt.ylabel('average spread')
plt.title('Returns by year');

The consistency is variable, as you'd expect from an event-based strategy. Additionally, as noted in the main study section, there appears to be some alpha decay in the returns over time. Consistency is quantified by dividing the mean spread (returns) by the standard deviation of the yearly spreads, as above, which gives a primitive Sharpe ratio.

Consistency by Liquidity

In [10]:
years = range(START.year, END.year + 1)
adv_sharpes = pd.Series(index=range(100))
for adv in adv_sharpes.index:
    baskets = pd.Series(index=years)
    for y in years:
        try:
            baskets[y] = calc_spread(5, 2, (data['adv_percentile'] >= adv)
                                     & (data.index.get_level_values(0).year == y))
        except KeyError:
            baskets[y] = np.nan
    adv_sharpes[adv] = calc_spread(5, 2, data['adv_percentile'] >= adv) / baskets.std()
adv_sharpes.head()
Out[10]:
0    0.592761
1    0.592761
2    0.592718
3    0.590417
4    0.591118
dtype: float64

Plotting the values below, it seems that low ADV floors give higher consistency (possibly because more events are available), but consistent returns also increase for high ADV floors because the strategy performs better on high-ADV stocks.

In [11]:
plt.bar(range(100), adv_sharpes, color=color)
plt.xlabel('minimum ADV percentile')
plt.ylabel('"Sharpe" ratio')
plt.title('Consistent returns by ADV percentile floor');

Consistency by Sector

In [12]:
# Event Counts by Sector
data['sector'].value_counts()
Out[12]:
 310    13047
 311    12994
 102    12333
 103    12264
 206     9258
 309     5608
 104     4987
 101     4448
 205     3889
 207     2719
 308     1299
-1        159
dtype: int64

"-1" for the sector means uncategorized. Since there are so few such events, we'll exclude them from this analysis.

In [13]:
SECTORS = [101, 102, 103, 104, 205, 206, 207, 308, 309, 310, 311]
SECTOR_NAMES = pd.Series({
 101: 'Basic Materials',
 102: 'Consumer Cyclical',
 103: 'Financial Services',
 104: 'Real Estate',
 205: 'Consumer Defensive',
 206: 'Healthcare',
 207: 'Utilities',
 308: 'Communication Services',
 309: 'Energy',
 310: 'Industrials',
 311: 'Technology' ,
})
baskets = pd.Series(index=range(len(SECTORS)))
for i in range(len(SECTORS)):
    baskets[i] = calc_spread(5, 3, data['sector'] == SECTORS[i])
plt.bar(baskets.index, baskets, color=color)
plt.xticks(range(len(SECTORS)), SECTOR_NAMES, rotation=25)
plt.title('Average returns by sector');

The strategy doesn't look too reliant on any particular sector. Communication Services is a little high, but that sector also has relatively few events.

Consistency by Sector & Liquidity

Here I make a final robustness test by examining how the strategy does by sector and also by liquidity filter. I use both ADV percentile thresholds and the Q1500US and Q500US liquid universes.

In [14]:
liquidity_names = ['all', 'adv_50', 'adv_70', 'adv_90', 'q1500us', 'q500us']
liquidity_locs = {
    'all': True,
    'adv_50': data['adv_percentile'] >= 50,
    'adv_70': data['adv_percentile'] >= 70,
    'adv_90': data['adv_percentile'] >= 90,
    'q1500us': data['q1500us'],
    'q500us': data['q500us'],
}
spread_matrix = pd.DataFrame(columns=liquidity_names, index=SECTOR_NAMES, dtype=float)
freq_matrix = pd.DataFrame(columns=liquidity_names, index=SECTOR_NAMES, dtype=int)
for liquidity_filter in liquidity_names:
    for sector in SECTORS:
        loc = liquidity_locs[liquidity_filter] & (data['sector'] == sector)
        spread_matrix.loc[SECTOR_NAMES[sector], liquidity_filter] = calc_spread(5, 3, loc)
        freq_matrix.loc[SECTOR_NAMES[sector], liquidity_filter] = len(data[loc])
freq_matrix.loc['all'] = freq_matrix.sum()
spread_matrix.loc['all'] = (spread_matrix * freq_matrix).sum() / freq_matrix.loc['all']
ax = sns.heatmap(spread_matrix, annot=True)
ax.set(xlabel='liquidity filter', ylabel='sector', title='Returns by sector and liquidity filter')
print 'Event frequency:'
freq_matrix
Event frequency:
Out[14]:
all adv_50 adv_70 adv_90 q1500us q500us
Basic Materials 4448 3904 2652 978 2493 910
Consumer Cyclical 12333 10302 7461 2628 7189 2503
Financial Services 12264 9004 5879 2081 5244 1858
Real Estate 4987 4445 3243 760 2949 659
Consumer Defensive 3889 3172 2382 1299 2283 1260
Healthcare 9258 7714 5217 1723 4807 1593
Utilities 2719 2526 2071 708 1920 642
Communication Services 1299 1115 739 396 702 365
Energy 5608 5150 4073 1829 3299 1652
Industrials 13047 10940 7142 2007 6789 1864
Technology 12994 10615 7069 2517 6566 2339
all 82846 68887 47928 16926 44241 15645

The liquidity filters that perform best are ADV percentile >= 90 and Q500US, and they look fairly consistent across sectors. Sector 308 (communications) is very profitable on average, but it also has relatively few events, so that may be pure chance.


3. Strategy Creation

With results from both the empirical and robustness tests suggesting a possibility for a viable trading strategy, I take a look at a few of the parameters and attempt to roughly optimize for returns. It's important to note here that the following optimizations have a risk of overfitting the strategy parameters.

The core framework of the strategy is provided by Quantpedia:

>The investment universe consists of stocks listed at NYSE, AMEX, and NASDAQ, whose daily price data are available at CRSP database. Earnings-announcement dates are collected from Compustat. Firstly, the investor sorts stocks into quintiles based on firm size. Then he further sorts the stocks in the top quintile (the biggest) into quintiles based on their average returns in the 3-day window between t-4 and t-2, where t is the day of earnings announcement. The investor goes long on the bottom quintile (past losers) and short on the top quintile (past winners) and holds the stocks during the 3-day window between t-1, t, and t+1. Stocks in the portfolios are weighted equally.
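
As a rough illustration of that rule (not the backtest implementation), the sketch below builds equal-weight long/short target weights from one day's cross-section of the pipeline output above; the 80th-percentile market-cap cutoff and column names are assumptions based on the columns defined earlier.

def target_weights_sketch(day_slice, par_col='par_3'):
    # Restrict to the largest size quintile, then go long the lowest
    # PAR quintile and short the highest, equally weighted on each side.
    big = day_slice[day_slice['market_cap_percentile'] >= 80]
    longs = big.index[big[par_col] == 0]
    shorts = big.index[big[par_col] == RETURNS_QUANTILES - 1]
    weights = {}
    for asset in longs:
        weights[asset] = 0.5 / max(len(longs), 1)
    for asset in shorts:
        weights[asset] = -0.5 / max(len(shorts), 1)
    return weights

Per the quoted rule, these positions would be opened at t-1 and closed after t+1, rebalancing on each announcement date.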

Optimal PAR lookback window

The original paper uses a lookback window of 3 days (t-4 to t-2) and holding for 2 days (t-1 to t+1), but I explore a few other parameters as possible optimizations.

In [15]:
ax = sns.heatmap(create_spread_matrix(), annot=True)
ax.set(title='Cumulative returns by PAR lookback and days since t-1',
       xlabel='PAR lookback',
       ylabel='days since t-1');

The above heatmap suggests that a PAR lookback window of 5 days is strong compared to other values. For the rest of this section, I'll stick to using the PAR lookback of 5 for the best results.

The heatmap above also suggests there may be some returns to be squeezed out by holding for longer periods of time, well beyond the 2 days the paper recommends. The heatmap below shows the marginal return contributed by each additional day when holding for a longer period.

Optimal holding period

In [16]:
ax = sns.heatmap(create_spread_matrix(lambda n: n - 1), annot=True)
ax.set(title='Marginal returns by PAR lookback and days since t-1',
       xlabel='PAR lookback',
       ylabel='days since t-1');

Marginal returns are clearly strong for the first 2 days that the stocks are held; marginal returns afterward are small but positive on average. This suggests that a good strategy should prioritize holding stocks for the crucial first 2 days, but could hold on longer for mild additional returns if it has excess cash.

Market Cap or ADV?

The paper specifies that the strategy performs better when restricted to firms with high market caps. However, average dollar volume (ADV) is also a commonly used proxy for liquidity, and the choice between market cap and ADV constraints can produce significantly different results for a trading strategy.

In [17]:
baskets = np.zeros(100)
for i in range(100):
    baskets[i] = calc_spread(5, 2, data['market_cap_percentile'] >= i)
plt.bar(range(100), baskets, color=color)
plt.xlabel('minimum market cap percentile')
plt.ylabel('t-1 to t+1 returns spread')
plt.title('Returns increase as market cap floor increases')
plt.ylim([0, 0.012]);

It seems the paper was accurate: this strategy works significantly better with high-market-cap stocks.

But another important factor in the feasibility of the strategy is trading in liquid stocks, and average dollar volume can also serve as a proxy for firm size. The same plot as above is shown below, by ADV percentile rather than market cap percentile.

In [18]:
baskets = np.zeros(100)
for i in range(100):
    baskets[i] = calc_spread(5, 2, data['adv_percentile'] >= i)
plt.bar(range(100), baskets, color=color)
plt.xlabel('minimum ADV percentile')
plt.ylabel('t-1 to t+1 returns spread')
plt.title('Returns increase as ADV floor increases')
plt.ylim([0, 0.012]);

As you can see, the spread (and thus the long-short strategy's returns) increases as we raise the minimum ADV. But this also decreases the number of events we have to work with, so there's a trade-off.

Cumulative returns with ADV & PAR lookback parameters

For the rest of our study we'll set a minimum ADV threshold of the 95th percentile to improve our results. Let's try charting the earnings announcement reversal with our optimized PAR lookback of 5 days and this ADV filter.

In [ ]:
means = pd.DataFrame(columns=range(RETURNS_QUANTILES), dtype=float)

for q in means.columns:
    subset = price_data_matched.loc[(data['par_5'] == q) & (data['adv_percentile'] >= 95)]
    means[q] = (subset.transpose() / subset[0]).loc[-1:].mean(axis=1)

means.plot()
plt.xlabel('days since t-1')
plt.ylabel('mean price, normalized at t-1')
plt.title('Average cumulative returns over time by PAR quantile, ADV percentile >= 95');