better-market-betas

Simpler Better Market Betas

Overview Information:: https://www.ivo-welch.info/research/betas/
Sample Use:: $ make ; ./mkdistributable < crspsample.csv

Directory Contents

File Name	File Size	Date
Makefile	200 B	2022-Feb-12 02:15
mkdistributable-post.R	424 B	2022-Feb-12 21:34
snippet.R	602 B	2020-Jul-09 19:45
csv2sasstata.R	859 B	2020-Aug-18 20:53
mkdistributable-post.Rout	1.1 KiB	2022-Feb-12 21:34
mkblock.R	1.2 KiB	2020-Jul-09 03:04
wuni.hh	1.2 KiB	2020-Jul-09 22:44
fregr.hh	1.4 KiB	2022-Feb-12 01:17
winsrel-fast.R	1.5 KiB	2020-Jul-09 19:43
mksd-by-permno.R	2.9 KiB	2020-Jul-09 22:44
mkblock.Rout	4.1 KiB	2020-Jul-09 03:07
mksigma.cc	4.6 KiB	2020-Jul-09 22:44
fregr.cc	4.9 KiB	2020-May-16 19:11
mkdistributable.cc	8.2 KiB	2022-Feb-12 18:13
mkdistributable	51.0 KiB	2022-Feb-12 21:30
crspsample.csv	60.9 KiB	2022-Feb-12 01:11
betas-2020.csv	106.7 KiB	2022-Feb-12 21:34
betas-2021.csv	124.5 KiB	2022-Feb-12 21:34
betas.csv.gz	27.3 MiB	2022-Feb-12 21:31

check-tesla/: Sample calculations for Tesla to 2014-12-31.
code/: Generating Code

Market-Beta

The market-beta estimates are the result of a years-long academic study. These bswa32 market-beta estimates are known to be far better than those from Bloomberg-Merrill-Lynch (Capital IQ or Yahoo-Finance or Google-Finance), Vasicek, Dimson, industry, or any other market-beta estimate when it comes to forecasting future OLS betas (over the next 1 to 12 months, and beyond). Note that, regardless of the econometric estimator, it is this future, not-yet-realized OLS beta that most investors care about, because it measures the to-be-realized hedge against market-factor risk. (The lagged OLS beta is not as good a predictor of its own future self as the bswa32 estimator.)

To accomplish its performance, the bswa market-beta estimator does three things:

it uses daily stock returns, not monthly stock returns as inputs;
it ages past returns in a smooth exponential fashion; and
it removes outliers in a novel (slope-winsorized) manner that avoids biases.
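The three steps above can be sketched in a few lines. This is only a minimal illustration, not the distributed implementation (which is in mkdistributable.cc). The function name `bswa_sketch`, the band of 1 ± `delta` times the market return, and the values of `delta` and `decay_per_day` are assumptions made for this sketch; none of them is stated on this page, so please check the paper and the code before relying on them.

```python
import math


def bswa_sketch(stock, market, delta=3.0, decay_per_day=2.0 / 256.0):
    """Age-weighted, slope-winsorized market-beta (illustrative sketch only).

    stock, market: daily rates of return, oldest observation first.
    delta, decay_per_day: ASSUMED illustrative parameters, not from this page.
    """
    n = len(stock)
    if n != len(market) or n < 2:
        raise ValueError("need two equally long series with at least 2 days")

    # Step 1 is the input itself: daily (not monthly) returns.

    # Step 3 (outliers): clip each stock return into the band spanned by
    # (1 - delta) * rm and (1 + delta) * rm, so the bounds scale with the market.
    clipped = []
    for ri, rm in zip(stock, market):
        a, b = (1.0 - delta) * rm, (1.0 + delta) * rm
        lo, hi = min(a, b), max(a, b)
        clipped.append(min(max(ri, lo), hi))

    # Step 2 (aging): the most recent day has weight 1, older days decay smoothly.
    weights = [math.exp(-decay_per_day * (n - 1 - d)) for d in range(n)]

    # Weighted least-squares slope (with intercept) of clipped returns on market.
    wsum = sum(weights)
    mean_m = sum(w * m for w, m in zip(weights, market)) / wsum
    mean_s = sum(w * s for w, s in zip(weights, clipped)) / wsum
    cov = sum(w * (m - mean_m) * (s - mean_s)
              for w, m, s in zip(weights, market, clipped))
    var = sum(w * (m - mean_m) ** 2 for w, m in zip(weights, market))
    if var == 0.0:
        raise ValueError("market returns do not vary")
    return cov / var
```

Calling `bswa_sketch(stock_returns, market_returns)` on two equally long lists returns one slope estimate. Because the clipping bounds are multiples of the same day's market return, a single extreme stock return cannot move the slope by more than the band allows.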

Although the inputs are daily, the files only report month-end statistics. If you need intra-month statistics, either run the code yourself, or just take a weighted average of the surrounding month-end measures.
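A weighted average of the surrounding month-end measures could look like the sketch below. Weighting by calendar-day distance is an assumption made here (the text above does not prescribe the weights), and `interpolate_month_end` is a hypothetical helper name.

```python
from datetime import date


def interpolate_month_end(day, prev_end, prev_value, next_end, next_value):
    """Time-weighted average of the two surrounding month-end estimates.

    The weights are proportional to calendar-day distance (an ASSUMED choice).
    """
    if prev_end >= next_end or not (prev_end <= day <= next_end):
        raise ValueError("day must lie between two distinct month-ends")
    w_next = (day - prev_end).days / (next_end - prev_end).days
    return (1.0 - w_next) * prev_value + w_next * next_value


# Example call with made-up estimates of 1.0 and 2.0 at the two month-ends:
mid_december = interpolate_month_end(
    date(2021, 12, 16), date(2021, 11, 30), 1.0, date(2021, 12, 31), 2.0
)
```

Note that the later month-end estimate uses returns from after the interpolated day, so the result is no longer strictly in-time. If that matters, run the code yourself.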

For more detail, please see https://ssrn.com/abstract=3371240.

Caveats

Caveat: Do not believe that better betas make the CAPM work. In the past, no (ex-ante) market-beta has reliably predicted future average returns, even though not only the CAPM but almost any sensible model suggests that it should. Recall what beta truly is: it is not a measure of expected returns, but a measure of the market-hedge provided by individual stocks.
The leap to thinking that this risk should influence expected returns makes sense, but it is a leap that is not supported by the data. Nevertheless, beta is useful for improving portfolio performance. In a portfolio-optimization context, however, it works through the second moment, not through the first moment.

Standard Deviations

The occasionally provided standard-deviation estimates, sd0111, are very good estimates of the 1-month-ahead plain standard deviation. If someone can find a simple predictor of the one-month-ahead plain standard deviation for the CRSP universe that is economically better, please let me know. (No intra-day data and/or implied-vol-based estimators, please, because such data neither cover enough securities nor are sufficiently widely available.)

Other Details

Timing:

These files provide only estimates of prevailing in-time [a] market-betas and [b] daily rate-of-return standard deviations.

In-time means the estimates are calculated with data only up to this point in time. No future data has been used.

The estimates are forecasts of the plain OLS market-betas (and plain standard deviations) 1 to 12 months ahead. That is, they are noisy estimates of the true but unknown prevailing market-betas and plain standard deviations at the end of the quoted month.

Coverage:

In 2020, there were 4,465,811 monthly market-beta observations by permno-month, ranging from 1926/07 to 2019/12, growing by about 4,500 x 12 every year. The compressed file was about 23MB. (There are fewer observations when stock identification is not by permno.) My intent is to update the data once a year.

Although the database contains market-beta estimates early on (i.e., in months with as-of-yet few daily return observations), it is advisable not to use market-betas when they are based on too few returns. A good filter is to use only months that also have standard-deviation observations in the database. The latter requires at least one year's worth of data in order not to be set missing. This helps with market-beta reliability.
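The filter described above can be sketched as follows. The column name `sd0111`, the missing-value codes, and the sample rows are assumptions for illustration only; the file format illustrated on this page does not show a standard-deviation column, so check the header of your copy first.

```python
import csv
import io

# ASSUMED codings for a missing standard deviation.
MISSING = {"", "NA", "NaN", "nan", "."}


def rows_with_sd(lines, sd_col="sd0111"):
    """Yield only the rows whose standard-deviation field is present."""
    for row in csv.DictReader(lines, skipinitialspace=True):
        sd = (row.get(sd_col) or "").strip()
        if sd not in MISSING:
            yield row


# Made-up rows: only the second one has a standard deviation.
SAMPLE = (
    "permno,yyyymmdd,bswa32,sd0111\n"
    "10001,20200131,0.912,\n"
    "10001,20210129,0.950,0.021\n"
    "10002,20210129,1.204,NA\n"
)
kept = list(rows_with_sd(io.StringIO(SAMPLE)))
```

Here `kept` holds only the permno-months whose beta is backed by at least one year of daily returns, under the assumption that a missing standard deviation is stored as an empty field or as one of the codes in `MISSING`.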

Utilized Source Inputs:

The only data source used to create the betasd-by-permno.csv.gz file is CRSP. Compustat information does not materially improve the estimates, despite some claims to the contrary in earlier papers.

Creating Programs:

The main program is mkdistributable.cc. The block sampler is in mkblock.R. Because its block-sampled (moving-window) estimates are worse, you need to run it yourself if you insist on them. The programs rely on regression code (fregr), on a basic (ideally pre-cleaned) CRSP daily database (with variable names explained in mkdistributable.cc), and on another R program that calculates standard deviations [not used at the moment].

Output:

The key output is the above gzip-compressed CSV file. On Linux and macOS, use `gunzip` to decompress it. On Windows, you need a third-party decompression program. Most of them support .gz, because it is one of the oldest formats.

The betas.csv.gz file has too many lines to fit into Excel, but it will read fine into R.

Illustrated File Format (Content):

betasd-by-permno.csv.gz:

tic,	permno,	yyyymmdd,	n,	bswa32
AAC,	14944,	20211231,	1801,	0.654
AAMD,	85390,	20211231,	6111,	1.548
AAME,	15579,	20211231,	25227,	1.187
AAN,	20062,	20211231,	279,	1.127
AAP,	89216,	20211231,	5067,	1.192
...
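For readers who do not use R, a file in the format illustrated above can also be read without decompressing it first. The sketch below uses only the Python standard library. It assumes the five columns shown above and strips the whitespace after each comma; `iter_betas` is a hypothetical helper name.

```python
import csv
import gzip


def iter_betas(path):
    """Yield one dict per row of a gzip-compressed CSV in the format above."""
    with gzip.open(path, mode="rt", newline="") as handle:
        reader = csv.reader(handle)
        # Strip the tab or space that follows each comma in the header.
        header = [name.strip() for name in next(reader)]
        for fields in reader:
            if not fields:  # skip blank lines
                continue
            row = dict(zip(header, (field.strip() for field in fields)))
            yield {
                "tic": row["tic"],
                "permno": int(row["permno"]),
                "yyyymmdd": row["yyyymmdd"],  # kept as text, e.g. "20211231"
                "n": int(row["n"]),           # third illustrated column
                "bswa32": float(row["bswa32"]),
            }
```

Because the full file has several million rows, the function yields rows one at a time instead of building a list. A call such as `for row in iter_betas("betas.csv.gz"): ...` then processes the file in a streaming fashion.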

Debug:

The check-tesla directory shows calculations for Tesla as of 2014-12 as an example. If you want to rewrite the code, please check that we agree on Tesla first. Note: Because older days are aged (downweighted), the bswa betas can use all data since inception, not just a few months or years.

Thanks:

Thanks to CRSP for providing the input data for these calculations and WRDS for making it easy to use their data.