Simpler Better Market Betas

URL:
https://www.ivo-welch.info/research/betas/
Synopsis:
In-time estimates. The best-known market-beta forecasts and very good in-time return standard deviation estimates. A longer description and download links appear below.
Variable Definitions:
permno from CRSP (or ticker or cusip or gvkey)
yyyymmdd  day of calculation
bswa32 the best beta known (slope-winsorize-parameter 3; age-parameter 2).
sd0111 a very good estimate, 0.5*sd[mo] + 0.35*sd[mo-1] + 0.15*sd[mo-11], where sd[mo] is the monthly standard deviation of daily rates of return within month mo, winsorized at 15%.
License:
Free and unencumbered. Please cite Ivo Welch, Simpler Better Market Betas, UCLA May 2020.

Directory Contents

File Name                  File Size   Date
check-tesla/               -           2020-Oct-22 23:59
code/                      -           2020-Oct-22 23:59
sas/                       -           2020-Oct-22 23:59
stata/                     -           2020-Oct-22 23:59
betasd-by-gvkey.csv.gz     24.2 MiB    2020-Jul-09 03:06
betasd-by-ncusip.csv.gz    25.1 MiB    2020-Jul-09 03:07
betasd-by-permno.csv.gz    27.8 MiB    2020-Jul-09 03:07
betasd-by-ticker.csv.gz    25.4 MiB    2020-Jul-09 03:06
  • check-tesla/: Sample calculations for Tesla to 2014-12-31.
  • code/: Generating Code
  • sas/: SAS code with SAS data
  • stata/: Stata native data

 

Market-Beta

The market-beta estimates are the result of a years-long academic study. These bswa32 market-beta estimates are known to be better than those from Bloomberg-Merrill-Lynch (Capital IQ or Yahoo-Finance or Google-Finance), Vasicek, Dimson, industry, or any other market-beta estimate when it comes to forecasting future OLS betas (over the next 1 to 12 months, and beyond). Note that regardless of econometric estimator, it is this future not-yet-known to-be-realized OLS beta that most investors care about, because it measures the realized hedge against market-factor risk. (The lagged OLS beta is not as good a predictor of its own future self as the bswa32 estimator.)

To accomplish its performance, the bswa market-beta estimator does three things:

  1. it uses daily stock returns, not monthly stock returns as inputs;
  2. it ages past returns in a smooth exponential fashion; and
  3. it removes outliers in a novel (slope-winsorized) manner that avoids biases.
Although the inputs are daily, the files only report month-end statistics. If you need intra-month statistics, either run the code yourself, or just take a weighted average of the surrounding month-end measures.
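
The weighted-average suggestion can be sketched as follows. This is a minimal illustration, not the author's code: the function name is hypothetical, and weighting by calendar days between the two month-ends is an assumption (the text says only "weighted average").

```python
from datetime import date

def interpolate_month_end(target, prev_date, prev_value, next_date, next_value):
    """Linearly interpolate a statistic between two month-end observations.

    Weights are proportional to calendar-day distance (an assumption).
    """
    span = (next_date - prev_date).days
    if span <= 0:
        raise ValueError("next_date must be after prev_date")
    if not (prev_date <= target <= next_date):
        raise ValueError("target must lie between the two month-ends")
    w_next = (target - prev_date).days / span
    return (1.0 - w_next) * prev_value + w_next * next_value

# Two bswa32 rows for permno 10000 from the illustrated file format:
# 19870130 -> 0.39 and 19870227 -> 0.35.
beta_mid_feb = interpolate_month_end(
    date(1987, 2, 13),
    date(1987, 1, 30), 0.39,
    date(1987, 2, 27), 0.35,
)
```

Note that an interpolated value uses the following month-end and is therefore no longer an in-time estimate. If in-time matters, use the preceding month-end value or run the code.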

For more detail, please see https://ssrn.com/abstract=3371240.
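
To make the three ingredients listed above concrete, here is a rough Python sketch. It is not the generating code (that is in code/), and the specifics are assumptions for illustration: weights of the form exp(-2*d/365), a winsorization band of (1 ± 3) times the market return on each day, and a plain weighted least-squares slope. The function and parameter names are hypothetical. Please check the paper and the Tesla example before relying on any of this.

```python
import math

def bswa_sketch(stock, market, ages_days, delta=3.0, rho=2.0, center=1.0):
    """Illustrative age-weighted, slope-winsorized beta (NOT the author's code).

    stock, market: daily rates of return (step 1: daily inputs).
    ages_days:     calendar-day age of each observation (0 = most recent).
    """
    # Step 3 (assumed form): clip each stock return into a band around
    # center * market return, bounded by (center - delta) and
    # (center + delta) times the market return on that day.
    clipped = []
    for r_i, r_m in zip(stock, market):
        lo, hi = sorted(((center - delta) * r_m, (center + delta) * r_m))
        clipped.append(min(max(r_i, lo), hi))

    # Step 2 (assumed form): smooth exponential down-weighting by age.
    weights = [math.exp(-rho * d / 365.0) for d in ages_days]

    # Weighted least-squares slope of clipped stock returns on market returns.
    w_sum = sum(weights)
    m_bar = sum(w * m for w, m in zip(weights, market)) / w_sum
    s_bar = sum(w * s for w, s in zip(weights, clipped)) / w_sum
    cov = sum(w * (m - m_bar) * (s - s_bar)
              for w, m, s in zip(weights, market, clipped))
    var = sum(w * (m - m_bar) ** 2 for w, m in zip(weights, market))
    return cov / var

market = [0.01, -0.02, 0.015, -0.005, 0.03]
ages = [4, 3, 2, 1, 0]
beta_clean = bswa_sketch([1.5 * m for m in market], market, ages)     # no clipping needed
beta_outlier = bswa_sketch([10.0 * m for m in market], market, ages)  # every day is clipped
```

In this toy input, a stock that always moves 1.5 times the market is left untouched, while one that always moves 10 times the market is pulled back to the upper edge of the band.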

Caveats

  • Caveat: Do not believe the CAPM. No (ex-ante) market-beta has reliably predicted future average returns in the past (as suggested not only by the CAPM but almost any sensible model). Recall what beta truly is: it is not a measure of expected returns, but a measure of the market-hedge provided by individual stocks.

    The leap to think this risk should influence expected returns makes sense, but it is a leap that is not supported by the data. Nevertheless, beta is useful for improving portfolio performance. In a portfolio-optimization context, it works through the second moment, not through the first moment.

  • Caveat: This version uses Gregorian date aging (d/365). The next version will switch to CRSP calendar trading day aging (d/256). The difference is small.

Standard Deviations

The provided standard-deviation estimates, sd0111, are very good estimates of the 1-month ahead plain standard deviation. If someone can find a simple predictor of the one-month ahead plain standard deviation for the CRSP universe that is economically better, please let me know. (No intra-day data and/or implied vol-based estimators, please, because these data neither cover enough securities nor are sufficiently widely available.)
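
The sd0111 weighting from the variable definitions (0.5, 0.35, and 0.15 on months mo, mo-1, and mo-11) can be sketched as below. This is only an illustration with a hypothetical function name. It takes the monthly standard deviations as given and omits the 15% winsorization, because the exact winsorization rule is not spelled out here.

```python
def sd0111_sketch(monthly_sd):
    """Combine monthly standard deviations with weights 0.5 / 0.35 / 0.15.

    monthly_sd: list of monthly sd values, oldest first (None if missing).
    Returns a list of the same length; entries are None whenever any of
    sd[mo], sd[mo-1], or sd[mo-11] is unavailable.
    The 15% winsorization mentioned in the definition is NOT applied here.
    """
    out = []
    for i in range(len(monthly_sd)):
        if i < 11:
            out.append(None)  # not enough history for the mo-11 term
            continue
        terms = (monthly_sd[i], monthly_sd[i - 1], monthly_sd[i - 11])
        if any(t is None for t in terms):
            out.append(None)
        else:
            out.append(0.5 * terms[0] + 0.35 * terms[1] + 0.15 * terms[2])
    return out

# Twelve months of a constant 2% daily standard deviation.
combined = sd0111_sketch([0.02] * 12)
```

In this sketch the first eleven entries are missing and the twelfth is the first usable one. That is consistent with the illustrated file, where sd0111 first appears in the twelfth month for permno 10000.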

Other Details

Timing:
These files provide only estimates of prevailing in-time [a] market-betas and [b] daily rate-of-return standard deviations.

In-time means the estimates are calculated with data only up to this point in time. No future data has been used.

The estimates are forecasts of the 1-12-months ahead plain OLS market-betas (and plain standard deviations). That is, they are noisy estimates of the true but unknown prevailing market-betas and plain standard deviations at the end of the quoted month.

Coverage:
As of 2020, there are 4,465,811 monthly market-beta observations by permno-month, ranging from 1926/07 to 2019/12. The compressed file is about 23MB. (There are fewer observations when stock identification is not by permno.) My intent is to update the data once a year.

Although the database contains market-beta estimates early on (i.e., in months with as-of-yet few daily return observations), it is advisable not to use market-betas when they are based on too few returns. A good filter is to use only months that also have standard-deviation observations in the database. The latter requires at least one year's worth of data in order not to be set missing. This helps with market-beta reliability.
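
The suggested filter amounts to keeping only rows whose sd0111 field is present. A minimal sketch, assuming rows have already been parsed into dictionaries with a missing sd0111 stored as None (the dictionary layout and function name are hypothetical):

```python
def has_enough_history(row):
    """True if the row also carries a standard-deviation estimate."""
    return row.get("sd0111") is not None

# Two rows for permno 10000 from the illustrated file format.
rows = [
    {"permno": 10000, "yyyymmdd": "19861128", "bswa32": 0.73, "sd0111": None},
    {"permno": 10000, "yyyymmdd": "19861231", "bswa32": 0.69, "sd0111": 0.04673},
]
reliable = [r for r in rows if has_enough_history(r)]
```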

Utilized Source Inputs:
The only data used to create the betasd-by-permno.csv.gz file is CRSP. (betasd-by-gvkey.csv.gz required the CRSP-Compustat link file.)
Creating Programs:
mkwrds1.cc and mkwrds2.R. (The programs rely on regression code (fregr), a pre-cleaned CRSP daily database, and an R program calculating standard deviations.)
Output:
The key outputs from the mkwrds2.R program are the above gzip-compressed csv files. On Linux and macOS, use `gunzip` to decompress them. On Windows, you need a 3rd-party decompression program. Most of them can handle .gz, because it is one of the oldest formats.

The .csv.gz files have too many lines to fit into Excel, but will read fine into R. For SAS and Stata users, there are more convenient data versions in their respective folders.

Illustrated File Format (Content):
betasd-by-permno.csv.gz:
permno  , yyyymmdd  , bswa32  , sd0111  
10000   , 19860131  , 1.04    ,         
10000   , 19860228  , 0.53    ,         
10000   , 19860331  , 0.55    ,         
10000   , 19860430  , 0.33    ,         
10000   , 19860530  , 0.26    ,         
10000   , 19860630  , 0.41    ,         
10000   , 19860731  , 0.74    ,         
10000   , 19860829  , 0.51    ,         
10000   , 19860930  , 0.82    ,         
10000   , 19861031  , 0.88    ,         
10000   , 19861128  , 0.73    ,         
10000   , 19861231  , 0.69    , 0.04673 
10000   , 19870130  , 0.39    , 0.03641 
10000   , 19870227  , 0.35    , 0.0293  
...
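
For Python users, the layout illustrated above can be read with the standard library alone. This is a sketch under assumptions: the function names are hypothetical, and the code strips whitespace around fields in case the real files are padded the way the illustration is. An empty sd0111 field becomes None.

```python
import csv
import gzip
import io

def read_betasd(fileobj):
    """Parse rows from a text file object in the illustrated layout."""
    reader = csv.reader(fileobj)
    header = [h.strip() for h in next(reader)]
    rows = []
    for rec in reader:
        if not rec:
            continue  # skip blank lines
        rec = dict(zip(header, (v.strip() for v in rec)))
        rows.append({
            "permno": int(rec["permno"]),
            "yyyymmdd": rec["yyyymmdd"],
            "bswa32": float(rec["bswa32"]) if rec["bswa32"] else None,
            "sd0111": float(rec["sd0111"]) if rec["sd0111"] else None,
        })
    return rows

def read_betasd_gz(path):
    """Read a gzip-compressed file such as betasd-by-permno.csv.gz."""
    with gzip.open(path, "rt", newline="") as fh:
        return read_betasd(fh)

# Small in-memory sample copied from the illustration above.
sample = io.StringIO(
    "permno  , yyyymmdd  , bswa32  , sd0111  \n"
    "10000   , 19861128  , 0.73    ,         \n"
    "10000   , 19861231  , 0.69    , 0.04673 \n"
)
sample_rows = read_betasd(sample)
```

With the downloaded file in the working directory, `read_betasd_gz("betasd-by-permno.csv.gz")` would return the full list. With several million rows, expect this to take a while and to use a fair amount of memory.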
      
Debug:
Directory check-tesla shows calculations for Tesla 2014-12 as an example. If you want to rewrite code, please check that we agree on Tesla first. Note: Because older days are aged (downweighted), the bswa betas can use all data since inception, not just a few months or years.
Thanks:
Thanks to CRSP for providing the input data for these calculations and WRDS for making it easy to use their data.