Ivo Welch, October 2023

Finance and Portfolio Basics (1h)

Finance Data
Finance Factors
(Abnormal) Performance (Alphas)
Exposures (Betas)
Regressions
Attributes and Signals
Investment Strategies
Timing vs. Selection Strategies

The finance curriculum has more or less covered these topics, but you have not really used them in practice. Until you do, this knowledge is like a history quiz: you think you understand the material, you pass an exam, and then you forget even the basics within a year. You will find out that you only knew this vaguely, in bits and pieces. Unless you are enrolled in our ASAM class, I don’t think you should even consider quant finance.

Important Finance Data Sources

After the summer, you should already be familiar with CRSP and Compustat (the gold standards), but there are many many others, too. For example:

Bloomberg
Real-Time providers of stock and other data
TAQ (intra-day tickers)
many many others (futures, international, etc.)

Warning: Many real-time vendors of cheap financial data introduce survivorship bias by dropping stocks that are no longer trading. They also often forget about dividends.

Data Sets

https://www.ivo-welch.info/teaching/asam/asam-summer.html

You should already have created annual stock return data sets, both with and without the first few days of the calendar year.
You should already have created a risk-factor exposure data set.
You should already have combined a Compustat signal with your annual CRSP returns, taking into account proper timing.
You should be able to explain line by line how your code did it to someone else. You will be asked to certify that a fellow random student has finished these tasks within 4 weeks of the start of the quarter.

Please hand in: “I, <your name>, hereby confirm that <other name> has coded the above tasks and is capable of explaining and modifying the code.” Sign and date it, photograph it, and email it.

Thinking of Stock Return Data

Stocks are an irregular panel of [securities,trading days].
There are too many stock-days to shoehorn them into a regular rectangular data panel.
- January 1974 is a good start year for investing, allowing financial statement and aged data from 1972 and after.
- There were about 50k stocks (not 5k). 23k (14k) trading days from 1926 (1963).
- would be about 10GB for daily returns alone in regular panel.
- but the data is sparse, so it fits into 2GB.
- for easy processing, you need 3-5 times as much RAM as data.
Annual data is much smaller and fairly easy to handle.
- 50 years with about 5,000 stocks is only about 250,000 firm-years. Even if every stock gets 100 bytes, this is 25MB and easily fits in RAM.
You will often need to “transpose” the “matrix”:
- you may want all stocks on a given day [a row?]
- you may want all days for a given stock [a column?]
Notes
- Important: percent changes in price are not rates of return! There are dividends and stock splits.
- CRSP is pretty good at defining what does and does not remain a firm in a merger or name change.
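
The row/column question above can be sketched with a tiny long-format panel. This is only a minimal illustration in plain Python: the tickers, dates, and returns are made up, and `by_day` and `by_stock` are hypothetical helper names, not part of CRSP or any library. The point is that only stock-days that actually exist are stored, and the same records can be indexed either way.

```python
from collections import defaultdict

# Long ("irregular panel") format: one record per stock-day that actually exists.
# All tickers, dates, and returns below are made up for illustration.
records = [
    ("AAA", "2020-01-02", 0.010),
    ("AAA", "2020-01-03", -0.020),
    ("BBB", "2020-01-03", 0.005),   # BBB starts trading one day later than AAA
    ("BBB", "2020-01-06", 0.030),
    ("AAA", "2020-01-06", 0.000),
]

def by_day(recs):
    """Index the panel by day: {day: {stock: return}}, i.e., one 'row' per day."""
    out = defaultdict(dict)
    for stock, day, ret in recs:
        out[day][stock] = ret
    return dict(out)

def by_stock(recs):
    """Index the panel by stock: {stock: {day: return}}, i.e., one 'column' per stock."""
    out = defaultdict(dict)
    for stock, day, ret in recs:
        out[stock][day] = ret
    return dict(out)

days = by_day(records)
stocks = by_stock(records)

print(days["2020-01-03"])     # all stocks on a given day
print(sorted(stocks["BBB"]))  # all days for a given stock
```

Missing stock-days simply do not appear, which is why the sparse representation is so much smaller than the rectangular one.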

Typical Questions about Stocks

What fraction of stocks disappear every year?
- about 10%. Not necessarily negative, due to buyouts.
What fraction of total marketcap sits in the top 1000 stocks?
- more than 95%.
What will happen if you select your sample ex-post (i.e., market-cap at end of year)?
Does disappearance on CRSP or Compustat create “survivorship” bias?
- usually not, because most delistings are announced ex-ante, and CRSP provides a delisting rate of return
How should you deal with a (-90%, +900%) return sequence $10, $1, $10?
- Was this a price recording blip or not?
How do you deal with firms that were not traded for a long time, and then suddenly came back?
- How does CRSP deal with it?
How does stock price data compare to other data? Indexes? Options? Futures? Bonds? Private Equity? Compustat?
- stocks are typically cleaner and, for all practical purposes, marked to market at day’s end.
Should you worry about non-normality?
- Not for stocks, even though the normal distribution is not perfect.
- But be aware of non-representativeness, esp with respect to (hard to estimate) rare events.

ASAM

Fortunately, ASAM is not a high-activity fund.
We buy-and-hold for roughly one calendar year.
This also makes data handling easier.
- we can work with annual stock returns.
- we can create them in one first pass, as you did in the summer
- we can throw out firms that are too small: focus on the top 2,000 stocks, but only if selected pre-period. The investment universe will also change each year.
- “exposures” (to be explained soon) are better created from daily data sets.
In the real world, it is more common to work with monthly rates of return, because most funds can rebalance reasonably often. However, because ASAM cannot rebalance for most of the year, it would be silly for us to research strategies pretending that we can.
Working with annual returns also helps with buy-and-hold vs. fixed-weight issues
- What is the rate of return on a stock that did +50%, -50%? Would you rather have a stock that did +60%, -50% (avg=5%) or one that did +5%, -5% (avg=0%)?
- It is easier to execute buy-and-hold.
- It is more difficult to program buy-and-hold (unless one uses one return for the entire holding period).
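
The buy-and-hold question above is just compounding arithmetic. Here is a minimal sketch in plain Python, taking the last sequence as +5%, -5% (an assumption that matches its stated 0% average); `compound` and `average` are hypothetical helper names.

```python
def compound(returns):
    """Buy-and-hold (compounded) return over a sequence of period returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0

def average(returns):
    """Simple arithmetic average of the period returns."""
    return sum(returns) / len(returns)

for seq in ([0.50, -0.50], [0.60, -0.50], [0.05, -0.05]):
    print(seq, f"avg={average(seq):+.2%}", f"buy-and-hold={compound(seq):+.2%}")
```

The +50%, -50% stock loses 25%. The stock with the higher average (+60%, -50%) loses 20% when held, while the one with the zero average (+5%, -5%) loses only 0.25%.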

Finance Questions

What is abnormal performance?
- actual minus normal performance
What is “normal” or expected performance?
- shit, we don’t have a great benchmark model.
- presumably, normal performance is primarily risk- and liquidity related, and not idiosyncratic variance that we can diversify away.
What kind of things do you believe we should control for? Stuff that gives us high but not abnormal returns?
- Market exposure!?
- X-costs!?
- Liquidity risk!?
- But what about “value”?
Market exposure is not necessarily the CAPM.
- Could be a simple market model.
- if you have a stock that has a beta of 2 and the market went up by 10%, your stock should go up by 20%.
- the CAPM is related but says something more and different
  - it is about how market-beta should relate to the intercept alpha
  - example: if the CAPM does not hold, a stock with a beta of 3 could well have an alpha of –1% per year (in the past and future). if it does hold, usually one would expect this stock to have an alpha above 0.
How could you have a –8% performance but a +6% abnormal performance?
- your portfolio had a market-beta of 2
- the market (net of rf) dropped 7%
- your portfolio (net of rf) dropped 8%
- and your benchmark model was the CAPM.
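
The market-model bullets above treat beta as a regression slope and alpha as the intercept. The following is a minimal sketch of that regression in plain Python. The returns are made up and deliberately noise-free (the stock is constructed as 1% plus 2 times the market), so the estimates come out essentially exact; real data are noisy and the estimates come with standard errors. `ols_line` is a hypothetical helper name.

```python
def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up returns net of rf. The stock is built as 1% + 2 * market, without noise.
market = [0.02, -0.07, 0.05, 0.01, -0.03]
stock = [0.01 + 2.0 * m for m in market]

alpha, beta = ols_line(market, stock)
print(f"alpha={alpha:.4f}, beta={beta:.4f}")
```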

Definition of Factor

A factor is a time-series of the rate of return on a zero-investment (self-financed) pfio, formed on the basis of an a-priori known signal or characteristic.

A “characteristic” (sometimes called an attribute) is not a factor, but something that attaches to a firm-month. Example: market cap. IBM’s marketcap was $109.2B in Oct 2018.
A “factor-loading” is not a factor. Factor-loadings are like market-betas. They are specific to stocks, too, just like characteristics.
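
To make the distinction concrete, here is a minimal sketch of how a characteristic becomes a factor: each period, go long the high-characteristic stocks and short the low-characteristic stocks, and record the return. The data are made up, `long_short_return` is a hypothetical helper, and the half/half split with equal weights is an assumption chosen for brevity. Published factors use more elaborate breakpoints and weighting.

```python
def long_short_return(signal, next_return, fraction=0.5):
    """Equal-weighted return of the high-signal group minus the low-signal group.

    signal:      {stock: characteristic known at the start of the period}
    next_return: {stock: rate of return over the period that follows}
    """
    ranked = sorted(signal, key=signal.get)   # stocks ordered from low to high signal
    k = max(1, int(len(ranked) * fraction))   # number of stocks in each leg
    low, high = ranked[:k], ranked[-k:]
    long_leg = sum(next_return[s] for s in high) / k
    short_leg = sum(next_return[s] for s in low) / k
    return long_leg - short_leg               # $1 long, $1 short: zero investment

# Made-up book-to-market ratios (the characteristic) and subsequent returns.
panel = [
    ({"A": 0.2, "B": 0.5, "C": 1.1, "D": 1.5}, {"A": 0.10, "B": 0.02, "C": 0.06, "D": 0.12}),
    ({"A": 0.3, "B": 1.2, "C": 0.9, "D": 0.4}, {"A": -0.05, "B": 0.04, "C": 0.00, "D": -0.01}),
]

# The factor is the time series: one long-short return per period.
factor = [long_short_return(sig, ret) for sig, ret in panel]
print(factor)
```

The characteristic attaches to each stock in each period. The factor is the resulting time series, with one number per period and none per stock.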

What is a Factor?

Net-of-rf rate of return on the stock market
Value vs. Growth: HML (high B/M vs low B/M)
Firm Size: SMB (small minus big)
Robust Minus Weak: RMW
Conservative Minus Aggressive: CMA
Momentum: Rate of return from –2 to –12 months.
Reversal: Rate of return in previous month (–1).

Other factors are at Ken French’s Online Data.

What is not A Factor?

Market-Beta
Number of Analysts
Insider Sales for the Company
Most everything from Compustat for a company

Performance Attribution

How could you have a –8% performance but a +6% abnormal performance?
- your portfolio had a market-beta of 2
- the market (net of rf) dropped 7%
- your portfolio (net of rf) dropped 8%
- and your benchmark model was the CAPM.
your total return is the sum of the abnormal return plus the betas times the factor realizations.
- here, abnormal was +6%
- due to market was 2 * (-7%) = -14%
- and 6% + 2 * (-7%)= -8%.
note that you chose to have stocks with a beta of 2, too!
- this did not fall from heaven
- you could have hedged this market-risk (How?)
- in which case, you could have had a market-beta of 0 and just abnormal = total = 6% rate of return (net of rf)
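
The attribution arithmetic above can be written as a small function. This is a minimal sketch using the numbers from the example; `attribute` is a hypothetical helper name, and the hedge shown (shorting beta dollars of the market per dollar of portfolio) is one textbook answer to the "How?" question, ignoring transaction costs.

```python
def attribute(total_excess, beta, factor_excess):
    """Split a net-of-rf return into the part due to the factor and the abnormal rest."""
    due_to_factor = beta * factor_excess
    abnormal = total_excess - due_to_factor
    return abnormal, due_to_factor

portfolio_excess = -0.08   # portfolio return, net of rf
market_excess = -0.07      # market return, net of rf
beta = 2.0

abnormal, due_to_market = attribute(portfolio_excess, beta, market_excess)
print(f"abnormal={abnormal:+.2%}, due to market={due_to_market:+.2%}")

# Hedge: short `beta` dollars of the market per dollar of portfolio.
# The short position earns -beta * market_excess, which cancels the market part.
hedged_excess = portfolio_excess - beta * market_excess
print(f"hedged={hedged_excess:+.2%}")
```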

Signals

The hedge fund jargon is “signal.” A signal means some numeric value that tells you what to invest in.
To test whether a “signal” is useful, we want to ask:
- in a given month, did stocks that have more signal perform better in terms of rate of return later on?
The signal must be “comfortably” known ahead of time.
- The earnings for 2017 are not known on 1/1/2018!
- (They tend to be known about 4-5 months after the fiscal year end.)
Aggregate statistics are typically not signals, because they are the same for all stocks.
- GDP growth is useless
- even HML is not usually a signal; a stock’s exposure to HML, with or without an HML forecast, could be.
Signal Examples
- each stock’s market cap,
- its book/market ratio,
- its investment rate,
- its market beta,
- its sigma,
- its momentum,
- the age of the parents of the CEO
- etc

What Kind of Stock Investment Strategies Are There?

Long Only
- ASAM’s domain
- Implicitly “sort-of-short” by not holding some stocks
Long-Short
- (Often relatively) immune to overall market movements
(Market) Timing
- Move in and out of stocks.
Event-Related
- e.g., earnings drift
- requires a lot of attention

Investment Strategies

An investment strategy is a function that maps (known) signal(s) to an investment (pfio weights). Typically but not always, it is a monotonic function.
Here is a silly strategy:
- If marketcap >$100 million and marketcap <$500 million, and book-to-market ratio (call it BM) is greater than 1.0, buy BM$^2$ dollars worth of shares in this stock (long leg).
- If marketcap >$500 million, and book-to-market ratio is greater than 0.8, short $\sqrt{BM}$ dollars worth of shares in this stock (short leg).
- All others, invest $0.
It is very common to scale investment strategies:
- Scale the pfio to invest $1 long and $1 short, so that it is a zero-investment strategy.
- For example, if your long leg goes from $1 to $1.25 and the short leg from $1 to $1.10, you will have earned $0.15.
- This is not a rate of return, because the net cost was (academically) zero.
- The strategy’s performance will be a lot easier to interpret (see below) if it is a zero-investment portfolio.
PS: A signal is more generic than investment weights.
- You could just call the weights from a zero-investment investment strategy a signal, too. The weights are known.
- But a signal is not necessarily a zero-investment strategy.
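
The silly strategy and the scaling step above can be sketched as two small functions. This is a minimal illustration in plain Python: the five stocks, their market caps (in $ millions), book-to-market ratios, and returns are made up, and `silly_weight` and `scale_to_dollar_legs` are hypothetical helper names. The returns are chosen so that the long leg earns 25% and the short leg 10%, as in the example above.

```python
import math

def silly_weight(marketcap_musd, bm):
    """Raw dollar position from the silly strategy; market cap is in $ millions."""
    if 100 < marketcap_musd < 500 and bm > 1.0:
        return bm ** 2            # long leg: BM^2 dollars
    if marketcap_musd > 500 and bm > 0.8:
        return -math.sqrt(bm)     # short leg: sqrt(BM) dollars
    return 0.0                    # all others: invest $0

def scale_to_dollar_legs(raw):
    """Rescale so that longs sum to +$1 and shorts sum to -$1 (zero net investment)."""
    long_total = sum(w for w in raw.values() if w > 0)
    short_total = -sum(w for w in raw.values() if w < 0)
    if long_total == 0 or short_total == 0:
        raise ValueError("need at least one long and one short position")
    return {s: (w / long_total if w > 0 else w / short_total) for s, w in raw.items()}

# Made-up signals: stock -> (market cap in $ millions, book-to-market ratio).
signals = {"A": (200, 1.5), "B": (300, 2.0), "C": (1000, 1.0), "D": (2000, 0.81), "E": (50, 3.0)}
raw = {s: silly_weight(mc, bm) for s, (mc, bm) in signals.items()}
weights = scale_to_dollar_legs(raw)

# Made-up subsequent returns: long-leg stocks earn 25%, short-leg stocks earn 10%.
returns = {"A": 0.25, "B": 0.25, "C": 0.10, "D": 0.10, "E": 0.40}
profit = sum(weights[s] * returns[s] for s in weights)
print(weights)
print(f"dollar profit on $1 long / $1 short: {profit:.2f}")
```

The profit is a dollar amount per $1 long and $1 short, not a rate of return on a net investment.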

What To Predict: The Period-Ahead Rate of Return

The dependent variable is always a rate of return, typically over one month or year, and usually future.
For ASAM, we will make it 1 year, because we do not tinker (much) from mid-January to end-December.
For each stock, we have a lagged (known) signal predicting the current return
- or, equivalently, the current signal predicting the future return.
The independent variables in the regression input can be “signals” or “signal-based pfio investment.” Again, these signals typically must vary stock by stock.
We can subtract the risk-free (or any other aggregate) rate of return from the dependent stock return, and it will not make much if any difference. It simply shifts all stocks the same way up or down a little bit, which does not change the slope that we are interested in.

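If $r_i - 1\% = a + b\,(r_m - 1\%)$, then $r_i = a' + b\,r_m$ with $a' = a + (1-b)\cdot 1\%$: the intercept shifts slightly, but the slope $b$ is unchanged.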

Testing Methods

There are two basic methods to test cross-sectional signals/strategies. This will become clear soon.

[Fama-MacBeth]: These are “pooled averages of coefficients from many (monthly) cross-sectional regressions.” They work directly with signals.
[Fama-French]: This is one time-series regression given a strategy’s returns. The difficulty here is that we must define our strategy first.

I will explain them in the next sessions.
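
As a preview of the first method, here is a minimal sketch of the generic textbook Fama-MacBeth procedure in plain Python: run one cross-sectional regression of future returns on the signal per period, then average the slopes and judge them by their time-series variation. The data are made up, `slope` and `fama_macbeth` are hypothetical helper names, and the plain t-statistic (no autocorrelation adjustment) is a simplifying assumption. It is not necessarily how the later sessions will implement it.

```python
import math

def slope(x, y):
    """OLS slope of y on x (with an intercept) for one cross-section."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

def fama_macbeth(periods):
    """Average the per-period slopes; the t-stat uses their time-series variation."""
    slopes = [slope(signal, future_ret) for signal, future_ret in periods]
    count = len(slopes)
    mean = sum(slopes) / count
    variance = sum((s - mean) ** 2 for s in slopes) / (count - 1)
    t_stat = mean / math.sqrt(variance / count)
    return mean, t_stat, slopes

# Made-up data: each period is (signal per stock, future return per stock).
periods = [
    ([0.0, 1.0, 2.0], [0.00, 0.01, 0.02]),
    ([0.0, 1.0, 2.0], [0.03, 0.03, 0.03]),
    ([0.0, 1.0, 2.0], [0.01, 0.03, 0.05]),
]

mean_slope, t_stat, slopes = fama_macbeth(periods)
print(f"mean slope={mean_slope:.4f}, t={t_stat:.2f}")
```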

In addition, there is a wrong method, which can be useful because it can make quick-and-dirty exploration simpler:

[One Giant Pooled Regression]

Regress future return on your independent variables (signals).

Do not trust the T-stats—if you want, mentally divide them by 10.
This should never be used for real, only for exploratory work.

Timing

You will stand in late December of this year and have access to financial accounting and stock price data that was filed around September. You will not have access to December data. (You may have access to some later Yahoo finance data, but it will be sporadic.)

So, in your backtests, make sure that you replicate this scenario.
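
One way to replicate this scenario is to filter every signal by the date on which it became public, not by its fiscal period. The following is a minimal sketch in plain Python: the filings are made up, `latest_known` is a hypothetical helper, and the end-of-September cutoff for a late-December formation date is an assumption based on the description above.

```python
from datetime import date

# Made-up filings: (ticker, fiscal year end, date publicly available, book-to-market).
filings = [
    ("AAA", date(2021, 12, 31), date(2022, 3, 10), 0.7),
    ("AAA", date(2022, 12, 31), date(2023, 3, 15), 0.8),
    ("BBB", date(2023, 6, 30), date(2023, 8, 25), 1.2),
    ("CCC", date(2023, 9, 30), date(2023, 11, 20), 0.5),  # filed after the cutoff
]

def latest_known(filings, cutoff):
    """For each ticker, keep the most recent value available on or before the cutoff."""
    best = {}
    for ticker, fiscal_year_end, available, value in filings:
        if available > cutoff:
            continue  # not yet known at the cutoff: using it would be look-ahead bias
        if ticker not in best or available > best[ticker][0]:
            best[ticker] = (available, value)
    return {ticker: value for ticker, (_, value) in best.items()}

# Standing in late December 2023, use only what was available by end of September.
signals = latest_known(filings, cutoff=date(2023, 9, 30))
print(signals)
```

CCC drops out entirely: its fiscal year ended before the cutoff, but its numbers were not yet public.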