Translating Signals Into Investment Strategies

How to go from signals to investment strategies?

Introduction

How can you map signals into a (zero-investment) portfolio’s investment weights?
Signals are often logs or ranks of some other variable (e.g., marketcap).
For one signal, this is “relatively” easy (compared to two signals, below):
- Sort all the stocks by the signal
- Go long more in the stocks with the highest signals (long leg)
- Go short more in the stocks with the lowest signals (short leg)
- or vice-versa
You can weight stocks within both portfolio legs:
- equal-weighted
- marketcap-weighted
- signal-weighted
- etc.
It is often useful to make both the long leg and the short leg each $1 for easier interpretation and comparison.
Your investment strategy should be very close to what you will be able to actually invest later.
- For example, if you can only go long 12 stocks, then your (backtesting) strategy should only go long 12 stocks.
If you can only go long and not go short, use the risk-free rate as your short leg.
- This applies to your dependent variable (the rate of return on your test portfolio).
- The factors (independent variables) must all be zero-investment, too. This is why the market factor is always net of the risk-free rate.

One Signal (Simple Sorting)

Example: How do you form a portfolio that is long small stocks and short big stocks?
Relatively Easy: Sort stocks by marketcap (sz)
This should be done in each time-unit separately, and of course based on ex-ante known signal information.

Common Example 1:

Equal-weighted
Invest -$1/n in each stock in the bottom x-tile of the signal
Invest +$1/n in each stock in the top x-tile of the signal
x could be decile, quintile, tercile, percentile
n is the number of stocks in each x-tile (here, we are equal weighting)
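
The equal-weighted recipe above can be sketched in a few lines of R for a single cross-section. The function name `ew_longshort` and its argument `n` are hypothetical (they are not from the original slides). Negating the signal gives the "vice-versa" case from the introduction, here long small stocks and short big stocks.

```r
# Hypothetical helper (not from the original): equal-weighted long-short
# weights from one signal, for a single cross-section (e.g., one month).
ew_longshort <- function(signal, n) {
  stopifnot(n >= 1, 2 * n <= length(signal))
  rk <- rank(signal, ties.method = "first")
  w <- numeric(length(signal))
  w[rk <= n] <- -1 / n                   # short leg: n lowest signals, sums to -$1
  w[rk > length(signal) - n] <- +1 / n   # long leg: n highest signals, sums to +$1
  w
}

sz <- c(54, 35, 36, 6, 63, 4, 64, 5, 96, 16)  # marketcaps from the 10-stock example
w <- ew_longshort(-sz, n = 3)                  # negate: long small, short big
print(round(w, 3))
```

Each leg sums to $1 in absolute value, so the portfolio is zero-investment by construction.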

Crazy Example 2:

Long leg: all stocks with a signal above the median.
Short leg: all stocks with a signal below the bottom quartile.
Sum up the signal weights sw_i of all stocks in the long leg. Call this SW_long.
Invest sw_i/SW_long in each stock in the long leg.
Sum up the signal weights sw_i of all stocks in the short leg. Call this SW_short.
Invest -sw_i/SW_short in each stock in the short leg.
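
A minimal sketch of this signal-weighted scheme, assuming the signal values are all positive so that they can serve directly as the weights sw_i. The function name `sw_longshort` is hypothetical, and the quartile cutoff uses R's default `quantile()` interpolation, which is only one of several conventions.

```r
# Hypothetical helper (not from the original): signal-weighted legs.
# Assumes all signal values are positive, so they can be used as weights sw_i.
sw_longshort <- function(signal) {
  stopifnot(all(signal > 0))
  long  <- signal > median(signal)           # long leg: above the median
  short <- signal < quantile(signal, 0.25)   # short leg: below the bottom quartile
  w <- numeric(length(signal))
  w[long]  <-  signal[long]  / sum(signal[long])    # +sw_i / (sum over long leg)
  w[short] <- -signal[short] / sum(signal[short])   # -sw_i / (sum over short leg)
  w
}

val <- c(8.4, 5.5, 3.0, 4.6, 7.0, 1.4, 3.9, 7.6, 2.4, 1.2)  # val from the 10-stock example
w <- sw_longshort(val)
print(round(w, 3))
```

The two legs hold different numbers of stocks here, but each still sums to $1 in absolute value.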

Two Signals (Double-Sort)

Say we have two signals, value and size
Let’s say larger firms and value firms produce high abnormal rates of return.
But larger stocks also tend to be value stocks.
- Which one really matters?
- Which one is merely coincidental (correlated)?
Fama-MacBeth makes it easy to test the importance of one characteristic relative to another.
- OLS is designed for sorting out the importance of competing variables
BJSFF is not natively designed for such cases
For a complete example, look at the world simulation

Simple Cross-Tabulation Method

BJSFF pushes the task of distinguishing dimensions on you…and this can be a pain.
It is easiest to understand how to do this by working an example.
For example, how do you form a portfolio that is spread in VAL (long value and short growth), while holding marketcap (SZ) constant?
- Within a given month, sort all stocks by the control (here, SZ) first.
- Within each set of (say) four adjacent stocks in this order, which all have similar marketcap, assign the one that has the lowest VAL to Portfolio1 (the short leg) and the one that has the highest VAL to Portfolio2 (the long leg). Do this for the entire cross-section.
The key question is usually “if both aspect A and aspect B predict future alpha, then which one matters?”
For example, you may want to know whether future returns are due more to marketcap (e.g., three size terciles, S, M, B) or to value (e.g., three value terciles, H, A, L).
Categorize each stock by (lagged known) size and value. Create nine portfolios. For each portfolio, calculate the alpha by itself.

Dim	S	M	B	Avg
H	$\alpha_{SH}$	$\alpha_{MH}$	$\alpha_{BH}$	$\alpha_{H}$
A	$\alpha_{SA}$	$\alpha_{MA}$	$\alpha_{BA}$	$\alpha_{A}$
L	$\alpha_{SL}$	$\alpha_{ML}$	$\alpha_{BL}$	$\alpha_{L}$
Avg	$\alpha_{S}$	$\alpha_{M}$	$\alpha_{B}$	$\alpha\approx0$

The choice of cutoffs is up to you
- You could use fixed cutoffs
- You could use cutoffs that reduce the middle
- You could do terciles, quintiles, or deciles
- etc.
If you see spread in average alphas only along the vertical dimension but not the horizontal dimension, then what matters is H-M-L.
If you see spread in average alphas only along the horizontal dimension but not the vertical dimension, then what matters is S-M-B
Problem: What if size and value are correlated? Most stocks may then be bunched into a few cells (e.g., along the diagonal), leaving the other cells nearly empty.
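
Before computing alphas, it can help to count how many stocks land in each cell. The sketch below does this for the 10-stock example data used in this document, with rank-based terciles. The helper `tercile` is hypothetical, and the tercile rule is only one possible choice of cutoffs.

```r
# The 10-stock example data used in this document.
d <- data.frame(
   nm  = letters[1:10],
   sz  = c(54, 35, 36, 6, 63, 4, 64, 5, 96, 16),
   val = c(8.4, 5.5, 3.0, 4.6, 7.0, 1.4, 3.9, 7.6, 2.4, 1.2)
)

# Hypothetical helper: rank-based tercile number (1 = lowest third, 3 = highest third).
tercile <- function(x) ceiling(3 * rank(x, ties.method = "first") / length(x))

sz_grp  <- factor(tercile(d$sz),  levels = 1:3, labels = c("S", "M", "B"))
val_grp <- factor(tercile(d$val), levels = 3:1, labels = c("H", "A", "L"))

# Rows are value groups (H, A, L); columns are size groups (S, M, B).
tab <- table(val = val_grp, sz = sz_grp)
print(tab)
```

In this small sample every cell holds at least one stock. With strongly correlated characteristics, some cells would hold very few stocks or none, and their alphas would be noisy or undefined.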

Double-Sorting Example

This is a sophisticated approach to create one test portfolio that holds one dimension fixed and spreads another dimension. It usually leads to more efficient portfolio tests.
Example: 10 stocks (A-J), 2 characteristics (SZ and VAL):

d <- data.frame(
   nm= letters[1:10],
   sz= c( 54 , 35 , 36 , 6 , 63 , 4 , 64 , 5 , 96 , 16 ),
   val= c(  8.4 , 5.5 , 3.0 , 4.6 , 7.0 , 1.4 , 3.9 , 7.6 , 2.4 , 1.2 )  
   )
d

##    nm sz val
## 1   a 54 8.4
## 2   b 35 5.5
## 3   c 36 3.0
## 4   d  6 4.6
## 5   e 63 7.0
## 6   f  4 1.4
## 7   g 64 3.9
## 8   h  5 7.6
## 9   i 96 2.4
## 10  j 16 1.2

The size and value signals are weakly positively correlated in this example: cor(d$sz, d$val) is about 0.08.

Their descriptive statistics are:

p(iaw$summary(d))
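
`p()` and `iaw$summary()` are not defined in this section and are presumably helpers from the author's own library. Assuming they are not available, base R gives comparable descriptive statistics:

```r
# The 10-stock example data, repeated so that this block runs on its own.
d <- data.frame(
   nm  = letters[1:10],
   sz  = c(54, 35, 36, 6, 63, 4, 64, 5, 96, 16),
   val = c(8.4, 5.5, 3.0, 4.6, 7.0, 1.4, 3.9, 7.6, 2.4, 1.2)
)

print(summary(d[, c("sz", "val")]))        # min, quartiles, mean, max
print(sapply(d[, c("sz", "val")], sd))     # standard deviations
rho <- cor(d$sz, d$val)                    # correlation of the two signals
print(rho)
```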

Sort All Stocks by SZ

Sort all stocks by SZ:

d <- d[order(d$sz),]
p(d)

Because we want groups of 3 here, drop one of them…ideally, a middle one.

d <- d[-5, ]   # drop the 5th row after sorting (stock b, sz = 35), a middle one
p(d)

Work Group By Group

Group 1: VAL of F (1.4) is smallest, VAL of H (7.6) is largest. Ignore D.

short <- long <- NULL

g <- d[1:3,]
g <- g[order(g$val),]
p(g)
   ## remember lowest and highest g by val
short <- rbind(short, g[1,]); long <- rbind(long, g[3,])

Group 2: VAL of J (1.2) is smallest. VAL of A (8.4) is highest. Ignore C.

g <- d[4:6,]
g <- g[order(g$val),]
p(g)
short <- rbind(short, g[1,]); long <- rbind(long, g[3,])

Group 3: VAL of I (2.4) is smallest, VAL of E (7.0) is largest. Ignore G.

g <- d[7:9,]
g <- g[order(g$val),]
p(g)
short <- rbind(short, g[1,]); long <- rbind(long, g[3,])

Collect The Results

p(short)
p(long)

Just stare at this:

The stocks in the short leg now have a low val average
The stocks in the long leg have a high val average.
The sz average in both the long and the short leg is around 40.

print( colMeans(short[,2:3]) )
print( colMeans(long[,2:3]) )

##     sz    val 
## 38.667  1.667 
##     sz    val 
## 40.667  7.667

Let’s put these quantities in perspective. The sd’s of sz and val were:

myscale <- c( szsd= sd(d$sz), valsd= sd(d$val) )

print( myscale )

cat("The difference of ", mean(long$sz - short$sz)," in sz is ", mean(long$sz - short$sz) / myscale[1], "standard deviations\n")
cat("The difference of ", mean(long$val - short$val)," in val is ", mean(long$val - short$val) / myscale[2], " standard deviations\n")

##   szsd  valsd 
## 32.935  2.704 
## The difference of  2  in sz is  0.06073 standard deviations
## The difference of  6  in val is  2.219  standard deviations

Voilà: We have two portfolios, each with the same number of stocks, that are similar in sz but different in val.
This “double-sort” is relatively “easy” to program
- In each month, sort data frame on SZ first, then loop over groups.
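
The group-by-group steps above can be wrapped into one function for a single cross-section. This is a sketch, not the author's code: the name `double_sort`, its arguments, and the rule of dropping leftover stocks from the middle of the sorted list are assumptions that mirror the worked example.

```r
# Hypothetical helper (not from the original): double-sort one cross-section.
# Sort on the control, cut into groups of k adjacent stocks, and within each
# group send the lowest-spread stock short and the highest-spread stock long.
double_sort <- function(d, control = "sz", spread = "val", k = 3) {
  stopifnot(nrow(d) >= k)
  d <- d[order(d[[control]]), ]                 # sort by the control first
  extra <- nrow(d) %% k                         # stocks that do not fill a group
  if (extra > 0) {                              # drop them from the middle
    first <- floor((nrow(d) - extra) / 2) + 1
    d <- d[-(first:(first + extra - 1)), ]
  }
  groups <- split(d, rep(seq_len(nrow(d) / k), each = k))
  pick <- function(g, which_fun) g[which_fun(g[[spread]]), ]
  list(short = do.call(rbind, lapply(groups, pick, which.min)),
       long  = do.call(rbind, lapply(groups, pick, which.max)))
}

d <- data.frame(
   nm  = letters[1:10],
   sz  = c(54, 35, 36, 6, 63, 4, 64, 5, 96, 16),
   val = c(8.4, 5.5, 3.0, 4.6, 7.0, 1.4, 3.9, 7.6, 2.4, 1.2)
)
res <- double_sort(d)
print(res$short)
print(res$long)
```

For a panel with a month column (a hypothetical name), something like `lapply(split(panel, panel$month), double_sort)` would apply the same sort in each month separately.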

Real-World Example

Welch (SSRN, 2018), Leverage

Simulated Example

Welch

Dim	S	M	B	Avg
H	\(\alpha_{SH}\)	\(\alpha_{MH}\)	\(\alpha_{BH}\)	\(\alpha_{H}\)
A	\(\alpha_{SA}\)	\(\alpha_{MA}\)	\(\alpha_{BA}\)	\(\alpha_{A}\)
L	\(\alpha_{SL}\)	\(\alpha_{ML}\)	\(\alpha_{BL}\)	\(\alpha_{L}\)
Avg	\(\alpha_{S}\)	\(\alpha_{M}\)	\(\alpha_{B}\)	\(\alpha\approx0\)

Translating Signals Into Investment Strategies

Ivo Welch

5/7/2019
