How to go from signals to investment strategies?
How can you map signals into a (zero-investment) portfolio’s investment weights?
Signals are often logs or ranks of some other variable (e.g., marketcap).
It is often useful to scale the long leg and the short leg to $1 each for easier interpretation and comparison.
Example: How do you form a portfolio that is long small stocks and short big stocks?
Relatively Easy: Sort stocks by marketcap (sz)
This should be done in each time-unit separately, and of course based on ex-ante known signal information.
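A minimal sketch of "each time-unit separately" and "ex-ante known": the toy panel below (hypothetical ids, periods, and values) lags the signal by one period within each stock and then ranks stocks within each period.

```r
# Toy panel, made up for illustration only: 3 stocks observed over 3 periods
panel <- data.frame(
  id = rep(c("a", "b", "c"), each = 3),
  t  = rep(1:3, times = 3),
  sz = c(10, 12, 11,  50, 40, 45,  30, 35, 60)
)
panel <- panel[order(panel$id, panel$t), ]

# Ex-ante signal: last period's sz of the same stock (NA in the first period)
panel$lagsz <- ave(panel$sz, panel$id, FUN = function(x) c(NA, head(x, -1)))

# Rank within each period separately; stocks without a lagged signal stay NA
panel$rk <- ave(panel$lagsz, panel$t, FUN = function(x) rank(x, na.last = "keep"))

print(panel)
```

Stocks in the first period have no lagged signal and therefore cannot be assigned to a portfolio.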
Equal-weighted
Invest -$1/n in each stock in the bottom x-tile of the signal
Invest +$1/n in each stock in the top x-tile of the signal
x could be a decile, quintile, tercile, or percentile
n is the number of stocks in each x-tile (here, we are equal weighting)
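The equal-weighted rule can be sketched as below. The helper name `ew_weights` and the tie-breaking choice are assumptions for illustration. The `sz` values are those of the ten-stock example used later in these notes, and the signal is the negative of marketcap so that small stocks end up in the long leg.

```r
# Hypothetical helper: equal-weighted long-short weights for ONE cross-section
ew_weights <- function(signal, ntile = 5) {
  # rank-based bins 1..ntile; ties are broken by order of appearance (an assumption)
  r   <- rank(signal, ties.method = "first")
  bin <- ceiling(r * ntile / length(signal))
  w   <- numeric(length(signal))
  w[bin == ntile] <- +1 / sum(bin == ntile)  # long leg: +$1 in total
  w[bin == 1]     <- -1 / sum(bin == 1)      # short leg: -$1 in total
  w
}

sz <- c(54, 35, 36, 6, 63, 4, 64, 5, 96, 16)
w  <- ew_weights(-sz, ntile = 5)  # long small stocks, short big stocks

print(w)
print(sum(w))  # zero-investment: the two legs offset each other
```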
Signal-weighted
The long leg consists of all stocks with signal above the median
The short leg consists of all stocks with signal below the bottom-quartile cutoff
Sum the signal values sw_i of all stocks in the long leg. Call this SW_long.
Invest sw_i/SW_long in each stock in the long leg.
Sum the signal values sw_i of all stocks in the short leg. Call this SW_short.
Invest -sw_i/SW_short in each stock in the short leg.
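A minimal sketch of the signal-weighted rule, assuming the long leg is cut at the median and the short leg at the bottom quartile, and assuming the signal is strictly positive (here, ranks of the `val` values from the ten-stock example later in these notes). The helper name `sw_weights` is hypothetical.

```r
# Hypothetical helper: signal-weighted long-short weights for ONE cross-section
sw_weights <- function(signal,
                       long_cut  = median(signal),
                       short_cut = quantile(signal, 0.25)) {
  stopifnot(all(signal > 0))        # assumption: positive signals (e.g., ranks)
  long  <- signal > long_cut
  short <- signal < short_cut
  w <- numeric(length(signal))
  w[long]  <-  signal[long]  / sum(signal[long])   # sw_i / SW of the long leg
  w[short] <- -signal[short] / sum(signal[short])  # -sw_i / SW of the short leg
  w
}

val <- c(8.4, 5.5, 3.0, 4.6, 7.0, 1.4, 3.9, 7.6, 2.4, 1.2)
w   <- sw_weights(rank(val))

print(round(w, 3))
print(c(long = sum(w[w > 0]), short = sum(w[w < 0])))  # each leg is $1 in size
```

Note that, taken literally, this rule gives the stocks with the lowest signal the smallest short positions. Whether that is desirable depends on how the signal is defined.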
Say we have two signals, value and size
Let’s say larger firms and value firms produce high abnormal rates of return.
BJSFF is not natively designed for such cases
For a complete example, look at the world simulation
BJSFF pushes the task of distinguishing dimensions on you…and this can be a pain.
It is easiest to understand how to do this by working an example.
The key question is usually “if both aspect A and aspect B predict future alpha, then which one matters?”
For example, you may want to know whether future returns are due more to marketcap (e.g., three size terciles: S, M, B) or to value (e.g., three value terciles: H, A, L).
Categorize each stock by (lagged known) size and value. Create nine portfolios. For each portfolio, calculate the alpha by itself.
Dim | S | M | B | Avg |
---|---|---|---|---|
H | \(\alpha_{SH}\) | \(\alpha_{MH}\) | \(\alpha_{BH}\) | \(\alpha_{H}\) |
A | \(\alpha_{SA}\) | \(\alpha_{MA}\) | \(\alpha_{BA}\) | \(\alpha_{A}\) |
L | \(\alpha_{SL}\) | \(\alpha_{ML}\) | \(\alpha_{BL}\) | \(\alpha_{L}\) |
Avg | \(\alpha_{S}\) | \(\alpha_{M}\) | \(\alpha_{B}\) | \(\alpha\approx0\) |
If you see spread in average alphas only along the vertical dimension but not the horizontal dimension, then what matters is value (H-A-L).
If you see spread in average alphas only along the horizontal dimension but not the vertical dimension, then what matters is size (S-M-B).
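The 3x3 table can be sketched as follows. Everything here is simulated: `sz`, `val`, and `alpha` are random placeholders (the alphas are pure noise), and the helper `tercile` is hypothetical. The margins are simple averages of the cell alphas, which is one possible convention.

```r
set.seed(1)
n     <- 900
sz    <- rlnorm(n)  # simulated lagged size signal
val   <- rlnorm(n)  # simulated lagged value signal
alpha <- rnorm(n)   # placeholder per-stock alphas (pure noise, for illustration)

# Hypothetical helper: assign each stock to one of three equal-sized groups
tercile <- function(x, labels) {
  factor(ceiling(3 * rank(x, ties.method = "first") / length(x)),
         levels = 1:3, labels = labels)
}

szg  <- tercile(sz,  c("S", "M", "B"))
valg <- tercile(val, c("L", "A", "H"))

cell <- tapply(alpha, list(val = valg, sz = szg), mean)  # nine cell alphas
tab  <- cbind(cell, Avg = rowMeans(cell))                # row averages
tab  <- rbind(tab, Avg = colMeans(tab))                  # column averages
print(round(tab[c("H", "A", "L", "Avg"), ], 3))
```

With real data, `alpha` would come from a separate regression for each portfolio rather than from an average of per-stock numbers.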
Problem: What if size and value are correlated? All stocks may be bunched in the same cells.
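To see the bunching problem, the sketch below simulates two signals with an assumed correlation of 0.9 and counts the stocks in each of the nine cells. The data and the helper `tercile` are hypothetical.

```r
set.seed(42)
n   <- 900
sz  <- rnorm(n)
val <- 0.9 * sz + sqrt(1 - 0.9^2) * rnorm(n)  # correlation of about 0.9 with sz

# Hypothetical helper: assign each stock to one of three equal-sized groups
tercile <- function(x, labels) {
  factor(ceiling(3 * rank(x, ties.method = "first") / length(x)),
         levels = 1:3, labels = labels)
}

counts <- table(val = tercile(val, c("L", "A", "H")),
                sz  = tercile(sz,  c("S", "M", "B")))
print(counts)  # most stocks sit in the L-S, A-M, and H-B cells
```

The sparsely populated off-diagonal cells are exactly the ones needed to tell the two dimensions apart, so their alphas are estimated from few stocks.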
A more sophisticated approach creates one test portfolio that holds one dimension fixed while spreading the other. It usually leads to more efficient portfolio tests.
Example: 10 stocks (a-j), 2 characteristics (sz and val):
d <- data.frame(
nm= letters[1:10],
sz= c( 54 , 35 , 36 , 6 , 63 , 4 , 64 , 5 , 96 , 16 ),
val= c( 8.4 , 5.5 , 3.0 , 4.6 , 7.0 , 1.4 , 3.9 , 7.6 , 2.4 , 1.2 )
)
d
## nm sz val
## 1 a 54 8.4
## 2 b 35 5.5
## 3 c 36 3.0
## 4 d 6 4.6
## 5 e 63 7.0
## 6 f 4 1.4
## 7 g 64 3.9
## 8 h 5 7.6
## 9 i 96 2.4
## 10 j 16 1.2
The size and value signals are positively correlated in this example; `cor(d$sz, d$val)` reports the correlation.
Their descriptive statistics are:
p(iaw$summary(d))
d <- d[order(d$sz),]
p(d)
d <- d[-c(5),]  ## drop the 5th (middle) stock so that 9 stocks split evenly into three sz groups
p(d)
short <- long <- NULL
g <- d[1:3,]
g <- g[order(g$val),]
p(g)
## remember lowest and highest g by val
short <- rbind(short, g[1,]); long <- rbind(long, g[3,])
g <- d[4:6,]
g <- g[order(g$val),]
p(g)
short <- rbind(short, g[1,]); long <- rbind(long, g[3,])
g <- d[7:9,]
g <- g[order(g$val),]
p(g)
short <- rbind(short, g[1,]); long <- rbind(long, g[3,])
p(short)
p(long)
Just stare at this: the average `val` is very different in the long and the short leg, but the average `sz` in both the long and the short leg is around 40.
print( colMeans(short[,2:3]) )
print( colMeans(long[,2:3]) )
## sz val
## 38.667 1.667
## sz val
## 40.667 7.667
For scale, the standard deviations of `sz` and `val` were:
myscale <- c( szsd= sd(d$sz), valsd= sd(d$val) )
print( myscale )
cat("The difference of ", mean(long$sz - short$sz)," in sz is ", mean(long$sz - short$sz) / myscale[1], "standard deviations\n")
cat("The difference of ", mean(long$val - short$val)," in val is ", mean(long$val - short$val) / myscale[2], " standard deviations\n")
## szsd valsd
## 32.935 2.704
## The difference of 2 in sz is 0.06073 standard deviations
## The difference of 6 in val is 2.219 standard deviations
Voilà: We have two portfolios, each with the same number of stocks, that are similar in `sz` but different in `val`.
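The step-by-step code above can be condensed into a loop over the three `sz` groups. This is only a sketch that reuses the example data and the same choice of dropping the 5th stock; the names `grp` and `legs` are introduced here for illustration.

```r
d <- data.frame(
  nm  = letters[1:10],
  sz  = c(54, 35, 36, 6, 63, 4, 64, 5, 96, 16),
  val = c(8.4, 5.5, 3.0, 4.6, 7.0, 1.4, 3.9, 7.6, 2.4, 1.2)
)

d   <- d[order(d$sz), ][-5, ]  # sort by sz, then drop the 5th row as in the text
grp <- rep(1:3, each = 3)      # three consecutive sz groups of three stocks each

# Within each sz group, take the lowest-val stock (short) and highest-val stock (long)
legs <- lapply(split(d, grp), function(g) {
  g <- g[order(g$val), ]
  list(short = g[1, ], long = g[nrow(g), ])
})
short <- do.call(rbind, lapply(legs, `[[`, "short"))
long  <- do.call(rbind, lapply(legs, `[[`, "long"))

print(colMeans(short[, c("sz", "val")]))
print(colMeans(long[, c("sz", "val")]))
```

The same pattern extends to more groups or more stocks per group by changing `grp`, provided the number of stocks divides evenly.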