This page is no longer updated. The ideas here (of how to create a proper placebo data set before testing on real leverage data) are still fresh, but the data are not. In particular, the large data files are no longer here. They should be recalculated now anyway.
These files are copyrighted. You must ask for permission before using any of these files. Although I believe that I hold the full copyright (because I calculated the data), I would prefer it if you had a Compustat license. I will not grant use permission if you do not subscribe to Compustat. If your university does have a Compustat subscription, I will grant you a free use license.
Placebo Leverage Data Sets
The data files in this directory make it easy for researchers to test whether capital-structure-related empirical findings derive from the fact that leverage ratios (either debt-to-capital or liabilities-to-assets; either in book value or market value) are ratios that have unusual properties. In
Iliev, Peter, and Ivo Welch, 2010, "Reconciling Estimates of the Speed of Adjustment of Leverage Ratios." Penn State and Brown/NBER.
we explain how these leverage ratios are obtained under the null hypothesis (of no managerial interest). Think of the data sets as "placebos," which allow one to check whether findings are spurious. (The relevant part of the paper is only three pages and easy reading.)
A Brief Explanation
Let me try a brief intuitive explanation anyway. We want to find how a firm's leverage ratio would evolve if actions were non-deliberate. Starting with each firm's actual leverage ratio, we draw a random firm-year from the full sample. This random firm-year gives us a percent change in equity and a percent change in debt. These are applied to the original firm's debt and equity to compute a simulated next-year leverage ratio under the null hypothesis. Note that it is important that the randomly matched firm-year is drawn without regard to anything, such as the firm's or the match's lagged ratio or even firm year. This is what makes this procedure an excellent simulation of the evolution of leverage ratios under the null hypothesis that managers do not do anything deliberate. (If this is not clear, please read the paper.)
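A single placebo step of the kind described above might be sketched as follows. This is only an illustration under stated assumptions, not the code used for the paper: the function name `placebo_next_ratio`, the field names `debt_growth` and `equity_growth`, and the three toy firm-years are all hypothetical.

```python
import random

def placebo_next_ratio(debt, equity, sample, rng):
    """One placebo step: perturb (debt, equity) with a random firm-year's growth rates."""
    # Draw the match uniformly from the full sample, ignoring everything about
    # the firm being perturbed (its lagged ratio, its year, and so on).
    match = rng.choice(sample)
    new_debt = debt * (1.0 + match["debt_growth"])
    new_equity = equity * (1.0 + match["equity_growth"])
    capital = new_debt + new_equity
    if capital <= 0:
        return None  # no meaningful debt-to-capital ratio
    return new_debt / capital

# Hypothetical full sample of firm-years (percent changes as fractions).
sample = [
    {"debt_growth": 0.10, "equity_growth": -0.20},
    {"debt_growth": -0.05, "equity_growth": 0.30},
    {"debt_growth": 0.00, "equity_growth": 0.05},
]

rng = random.Random(0)  # seeded only so that the sketch is reproducible
ratio = placebo_next_ratio(debt=40.0, equity=60.0, sample=sample, rng=rng)
print(ratio)
```

The important design choice is that `rng.choice` sees only the pooled sample and never the firm being perturbed.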
If your own tests find variables to be significant when tested on these random data sets, then you know that your variables cannot measure anything that managers may have deliberately chosen. (Usually, this means that these coefficients are spurious.) By comparing magnitudes under the actual and these random data sets, you can also suitably adjust your coefficient estimates quantitatively.
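To make the comparison concrete, here is a minimal sketch that runs the same estimator on an actual and on a placebo data set. The estimator (a simple regression of the leverage change on lagged leverage), the helper names, and the toy rows are assumptions chosen for illustration; the numbers are made up and are not taken from the real files. Taking the plain difference between the two slopes is also just one possible adjustment.

```python
def ols_slope(x, y):
    """Slope from a one-variable least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

def adjustment_slope(rows):
    """Regress the change in leverage (ratio - Lratio) on lagged leverage."""
    lagged = [r["Lratio"] for r in rows]
    change = [r["ratio"] - r["Lratio"] for r in rows]
    return ols_slope(lagged, change)

# Made-up toy rows, NOT from the real data sets.
lags = [0.1, 0.2, 0.3, 0.4, 0.5]
actual = [{"Lratio": l, "ratio": r} for l, r in zip(lags, [0.15, 0.22, 0.30, 0.37, 0.46])]
placebo = [{"Lratio": l, "ratio": r} for l, r in zip(lags, [0.12, 0.21, 0.30, 0.39, 0.49])]

slope_actual = adjustment_slope(actual)
slope_placebo = adjustment_slope(placebo)

# The placebo slope is what the estimator reports when nothing deliberate is
# going on; the gap between the two is the part left to explain.
excess = slope_actual - slope_placebo
print(slope_actual, slope_placebo, excess)
```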
The files are categorized by the leverage ratio that is evolving:
| Name | Leverage ratio |
|------|----------------|
| md2c | market-value based debt-to-capital |
| bd2c | book-value based debt-to-capital |
| ml2a | market-value based liabilities-to-assets |
| bl2a | book-value based liabilities-to-assets |
Capital is debt plus equity. (And if you are thinking of asking me for similar simulated files for financial-debt-divided-by-assets, please read Ivo Welch, "A Bad Measure of Leverage: The Financial-Debt-To-Asset Ratio" on SSRN.)
Each file contains 10 random data sets, in .csv format but compressed via GNU zip (gzip). Most general decompression programs under Windows and OS X should have no problems uncompressing them. The csv format itself is simple:

    "","gvkey","fyear","ratio","Lratio","matched"
    "1",1000,1970,0.43344,0.35213,"..."
    "2",1004,1967,0.21041,0.13792,"..."
    "3",1005,1974,0.40527,0.50775,"..."
    "4",1011,1983,0.41533,0.19354,"..."
    ...
Lratio stands for "lag ratio". The Compustat gvkey and fyear codes make it easy to merge our data sets with your own data set. You can ignore the matched column; it tells you which other firm-year was used to perturb an observation.
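Reading one of these files and merging it with your own data on gvkey and fyear might look like the sketch below. It uses only the Python standard library. The file name `demo-placebo.csv.gz`, the `size` variable, and the treatment of `NA` or blank fields as missing are assumptions for illustration; the demo file is built from the first two sample rows shown above so that the sketch runs on its own.

```python
import csv
import gzip
import os
import tempfile

def to_float(text):
    """Convert a CSV field to float; treat blank or R-style NA as missing (assumed)."""
    return None if text in ("", "NA") else float(text)

def read_placebo(path):
    """Read a gzip-compressed CSV into a dict keyed by (gvkey, fyear)."""
    rows = {}
    with gzip.open(path, "rt", newline="") as fh:
        for rec in csv.DictReader(fh):
            key = (int(rec["gvkey"]), int(rec["fyear"]))
            rows[key] = {"ratio": to_float(rec["ratio"]),
                         "Lratio": to_float(rec["Lratio"])}
    return rows

# Build a tiny demo file from the sample rows (hypothetical file name).
sample_csv = (
    '"","gvkey","fyear","ratio","Lratio","matched"\n'
    '"1",1000,1970,0.43344,0.35213,"..."\n'
    '"2",1004,1967,0.21041,0.13792,"..."\n'
)
path = os.path.join(tempfile.mkdtemp(), "demo-placebo.csv.gz")
with gzip.open(path, "wt", newline="") as fh:
    fh.write(sample_csv)

placebo = read_placebo(path)

# Your own firm-year data (hypothetical variable "size"), keyed the same way.
own = {
    (1000, 1970): {"size": 5.1},
    (1004, 1967): {"size": 3.7},
    (9999, 2000): {"size": 1.0},  # no match in the placebo file
}

# Inner merge on (gvkey, fyear).
merged = {key: {**own[key], **placebo[key]} for key in own if key in placebo}
print(merged)
```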
Each directory contains the actual empirical data set, a version thereof that eliminates leverage observations after a missing year (without a leverage observation), and random data. The placebo data sets are obviously named. For the md2c and bd2c directories, there are also "-zerofromzero" data sets, which assume that a firm that had a lagged zero debt ratio behaves like other firms with zero debt ratio. This breaks the rule that the random firm-year draw should be without regard to the firm's own historical information. Use only as a check and only with caution.
Typically, with over 100,000 firm-years, your estimates will have standard errors that are tiny. Put differently, whatever results you will find in one placebo data set will likely be almost the same as what you will find using another placebo data set. If you just try your estimator on one of these placebo data sets, you will have a very good idea of how your estimator behaves. Don't believe me? Try it out!
Notes on Debt Ratios
Defining leverage ratios is trickier than most authors realize. Worse, assumptions that seem merely for convenience can have real impacts on results. To see my choices, look at mksane.R. Here is a description of the most important ones:
- Debt is dlc + dltt (Compustat codes for financial debt in current liabilities and long-term financial debt). Liabilities are lt. The difference is non-financial liabilities, which are typically as large as financial liabilities. (This is also why debt/assets is badly flawed as a measure of leverage: an increase in non-financial liabilities decreases debt/assets.)
- Debt defined this way should never be negative, but there are some errant Compustat observations. So, set any negative debt to NA.
- The market value of equity (which I call meq) is seq (stockholders' equity) minus ceq (book common equity) plus the market value of common equity. It should always be positive.
- The book value of equity can be negative. In this case, all hell can break loose in later calculations that are book-ratio based (and yes, this is significant enough a problem to affect inference). We do not want to throw out these firm-years wholesale (they are after all firm-years in which equity was very low), but we also cannot use them without reservations. So, now do the following:
  - If in addition to equity, the debt is also zero, consider this firm-year insane and set it to NA.
  - Set the book value of equity to be the largest of 0, 1% of the firm's debt, 0.1% of the firm's assets, and the reported book value.
- d2c is debt divided by the sum of debt and equity. l2a is total liabilities (lt) divided by total liabilities plus equity. (Note: l2a is not total liabilities divided by assets, because there is also minority interest.) Market-based ratios use meq; book-value based ratios use seq. (I strongly prefer seq, but not all researchers will agree.)
- All leverage ratios are winsorized at 0.999 (i.e., affecting cases in which equity would be 0).
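The choices in the list above can be put together in a short sketch. This is not mksane.R, only my reading of the description, with several assumptions: the function name `leverage_ratios` is hypothetical; a firm-year with negative debt is treated as missing altogether; the "insane" case is read as non-positive book equity together with zero debt; a non-positive meq is treated as missing; and winsorizing is read as capping the ratio at 0.999 from above.

```python
WINSOR_CAP = 0.999  # upper cap on every leverage ratio (assumed one-sided)

def leverage_ratios(dlc, dltt, lt, at, seq, meq):
    """Return bd2c, md2c, bl2a, ml2a for one firm-year; None stands for NA."""
    missing = {"bd2c": None, "md2c": None, "bl2a": None, "ml2a": None}

    debt = dlc + dltt
    if debt < 0:
        return missing  # errant observation: negative debt becomes NA
    if seq <= 0 and debt == 0:
        return missing  # assumed reading of the "insane" firm-year rule

    # Floor the book value of equity so that book ratios stay well defined.
    beq = max(0.0, 0.01 * debt, 0.001 * at, seq)

    def capped(numerator, equity):
        # Ratio of numerator to (numerator + equity), capped at WINSOR_CAP.
        if equity is None or equity <= 0:
            return None
        return min(numerator / (numerator + equity), WINSOR_CAP)

    return {
        "bd2c": capped(debt, beq),  # book debt-to-capital
        "md2c": capped(debt, meq),  # market debt-to-capital
        "bl2a": capped(lt, beq),    # book liabilities-to-"assets"
        "ml2a": capped(lt, meq),    # market liabilities-to-"assets"
    }

# Hypothetical firm-year with negative reported book equity.
example = leverage_ratios(dlc=10.0, dltt=30.0, lt=70.0, at=100.0, seq=-5.0, meq=50.0)
print(example)
```

In the example, the reported book equity of -5 is replaced by 1% of debt (0.4), which is the largest of the candidate values.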
NOTE: I (Ivo Welch) wrote the R code that generated these files. (The R files are also in this directory and downloadable, but they are probably incomprehensible to anyone.) Peter Iliev wrote the code that was used in the original paper. The results were independently replicated, and I hope that there are no errors in the files, but there are no ironclad guarantees, especially for the first few users. So, be aware that you are guinea pigs! My first advice: check your own leverage ratios against the *-empirical.csv file to make sure they match your dependent variable *before* you start using the placebo.csv files. Also check that your simulated placebo file has the same number of observations.
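The two checks recommended above could be sketched like this, assuming that each data set has already been loaded into a dict that maps (gvkey, fyear) to a ratio. The helper names, the tolerance, and the toy values are hypothetical.

```python
def mismatches(own, empirical, tol=1e-4):
    """Keys whose ratios differ by more than tol, or that are missing on one side."""
    bad = []
    for key in sorted(set(own) | set(empirical)):
        a = own.get(key)
        b = empirical.get(key)
        if a is None or b is None or abs(a - b) > tol:
            bad.append(key)
    return bad

def same_number_of_observations(empirical, placebo):
    """True if the placebo data set has as many firm-years as the empirical one."""
    return len(empirical) == len(placebo)

# Toy values for illustration only.
own = {(1000, 1970): 0.43344, (1004, 1967): 0.21000}
empirical = {(1000, 1970): 0.43344, (1004, 1967): 0.21041, (1005, 1974): 0.40527}
placebo = {(1000, 1970): 0.39000, (1004, 1967): 0.25000, (1005, 1974): 0.41000}

bad_keys = mismatches(own, empirical)
counts_ok = same_number_of_observations(empirical, placebo)
print(bad_keys, counts_ok)
```

Any key returned by `mismatches` is worth investigating before you run anything on the placebo files.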
Wei Wang from UNO helped me tremendously in creating these files. I would have posted incorrect files without his help. I would give him full credit if this were an academic paper.
Please drop me an email to let me know how well these files (and this explanation) work for you.