research / meaning of cites

I want to acknowledge that my opinion is colored by the fact that my cite counts are pretty good. [Although my cite metrics are in the same logarithmic league as those of some past American Finance Association presidents (and even a few Nobel Prize winners), they are one full order of magnitude lower than those of, say, Eugene Fama (>25,000) or Andrei Shleifer (>30,000).] Having a good cite count probably makes me like cites more than others whose cite counts look worse.

Cites are not everything. They measure intellectual impact, not other things (like depth, paper quality, collegiality, etc.). As a measure of scholarly performance, I think they are hard to beat, if only because the alternatives are even worse.

Alternative Scholarly Performance Metrics

"Read the Papers": It is often suggested that, when evaluating scholarly contributions, we should read the papers instead, and thus come to our own conclusions. Alas, I find "reading papers" to be so subjective that it is almost meaningless as an objective metric. Barring obvious errors and mistakes (and no paper is entirely flawless), who is to say that what you find interesting, I find interesting, and vice versa? On the contrary. I have found that we disagree about which papers are good and which are bad; which are interesting and which are not. This is not to claim that I do not find some papers better and others worse. I very much do. But I find that my choices are different from my colleagues', and it is often difficult to convince others to share my opinions. [One part I decidedly do not like about the "reading process" is that algebra is then often used as one of the few metrics of objective paper "quality" and "depth." This makes no sense to me. We are economists, not algebraists.]

External Letters seem completely idiotic to me. We are asking someone with no stake in our institution to tell us whether our institution should tenure a person. What are the incentives of this person? And how many letters do you have? And how often do you receive negative letters? Do they say more about the writer or more about the subject? It is also laughable that we judge external letters themselves to be good and deep if they regurgitate synopses of the papers to signal knowledge. We should already know the papers. What we do not know is the subjective assessment, especially relative to peers.

Publication journal quality seems like a reasonable metric, with pluses and minuses.

Assignment in reading lists in PhD classes at other universities seems like a reasonable metric, with pluses and minuses.

External marketability seems like a reasonable metric, with pluses and minuses. (This is especially the case for business school disciplines, where high salaries relative to the humanities and sciences can be justified only by external marketability, and not by fairness or intellectual worthiness. It is hypocrisy to argue that business school faculty should be paid more dearly than those elsewhere because our market opportunities are higher [us; convenient], but then argue that schools should not consider the marketability of individuals [them; inconvenient].)

Cites: Thus, the evaluation of cite metrics for intellectual impact is, in my mind, the analog of Churchill's quote that democracy is the worst form of government, except for all the others.

Cite metrics have flaws, but all other halfway-objective scholarly performance metrics in economics that I know of have even worse flaws. Thus, the most reasonable assessment I can think of is a weighted average of multiple badly flawed metrics, with cites playing a major but not an exclusive role.

Reasonable Cite Use

One cannot compare cite counts from Google Scholar (GS) with those of Web of Science (WoS). They are both interesting, but not comparable.

GS cite counts are always much higher.
They measure the number of cites accrued by scouring working papers and conference volumes. If paper A cites paper B, and paper A is presented in 30 conferences, then B receives 30 cites. This may be the case even if A is never published.

GS is relatively more accurate for judging among recent working papers that have not yet been published and have not yet had much chance to acquire published citations. (Another indicator of eventual impact is the journal rank. Beyond a forecast of impact, journal rank seems fairly unimportant to me.)

The ability of authors to create their own profiles for public consumption is great. This is especially useful for economists with common names (such as Chinese authors).
WoS (and Scopus) cite counts are always much lower than GS cite counts. WoS's databases are well curated and much more accurate than GS's, although not perfect. For example, for a long time, they collected only first initials, not full first names. (Is "I Welch" "Ivo Welch" or "Ian Welch"?) They also have misspellings. This happens more often if your name is "Welch" (because the citing articles may cite "Welsh" or "Welsch") or Bikhchandani (or Bikhchandhani?). Finally, in the past, they attributed paper cites only to the first author. This sucks if author order is usually alphabetic (as it is in economics) and one's name is Zingales.
WoS provides the best data for aged papers. If the paper was published 10-20 years ago, and you have (expensive) access to WoS, then please ignore GS.

Comparability, Flaws, Voting, and Weights: There are many easily correctable weaknesses of GS and WoS cites. First, cite engines should not just sum cite numbers. What I mean is that they should normalize citing links. If Paper A cites 3 papers (among them C) and Paper B cites 300 papers (among them D), currently both C and D receive 1 cite each. If cites are votes, then A currently has 3 votes to hand out, and B has 300 votes to hand out. In a democracy of papers, each paper would have one vote: C should have received 1/3 of a cite and D should have received 1/300 of a cite. Second, if paper E is famous, and paper F is not, a cite from E should matter more than one from F. This is easy to do: it is essentially the leading eigenvector of the citation matrix (or Google PageRank, if you wish).
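To make the two corrections concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how GS or WoS compute anything. The function names (fractional_cites, pagerank_scores), the damping value of 0.85, and the toy citation networks are all hypothetical choices made for this example.

```python
def fractional_cites(references):
    """Give every citing paper exactly one vote, split evenly over its references.

    references maps each citing paper to the list of papers it cites.
    """
    scores = {}
    for citing, cited_list in references.items():
        cited = set(cited_list)          # count each cited paper once per citing paper
        if not cited:
            continue
        weight = 1.0 / len(cited)        # 1/3 per reference for A, 1/300 for B
        for paper in cited:
            scores[paper] = scores.get(paper, 0.0) + weight
    return scores


def pagerank_scores(references, damping=0.85, iterations=100):
    """PageRank-style weighting: a cite from a highly ranked paper counts for more.

    Plain power iteration. The damping value is an assumed, conventional choice.
    """
    papers = sorted(set(references) | {p for cited in references.values() for p in cited})
    n = len(papers)
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in papers}
        for citing in papers:
            cited = set(references.get(citing, ()))
            # A paper that cites nothing spreads its weight evenly over all papers.
            targets = cited if cited else papers
            share = damping * rank[citing] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Toy version of the example in the text: A cites 3 papers, B cites 300.
toy = {"A": ["C", "X1", "X2"], "B": ["D"] + ["Y%d" % i for i in range(299)]}
fractional = fractional_cites(toy)
print(fractional["C"], fractional["D"])

# E is cited by three papers, F by none; G is cited by E, H is cited by F.
network = {"P1": ["E"], "P2": ["E"], "P3": ["E"], "E": ["G"], "F": ["H"]}
ranks = pagerank_scores(network)
print(ranks["G"] > ranks["H"])
```

Under raw counting, C and D would each show 1 cite, and so would G and H. The sketch separates them in the direction argued for above.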

Area Comparability: Areas are difficult to compare, and the two issues just mentioned make it even more difficult. But there is more. Think of applied econometrics. Standard error corrections (White, etc.) have received large numbers of cites, but what importance should we attribute to them? The answer is less obvious than it seems. They are indeed important. But perhaps not thousands of times more influential and important. Now compare, say, growth economics and multiple-equilibrium (sunshine) theory. Suppose growth economics papers tend to cite ten times as many papers (for more votes) as other economics papers, and there are ten times as many economists working on growth as on sunshine theory. Cite counts are then 100x higher for growth economics. But we should not just accept cite counts 100 times lower for the latter at face value. Is there a reason why there are more growth economists than sunshine theorists? If you have the world's best theorist on X, with twice as many cites as anyone else in X, and this scholar has 16 lifetime cites (a natural log of about 3) over 10 years, what do you conclude? (I have heard arguments that when we "read the papers" (see above), we could predict that the research would be published and influential in the future. Really?! I cannot judge this, and I have seen many papers in my life.)

Stupid Cite Uses

And then there are the outright stupid uses of cites. Two immediately come to mind. First, the h-index. Second, RePEc.

For the h-index, recognize that h does not rise monotonically with the number of cites: an author with far more cites can have a lower h. I would rather have one paper with a grand total of 1 million cites (h=1) than two papers with a grand total of four cites (h=2). Which author has had more intellectual impact?
RePEc should never be used, not only because the input database is weak, but because its (overall) economist metrics are idiotic. I regret to say this, because I admire that RePEc is an open-source project. RePEc scrambles any reasonable relative comparisons beyond the first 20 authors. First, their overall ranking uses many rank metrics based on h-index variations. OK. Bad enough. But then they add "pedigree" rankings. This input is really just based on the professional network (links). It is interesting, but in my mind it is not an intellectual accomplishment that should flow into rankings of economists.
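The h-index comparison above is easy to check. The following is a minimal sketch of the standard h-index definition (the largest h such that h papers have at least h cites each). The function name h_index and the two hypothetical authors are assumptions made for this illustration.

```python
def h_index(cites_per_paper):
    """Return the largest h such that h papers have at least h cites each."""
    ranked = sorted(cites_per_paper, reverse=True)
    h = 0
    for position, cites in enumerate(ranked, start=1):
        if cites >= position:
            h = position
        else:
            break
    return h


# Hypothetical author 1: one paper with a million cites.
one_blockbuster = [1_000_000]
# Hypothetical author 2: two papers with two cites each (four cites in total).
two_minor_papers = [2, 2]

print(h_index(one_blockbuster), sum(one_blockbuster))
print(h_index(two_minor_papers), sum(two_minor_papers))
```

The first author has 250,000 times as many cites as the second, yet the lower h.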