Wednesday, November 22, 2006

The mismeasure of merit

Fielding error
From Moneyball to The Blind Side:

The educational gospel according to Michael Lewis

A MoneyLaw series
The long-awaited release of the 2006 updates to John Doyle's law review citation database, including a comprehensive methodological overhaul, has sparked a flurry of commentary within the Jurisdynamics Network and the larger world of law-related blogging. On this forum and at Tax Law Prof, Paul Caron offers a comprehensive assessment of the Doyle database's new "combined" measure (which blends impact factor and raw citations). Paul also ponders the discrepancies between Doyle's rankings and the U.S. News rankings. Dan Markel of PrawfsBlawg takes aim at journals not addressed by Paul, especially Harvard's high-impact "secondary" journals. Even First Movers, this forum's student-written affiliate, has addressed the issue, albeit more skeptically.

John Doyle's magnificent database, already the subject of much admiration at MoneyLaw, represents a matter of fascination for this forum's contributors. The new combined measure pays homage to the work of MoneyLaw contributor Ronen Perry. MoneyLaw's own Al Brophy is perhaps the greatest proponent of the idea that law review rankings can and should serve as a measure of law school quality.

For my part, John Doyle's new numbers will require me to revise the raw data underlying my paper, Modeling Law Review Impact Factors as an Exponential Distribution. The basic mathematical properties of law review impact factors, however, remain intact -- and seductively beautiful. I exhort you to download my paper: you will rarely see such tight goodness-of-fit results, let alone anything as stunning as the Gini coefficient of the distribution model for law review impact factors.
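For readers curious about the Gini coefficient mentioned above: an exponential distribution has a Gini coefficient of exactly 1/2, whatever its rate parameter. The sketch below simulates illustrative "impact factor" data (invented for demonstration, not Doyle's actual figures) and recovers that value empirically, using the standard sorted-index formula for the empirical Gini coefficient.

```python
import random


def gini(values):
    """Empirical Gini coefficient via the sorted-index formula:
    G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    where the x_i are sorted ascending and i runs from 1 to n."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n


# Simulated "impact factors" drawn from an exponential distribution.
# (Illustrative data only; the rate parameter is arbitrary.)
random.seed(42)
samples = [random.expovariate(1.0) for _ in range(50_000)]

# For any exponential distribution the theoretical Gini is exactly 0.5,
# so the empirical value on a large sample should land very close to it.
print(round(gini(samples), 2))
```

The invariance of the Gini coefficient under the choice of rate parameter is part of what makes the exponential model so tidy: concentration of citation "wealth" across journals is fixed by the distribution's shape alone.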

Although Paul, Ronen, Al, and I will undoubtedly revisit this issue in greater depth in future posts, the release of the Doyle database's new numbers provides an apt occasion for pondering precisely why law review citation statistics matter to MoneyLaw and the search for ways to win this unfair academic game. Baseball, not surprisingly, provides a basis for comparison and perhaps even the inspiration for a new set of measurements of academic merit.

Existing measures of academic rank are notoriously subjective. In its purest, most extreme manifestation, MoneyLaw rejects any "academic ratings system [that] depends in whole or in part on a subjective survey of academic reputation." Likewise, baseball is saddled with any number of misleading statistics. But none may be more misleading than the fielding error.

Official scoring based on errors is doubly destructive. First, it fails to charge errors for obvious mental mistakes, such as the failure to cover a base. Second, the very act of assigning an error punishes the fielder who has already done half or more of his work: only a nimble and alert player can get close enough to the ball to be charged with an error in the first place.

Baseball not only fails to measure errors properly; it has no official way of disentangling superlative pitching from mediocre fielding. Sabermetrician Voros McCracken has discovered that Major League pitchers have almost no control over the rate of hits against them on balls hit into play. Except when batted balls become home runs (which themselves must be considered in light of park-specific differences), whether a ball hit into play becomes a hit or an out depends on fielding rather than pitching prowess.

Thus arose McCracken's signature contribution to sabermetrics: defense independent pitching statistics. These so-called DIPS measure a pitcher's effectiveness based strictly on plays that do not involve fielders: home runs, strikeouts, walks, and hit batters. DIPS are powerful; smart general managers in baseball know to place more trust in DIPS than in conventional, fielding-influenced measures such as earned run average.
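McCracken's full DIPS computation is elaborate, but a widely used simplification in the same family is Fielding Independent Pitching (FIP), which scores a pitcher on an ERA-like scale using only the fielder-free outcomes listed above. The sketch below uses FIP's conventional weights (13, 3, and -2, attributed to Tom Tango); the additive constant is league- and year-specific, so a round illustrative value is assumed here, and the two stat lines are invented for demonstration.

```python
def fip(hr, bb, hbp, so, ip, league_constant=3.10):
    """Fielding Independent Pitching: an ERA-scale estimate built only
    from outcomes that involve no fielders -- home runs, walks, hit
    batters, and strikeouts. The league constant (assumed 3.10 here)
    calibrates FIP to the league's earned run average."""
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + league_constant


# Two hypothetical pitchers over 220 innings (stat lines invented):
# a high-strikeout, low-walk pitcher vs. a contact-heavy one.
print(round(fip(hr=18, bb=45, hbp=5, so=210, ip=220.0), 2))
print(round(fip(hr=25, bb=80, hbp=8, so=120, ip=220.0), 2))
```

Two pitchers with identical ERAs can thus diverge sharply on FIP, and the divergence isolates what the pitcher himself controlled: exactly the filtering a RIPS-style academic metric would aim to perform on reputation.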

The quest, therefore, is to find the academic equivalent of DIPS. Surely there exists some combination of statistics that can measure differences in actual performance among law schools, while minimizing the baneful and distorting effects of subjective judgments regarding merit. It may be premature, but I like the idea of Reputation Independent Performance Statistics, or RIPS. Law review impact factors and citation counts, to be sure, operate under the heavy influence of academic reputation. But unlike the U.S. News and Leiter rankings, law review impact factors and citation counts do not depend on some sort of survey. They are incredibly hard to influence, let alone to "fake," absent an extraordinary (and therefore laudable) scholarly effort by a faculty. Those who object to treating these measures as "reputation independent" might be willing to accept a less ambitious description, such as "minimally subjective," or "minsub" for short. These are strictly questions of nomenclature. The bottom line remains the same. Until someone persuades me to the contrary, I am prepared to declare that some metric based on law review impact factors and/or citation counts should be included in any effort to compute RIPS in legal academia.
