David Levy and Susan Feigenbaum worried a lot about this in "The Technological Obsolescence of Scientific Fraud". Where investigators have preferences over outcomes, it's possible to achieve those outcomes through judicious choice of identifying restrictions or method - especially since there are lots of line calls about which techniques to use in which cases. They note that outright fraud makes results non-replicable, while biased research winds up instead being fragile - the relationships break down when people change the set of covariates, or the time period, or the technique.
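To make the fragility point concrete, here's a small Monte Carlo sketch - my own illustration, not anything from Levy and Feigenbaum. A "researcher" facing no true effect searches over sets of control variables until the coefficient of interest looks significant, and we then check whether the winning specification survives a fresh sample. Every parameter here (ten candidate controls, subsets of up to three, n = 200, the .05 threshold) is an arbitrary assumption made for illustration.

import numpy as np
from itertools import combinations
from scipy import stats

rng = np.random.default_rng(0)

def ols_pvalue(y, X, col):
    # Two-sided p-value for the coefficient on column `col` in an OLS of y on X.
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    cov = (resid @ resid / (n - k)) * np.linalg.inv(X.T @ X)
    t = beta[col] / np.sqrt(cov[col, col])
    return 2 * stats.t.sf(abs(t), df=n - k)

def search_specifications(x, Z, y):
    # Try every control-variable subset of size <= 3 and keep whichever one
    # makes the coefficient on x look most "significant".
    best_p, best_cols = np.inf, []
    for r in range(4):
        for cols in combinations(range(Z.shape[1]), r):
            X = np.column_stack([np.ones(len(y)), x, Z[:, list(cols)]])
            p = ols_pvalue(y, X, col=1)
            if p < best_p:
                best_p, best_cols = p, list(cols)
    return best_p, best_cols

n, reps, hits, survives = 200, 300, 0, 0
for _ in range(reps):
    # x has NO true effect on y; Z holds ten irrelevant candidate controls.
    x, Z, y = rng.normal(size=n), rng.normal(size=(n, 10)), rng.normal(size=n)
    p, cols = search_specifications(x, Z, y)
    if p < 0.05:
        hits += 1
        # Re-estimate the "winning" specification on a fresh draw (a new period, say).
        x2, Z2, y2 = rng.normal(size=n), rng.normal(size=(n, 10)), rng.normal(size=n)
        X2 = np.column_stack([np.ones(n), x2, Z2[:, cols]])
        if ols_pvalue(y2, X2, col=1) < 0.05:
            survives += 1

print(f"search over specifications yields 'significance' in {hits / reps:.0%} of samples")
print(f"of those findings, {survives / max(hits, 1):.0%} hold up in a fresh sample")

The search turns up "significant" specifications more often than the nominal 5%, while the winning specifications survive re-estimation on new data only about as often as chance would predict - the fragility signature Levy and Feigenbaum describe.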
Note that none of this has to come through financial corruption either: simple publish-or-perish incentives are enough when journals are more interested in publishing significant results than insignificant ones; DeLong and Lang jumped up and down about this twenty years ago. Ed Leamer made similar points even earlier (recent podcast). And then there's all the work by McCloskey.
Thomas Lumley today points to a nice piece in Psychological Science demonstrating the point.
From the abstract:

"In this article, we accomplish two things. First, we show that despite empirical psychologists' nominal endorsement of a low rate of false-positive findings (≤ .05), flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates. In many cases, a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not. We present computer simulations and a pair of actual experiments that demonstrate how unacceptably easy it is to accumulate (and report) statistically significant evidence for a false hypothesis. Second, we suggest a simple, low-cost, and straightforwardly effective disclosure-based solution to this problem. The solution involves six concrete requirements for authors and four guidelines for reviewers, all of which impose a minimal burden on the publication process."

Degrees of freedom available to the researcher make it "unacceptably easy to publish 'statistically significant' evidence consistent with any hypothesis." They demonstrate the point by proving statistically that hearing "When I'm Sixty-Four" rather than a control song made people a year-and-a-half younger.
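For a sense of how quickly those degrees of freedom add up, here's a minimal simulation in the spirit of (but not taken from) the paper: no true effect, two correlated outcome measures plus their average all get tested, and if nothing is significant the researcher collects a few more observations per cell and tries again. The specific settings - r = .5 between the outcomes, 20 observations per cell, a single top-up of 10 - are my assumptions, loosely echoing the scenarios the authors describe.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def flexible_study(n_per_cell=20, rho=0.5, extra=10):
    # One simulated two-group experiment with NO true effect, analysed "flexibly":
    # test two correlated outcome measures and their average, and if nothing comes
    # out significant, collect `extra` more observations per cell and test again.
    cov = [[1.0, rho], [rho, 1.0]]
    draw = lambda n: rng.multivariate_normal([0.0, 0.0], cov, size=n)
    a, b = draw(n_per_cell), draw(n_per_cell)
    for _ in range(2):                   # initial analysis, then one optional top-up
        outcomes_a = [a[:, 0], a[:, 1], a.mean(axis=1)]
        outcomes_b = [b[:, 0], b[:, 1], b.mean(axis=1)]
        ps = [stats.ttest_ind(x, y).pvalue for x, y in zip(outcomes_a, outcomes_b)]
        if min(ps) < 0.05:
            return True                  # a "significant" result gets written up
        a = np.vstack([a, draw(extra)])  # optional stopping: add more data, retry
        b = np.vstack([b, draw(extra)])
    return False

sims = 5_000
rate = sum(flexible_study() for _ in range(sims)) / sims
print(f"simulated false-positive rate with flexible analysis: {rate:.1%} (nominal: 5.0%)")

Even this modest amount of flexibility pushes the error rate well above the nominal 5%, and stacking further choices - covariates, subsetting, dropping conditions - pushes it higher still.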
The lesson isn't radical skepticism of all statistical results, but rather a caution against overreliance on any one finding and an argument for discounting findings coming from folks whose work proves to be fragile.