The Limits of Research

Jonah Lehrer’s dazzling essay, The Decline Effect and the Scientific Method, has been the most frequently e-mailed article from the New Yorker since it was published last month. This pleases me a great deal, for Lehrer describes one of the most serious problems in social science research, especially in psychology, and, as he points out, in the natural sciences as well.

In a nutshell, the problem is the difficulty of replicating experimental findings. A derivative problem is the frequently observed decline in the measured effect of a variable with repeated testing. Replication is the heart of scientific research. That is the purpose of making findings public: investigators in other labs can then see whether they observe the same outcome. Lehrer writes:

“Different scientists in different labs need to repeat the protocols and publish their results. The test of replicability, as it is known, is the foundation of modern research. Replicability is how the community reinforces itself.”

Lehrer cites many examples of the failure to replicate. He notes how often the therapeutic power of a drug wanes on repeated testing and then eventually disappears or turns out, on analysis, to be largely a function of the placebo effect. He discusses at some length the difficulty of replicating the phenomenon of verbal overshadowing: how the act of describing memories seems, or at one time seemed, to interfere with recall.

And he cites the well-known failure to replicate the early studies by Joseph Rhine on extrasensory perception (ESP), whose results turned out to be a statistical fluke. It was Rhine who coined the term “the decline effect.”

The failure to replicate has even been observed in a series of studies in which the same genetic strains of mice were shipped to three different labs on the same day from the same supplier and raised under identical conditions, including the way they were handled. The investigator who reported the widely different results from each lab concluded that a lot of scientific data are nothing but noise. “The end result is a scientific accident that can take years to unravel.”

What accounts for the fact that so many research studies cannot be replicated? Several factors are at work. Proper control conditions may have been omitted from the original experiments. The samples may not have been randomly selected, or may have consisted of a highly uniform, unrepresentative group of individuals, usually college sophomores. Or the results may have occurred because of experimenter biases that inevitably led to evidence supporting the hypothesis. Few experimenters really design studies to disprove, rather than confirm, their hypotheses. This is a point Karl Popper emphasized many years ago.

Then there is the publication bias characteristic of most scientific journals. Researchers who do not report positive outcomes find it hard to get their findings published. According to one study, ninety-seven percent of published psychology studies confirmed their hypotheses. We know that so high a success rate cannot reflect all the studies actually conducted. As one investigator (Richard Palmer, a biologist) noted, “Once I realized that selective reporting is everywhere in science, I got quite depressed.” And he continued:

“We cannot escape the troubling conclusion that some, perhaps many, cherished generalities are at best exaggerated in their biological significance and at worst a collective illusion nurtured by strong a-priori beliefs often repeated.”
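One way to see how selective reporting alone could produce something like a decline effect is a small simulation. The sketch below is my own illustration, not anything taken from Lehrer's essay or Palmer's work. Every number in it is an assumption chosen for the example: a small true effect of 0.2 standard deviations, 20 subjects per group, a known standard deviation of 1, and a rule that only results with z above 1.96 get published. Each published study is then replicated once, and the replication is reported whatever it shows.

```python
import random
import statistics


def run_study(true_effect, n, rng):
    """Simulate one two-group study; return (observed effect, z statistic)."""
    treatment = [rng.gauss(true_effect, 1.0) for _ in range(n)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    diff = statistics.mean(treatment) - statistics.mean(control)
    # Standard error of a difference of two means, assuming the
    # standard deviation is known to be 1 in both groups.
    se = (2.0 / n) ** 0.5
    return diff, diff / se


def simulate(true_effect=0.2, n=20, studies=2000, seed=1):
    """Return (mean published effect, mean replication effect, number published)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    published, replications = [], []
    for _ in range(studies):
        effect, z = run_study(true_effect, n, rng)
        if z > 1.96:  # assumed filter: only "significant" results are published
            published.append(effect)
            # One independent replication, reported regardless of outcome.
            rep_effect, _ = run_study(true_effect, n, rng)
            replications.append(rep_effect)
    return (statistics.mean(published),
            statistics.mean(replications),
            len(published))


if __name__ == "__main__":
    pub, rep, count = simulate()
    print(f"published studies: {count}")
    print(f"mean published effect:   {pub:.2f}")
    print(f"mean replication effect: {rep:.2f}")
```

Under these assumptions the published studies must overstate the effect, because a result can only clear the cutoff by being large. The replications, which pass through no such filter, drift back toward the true value. Nothing about the underlying effect has changed; only the reporting has.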

Lehrer has written a powerful, persuasive essay on the limits of scientific research. It is the kind of essay I wish I had written, as I have been aware of this problem for years, based on a lifetime of reading research in psychology as well as my own studies. I have been guilty of looking for data that confirm my hypotheses, interpreting outcomes in a way that will ensure their publication, and no doubt biasing (conservatively) the design of my studies without really trying to test (falsify) those hypotheses.

But I didn’t write the essay and no doubt could never have done so with the skill and breadth that Lehrer has. I commend him for his piece, as well as the New Yorker for publishing it and thereby making possible its wide dissemination.


Stefanie said...

This reminds me a little of quantum physics, where once you observe a particle it has been changed, so you can never truly get an exact measure: probability theory and Schrödinger's cat and all that. I suppose there is no similar phenomenon in the social sciences that can account for why it is so difficult to replicate studies?

Richard Katzev said...

Yes, I think there is. There is measurement error in all sciences. There are procedural differences, often subtle, between experiments. There are biases and experimenter effects that also account for failures to replicate. And then there are the weather, the clouds, and the wind, which are always doing their work.