
August 30, 2005

Reliability in the currency of ideas

The grist of the scientific mill is publications - these are the currency that academics use to prove their worth and contributions to society. When I first dreamt of becoming a scientist, I rationalized that while I would gain less materially than in certain other careers, I would be contributing to society in a noble way. But what happens to the currency when its reliability is questionable, when the noblesse is in doubt?

A recent paper in the Public Library of Science (PLoS) Medicine by John Ioannidis discusses "Why most published research findings are false" (New Scientist has a lay-person summary available). While Ioannidis is primarily concerned with results in medicine and biochemistry, his criticism of experimental design, experimenter bias and scientific accuracy likely applies to a broad range of disciplines. In his own words,

The probability that a research claim is true may depend on the study power and bias, the number of other studies on the same question, and, importantly, the ratio of true to no relationships among the relationships probed in each scientific field.
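The dependence described in this quote can be made concrete with a small sketch. It uses the positive predictive value (PPV) expression from Ioannidis's paper, where R is the pre-study odds of true to no relationships, alpha is the significance threshold, 1 - beta is the study power, and u is the fraction of analyses that get reported as positive purely because of bias. The function name `ppv`, its default values and the example inputs are my own illustrative assumptions, not anything prescribed by the paper.

```python
def ppv(R, alpha=0.05, power=0.8, bias=0.0):
    """Post-study probability that a claimed (significant) finding is true.

    R     -- pre-study odds of true to no relationships in the field
    alpha -- significance threshold (type I error rate)
    power -- 1 - beta, the chance of detecting a true relationship
    bias  -- u, fraction of otherwise-negative analyses reported as positive
    """
    beta = 1.0 - power
    u = bias
    # Claimed findings that come from true relationships
    true_positives = (1 - beta) * R + u * beta * R
    # All claimed findings, true or not
    all_positives = R + alpha - beta * R + u - u * alpha + u * beta * R
    return true_positives / all_positives


# Illustrative (assumed) scenarios, not data from any real field:
print(round(ppv(R=1.0), 2))             # well-powered, even odds, no bias -> 0.94
print(round(ppv(R=1.0, bias=0.10), 2))  # same study with modest bias      -> 0.85
print(round(ppv(R=0.1, power=0.2), 2))  # underpowered, long-shot odds     -> 0.29
```

Even with the conventional 1-in-20 threshold held fixed, the last case shows how a claimed finding can be more likely false than true once power and pre-study odds are low.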

Ioannidis argues that the current reliance upon the p-value in only one direction, i.e., asking only whether the chance of observing such data under the null hypothesis is less than some threshold (typically 1 in 20), sets a dangerous precedent, as it ignores the influence of research bias (from things such as finite-size effects, hypothesis and test flexibility, pressure to publish significant findings, etc.). He goes on to argue that scientists are often careless in ruling out potential biases in data, methodology and even the hypotheses tested, and that replication by independent research groups is the best way of validating research findings, as it constitutes the most independent kind of trial possible. That is, confirming an already published result is at least as important as the original finding itself. Yet he also argues that even then, significance may simply represent broadly shared assumptions.

... most research questions are addressed by many teams, and it is misleading to emphasize the statistically significant findings of any single team. What matters is the totality of the evidence. Diminishing bias through enhanced research standards and curtailing of prejudices may also help. However, this may require a change in scientific mentality that might be difficult to achieve.

In the field of complex systems, where arguably there is a non-trivial amount of pressure to produce interesting and, pardon the expression, universal results, Ioannidis's concerns seem particularly relevant. Without beating the dead horse of finding power laws everywhere you look, shouldn't we who seek to explain the complexity of the natural (and man-made) world through simple organizing principles be held to exacting standards of rigor and significance? My work as a referee leads me to believe that my chosen field has insufficiently indoctrinated its practitioners as to the importance of experimental and methodological rigor, and of not over-generalizing or over-stating the importance of one's results.

Ioannidis, J. P. A. (2005) "Why most published research findings are false." PLoS Medicine 2(8): e124.

posted August 30, 2005 10:13 AM in Simply Academic