Wednesday, August 25, 2010

Paper: Many climate science papers misuse statistics

A paper published today in the Journal of Climate finds that "a large fraction" of papers in the climate science literature misuse tests of statistical significance. While the author did not examine any of the repeatedly debunked tests of statistical significance in the hockey stick literature of Michael Mann & coauthors, he
"tested a recent, randomly selected issue of The Journal of Climate for at least one such misuse of significance tests in each article. The Journal of Climate was not selected because it is prone to include such errors but because it can safely be considered to be one of the top journals in climate science. In that particular issue we observed a misuse of significance tests in 14 out of 19 articles. A randomly selected issue of ten years before showed such misuse of significance tests in 7 out of 13 articles. These two samples perhaps would not pass a traditional significance test, but they do indicate that such errors occur in the best journals with the most careful writing and editing. Indeed, in one of this author’s papers such erroneous use occurred."
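The counts in the quoted passage can be checked directly. The sketch below converts them to percentages and, purely as an illustration, runs a two-sided Fisher exact test on the two issues. The choice of Fisher's test and the helper name `fisher_exact_two_sided` are assumptions made here, not something taken from the paper, which only remarks that the samples "perhaps would not pass a traditional significance test".

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k "misuse" articles in the first issue,
        # with all row and column totals held fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of every table at least as unlikely as the observed one.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))

# Counts quoted from the paper: articles with at least one misuse / articles checked.
recent_misuse, recent_total = 14, 19
older_misuse, older_total = 7, 13

recent_share = recent_misuse / recent_total
older_share = older_misuse / older_total
p_value = fisher_exact_two_sided(
    recent_misuse, recent_total - recent_misuse,
    older_misuse, older_total - older_misuse,
)

print(f"recent issue: {recent_share:.0%}, older issue: {older_share:.0%}")
print(f"two-sided Fisher exact p-value: {p_value:.2f}")
```

With samples this small, the p-value comes out well above the conventional 0.05 threshold, which is consistent with the author's own caveat about the two samples.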
So, 74% of the articles (14 of 19) in a recent issue of a top climate science journal misused tests of statistical significance, compared to only 54% (7 of 13) in an issue from 10 years before. Thus, one might surmise that there is a trend of unprecedented, record high misuse of statistics in the field of climate science. As stated by Edward Wegman, PhD in mathematical statistics,
"As statisticians, we were struck by the isolation of communities such as the paleoclimate community that rely heavily on statistical methods, yet do not seem to be interacting with the mainstream statistical community. The public policy implications of this debate are financially staggering and yet no apparently independent statistical expertise was sought or used."
And as Clive Crook also put it well in his Atlantic Monthly article,
"Climate scientists lean very heavily on statistical methods, but they are not necessarily statisticians. Some of the correspondents in these emails appear to be out of their depth. This would explain their anxiety about having statisticians, rather than their climate-science buddies, crawl over their work."
Wikipedia: "Lies, damned lies, and statistics" is a phrase describing the persuasive power of numbers, particularly the use of statistics to bolster weak arguments...

Significance Tests in Climate Science
Maarten H. P. Ambaum, Department of Meteorology, University of Reading, United Kingdom

Abstract: A large fraction of papers in the climate literature includes erroneous uses of significance tests. A Bayesian analysis is presented to highlight the meaning of significance tests and why typical misuse occurs. The significance statistic is not a quantitative measure of how confident we can be of the ‘reality’ of a given result. It is concluded that a significance test very rarely provides useful quantitative information.
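The abstract's point that a significance level is not a measure of confidence in a result can be illustrated with Bayes' theorem. The sketch below is not the paper's analysis: the function name `posterior_null`, the 50% power figure, and the two priors are hypothetical values chosen only to show that P(significant result | null hypothesis) and P(null hypothesis | significant result) are different quantities.

```python
def posterior_null(prior_null, p_data_given_null, p_data_given_alt):
    """P(H0 | data) from Bayes' theorem for two competing hypotheses."""
    evidence = (prior_null * p_data_given_null
                + (1 - prior_null) * p_data_given_alt)
    return prior_null * p_data_given_null / evidence

alpha = 0.05  # P(significant result | H0): the significance threshold
power = 0.50  # hypothetical P(significant result | H1), assumed for illustration

for prior in (0.5, 0.9):  # hypothetical prior probabilities that H0 is true
    post = posterior_null(prior, alpha, power)
    print(f"prior P(H0) = {prior:.1f} -> P(H0 | significant result) = {post:.2f}")
```

Under these assumed numbers, the probability that the null hypothesis is still true after a "significant" result depends strongly on the prior and is not equal to the 5% threshold in either case.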

See also the article published today in, of all places, Mother Jones, illustrating more flagrant misuses of statistics in the field of climate science.


  1. Finally - Ambaum 2010 has discovered a clear correlation between atmospheric CO2 concentration and something. A rise of 5.2% (from 368.14 to 387.35) between 1999 and 2009 has clearly forced an increase of 20% in the frequency of statistical malpractice or incompetence among climate scientists. How has this been allowed to happen?
    I think we should be told.

  2. Hi,

    I am the author of the paper. Thanks for your interest and for your posting.

    I need to add a disclaimer to those using this paper to bolster a climate-sceptic cause: this paper provides no evidence whatsoever that climate science is flawed. It highlights a frequently occurring, technical misuse of significance tests (essentially, the error of the transposed conditional - read the paper if you want to know more).

    It is not clear to what extent this misuse has muddied the waters of discussions in climate science, although I admit there may be a risk: significance thresholds are being used to give an air of quantified credibility to certain statistical results. The significance test clearly quantifies something, but it does not quantify the credibility of a result.

    Maarten Ambaum