27 July 2009

How not to analyze climate data

Preface
The paper that prompts this post (and the preceding Introduction to time series analysis) is McLean, de Freitas, and Carter, 2009. A reader suggested, in email, that I take a look. I'll recommend that to others as well. I won't carry out all suggestions, not least because I don't know all areas well enough to comment, but they are indeed welcome, and they do at times result in a post here. There'll be some following notes, as this paper opens several issues. For now, I'll stay with just the paper.

Comments have already appeared at OpenMind, Initforthegold, and Realclimate. In a fundamental sense, I won't be adding anything new. But the approach will differ, and might show some features in ways that you might have missed in the comments over there. For instance, I mentioned the crucial bit that I'll be exploring here in a comment at Initforthegold, and Michael missed its significance on first reading. The fundamental point was staring him in the face, but fundamentals aren't always easy to notice. When he did notice it, it was 'forehead slap' time.

I've tagged this 'doing science' and 'weeding sources', as well as 'climate change'. Some issues of peer review will show up, as will a flag or two of mine which I find useful in weeding sources. The nominal topic of the paper, "Influence of the Southern Oscillation on tropospheric temperature", is climate change. Recently I posted about scientific specificity. While it's entirely true that it doesn't work well to take that line in daily life, it's exactly what one should do with a scientific paper. One thing it means is that we keep an eye on whether the data, as they are actually used, support the argument that is made.

Begin
We start by reading the abstract. As a matter of doing science, the abstract usually makes the most eye-catching statements in the paper. It is the advertising section of the paper, so to speak. You want to say something here that will interest other scientists and get them to read your brilliant work. In this case, "That mean global tropospheric temperature has for the last 50 years fallen and risen in close accord with the SOI of 5–7 months earlier shows the potential of natural forcing mechanisms to account for most of the temperature variation."

SOI is the Southern Oscillation Index. It provides a number that is connected to the El Nino-Southern Oscillation (ENSO), which can then be used for further research, such as this paper. There are different ways of defining an SOI, which might be an issue if the effects the authors were working with were fairly subtle. But, as they are referring to explaining 68-81% of the variance (figure depends on which records are matched, and how large the domain examined is), we've left the realm of subtle. As the authors duly cite, there's nothing new in seeing a correlation between SOI and global mean temperatures. This is well-known. What is new is the extraordinarily high correlations they find, and that eye-catching conclusion that most of temperature variation for the last 50 years is driven by SOI.

For atmospheric temperatures, they use the UAH lower tropospheric sounding temperatures (paragraph 5) and for SOI, they use the Australian Bureau of Meteorology's index (para 7). If the abstract were an accurate guide, we'd expect that with those two time series in hand, they computed the correlations and found those very high percentages of variance explained. Or at least that they were that high with the noted 5-7 month lag. And here's where we get to the time series analysis issue that I was introducing Friday.

Three different things are done to the data sets before computing the correlations. One is to exclude certain time spans for being contaminated by volcanic effects on the temperatures. No particular time series analysis issue there. But the other two both have marked effects on time series. The first (para 10) is to perform a 12 month running average. This, as I discussed Friday, mostly suppresses effects that are 1 year and shorter in period. The second is to take the difference between those means, 12 months apart (paragraph 14). As I described on Friday, this suppresses long term variation and enhances short term variation. The authors assert that this removes noise, while, in fact, it amplifies noise (the high frequency/short period components of the record). Alternatively, they are defining 'noise' to be the long period part of the records -- the climate portion of the record.
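For concreteness, here's a minimal sketch of the two operations just described -- my own code, not the paper's, assuming monthly data in a plain NumPy array:

```python
import numpy as np

def running_mean_12(x):
    """12-point (12-month) running average; output is 11 points shorter."""
    return np.convolve(x, np.ones(12) / 12.0, mode="valid")

def difference_12(x):
    """Difference between values 12 months apart; output is 12 points shorter."""
    return x[12:] - x[:-12]

def both_filters(x):
    """The paper's processing: smooth, then difference the smoothed series."""
    return difference_12(running_mean_12(x))
```

One quick sanity check: feed in a pure linear trend (say `np.arange(100.0)`, slope 1 per month) and `both_filters` returns a constant 12.0 everywhere -- the trend itself is gone, which is exactly the long-period suppression at issue.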

The combined effect of the two filters is that both the high frequency and the low frequency parts of the records are suppressed. What is left is whatever portion of the two records lie in the mid-range frequencies. To return to my music analogies, what has been done is to set your equalizer in a V shape, with the highest amplitudes in mid-range. While the result has a connection to the original data, it is certainly no longer fair to say, as the authors do in the abstract, that their correlations are between SOI and temperatures.

Demonstration of filter effects -- sample series
The next four figures show a) the original time series, which I constructed by adding up some simple periodic functions, b) the 12 month running average version, c) the 12 month differencing of the original data, and d) applying both filters as the authors did (minus volcanoes).

Original



12 Month Smoothing



12 Month Differencing



Both Filters



As expected, the running average smoothed out the series. In music terms, it suppressed the treble; that's the job of an averaging filter. The differencing made for a much choppier series than the original. That, too, can be desirable. But certainly the authors' comment about 'removing noise' is ill-founded. If we look at the variance in the time series, the original has a variance of 4.25. The running average decreased that to 2.69 (eliminating 37% of the variance). The differencing increased the variance 50%, to 6.47 (again, increased variance means more noise). Applying both filters produces the final figure, which has little resemblance to the original series. Not least, while the original looks to have a substantial amplitude at a period of 30 years (that appearance is entirely correct -- I put in a 30 year period), the final product shows no sign whatever of the 30 year period. That is one of the jobs of a differencing filter -- removing the long period contributions. The filters have also suppressed the 15 year period that I put in, and, in general, turned my original series, which had equal contributions at 5 months and 1, 2, 3, 5, 7, 10, 15, and 30 years, into something that looks mostly like a 3 year period (count the peaks and divide that into the time span) with a bit of noise.
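A hedged reconstruction of this experiment is below. The periods are the ones listed above, but the amplitudes, phases, and record length are my assumptions, so the variance numbers will come out near, not equal to, the ones quoted:

```python
import numpy as np

months = np.arange(600)  # 50 years of monthly data (record length is my choice)
# Periods from the text: 5 months and 1, 2, 3, 5, 7, 10, 15, 30 years.
periods = [5, 12, 24, 36, 60, 84, 120, 180, 360]  # in months

# Equal unit amplitudes; the phases (all zero) are my choice, not the original's.
series = sum(np.cos(2 * np.pi * months / p) for p in periods)

smoothed = np.convolve(series, np.ones(12) / 12.0, mode="valid")  # 12 month running mean
differenced = series[12:] - series[:-12]                          # 12 month differencing
both = smoothed[12:] - smoothed[:-12]                             # the paper's combination

for name, x in [("original", series), ("smoothed", smoothed),
                ("differenced", differenced), ("both", both)]:
    print(f"{name:12s} variance {x.var():.2f}")
```

Run it and the same qualitative pattern appears: smoothing removes variance, differencing adds it, and the 15 and 30 year contributions are essentially absent from the 'both' series.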

Filter effects on SOI series
That was a warm up with a test series, where we know that there are no data problems of any sort, and we know exactly what went in. The real data of course have problems (this is always true, and is one of the aspects of doing science), but they may not have problems that affect our conclusions. The next figure shows the Australian SOI after smoothing (12 month running averages again) and then differencing (as in the paper); the curve is labelled 'both' because both the averaging and the differencing were applied to the original data. (Note that I'm not showing the full curve, only 1950 to present instead of 1879 to present -- the paper's analyses covered, at most, 1958-2008.)



You see that with both filters applied there are new peaks, missing peaks, and even the sign of the index can change (positive for negative, or vice versa). These are all signs that the filters have fundamentally altered the data set, so that whatever conclusion is drawn can only be drawn about 'data as processed by this filter', not the original data -- in contradiction to the statements in the paper and elsewhere by the authors that it is SOI that explains an extremely high portion of the variation in global mean temperature. Further, since the correlation is largely driven by the peaks, the high correlations can be largely a matter of how the filter creates or destroys peaks, rather than of the underlying data.

Response function
I mentioned Friday the amplitude spectrum -- showing the amplitudes of the contributions from each period. Filters change the amplitude spectrum; that's their job. One thing you can do to describe a filter, then, is divide the amplitude at each period after processing by the amplitude beforehand (the result is known as the response function). An ideal filter will show a 1 at all periods except the ones you're trying to get rid of, where it will show a 0. Real filters don't accomplish this, but that's the goal. So, to see the performance of the authors' filter, I took their original SOI series, processed it through their filter, and then found the response function in this way. Those are the next figures. The first looks at cycles per year (frequency), letting us see clearly what happens at high frequencies. The second looks at the period (from 1-15 years).
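As an aside: for a linear filter there is a way to get the response function without dividing one noisy spectrum by another -- filter a unit impulse and take the FFT of the result. This sketch (my construction, applied to an idealized version of the smooth-then-difference filter, not to the SOI data) does exactly that:

```python
import numpy as np

n = 1080  # 90 years of monthly data, chosen so ENSO-band frequencies land on FFT bins

impulse = np.zeros(n)
impulse[0] = 1.0

# 12-month running mean of the impulse, then the 12-month difference of the result.
smoothed = np.convolve(impulse, np.ones(12) / 12.0)[:n]
h = smoothed.copy()
h[12:] -= smoothed[:-12]

# For a unit impulse the input spectrum is 1 everywhere, so the amplitude
# response is just the magnitude of the FFT of the filtered impulse.
response = np.abs(np.fft.rfft(h))
freqs = np.fft.rfftfreq(n, d=1.0 / 12.0)  # in cycles per year (monthly sampling)

peak = np.argmax(response)
print(f"largest response {response[peak]:.2f} at a {1.0 / freqs[peak]:.1f} year period")
```

With this idealized filter the response comes out near zero at 1 cycle per year, small at multi-decade periods, and above 1 in a band of a few years -- the same shape as the figures above, without the divide-by-small-numbers spikes.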

Frequency Response Function



Period Response Function





There are some spikes in the curves, which have nothing to do with the filter. All that is happening is that these are periods/frequencies which have little signal in the original series, so numerical processing issues can have large effects there (dividing by small numbers is hazardous). But the smooth curve is a fair description. The averaging filter suppresses the signal (response is close to 0) at frequencies of 1, 2, 3, 4, 5, and 6 cycles per year. (With monthly data, 6 cycles per year -- a 2 month period -- is the highest frequency that can be analyzed.) The differencing filter also suppresses the very low frequencies (long periods), as we expected even with just the basic introduction from Friday. But take a look between 1.5 and 7 years. The response is greater than 1 -- the output is larger than the input! Look, too, at the periods which are being amplified. A usual description of ENSO is 'an oscillation with a period of 3-7 years'.
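For the idealized filter (an exact 12-point running mean followed by a 12-month difference) the amplitude response can also be written in closed form. The derivation is mine, not the paper's, but it shows the same qualitative shape: a 12-point mean of monthly data has response |sin(pi f) / (12 sin(pi f / 12))| and a 12-month difference has response 2 |sin(pi f)|, with f in cycles per year, so the combination is their product:

```python
import numpy as np

def combined_response(f):
    """Amplitude response of the idealized smooth-then-difference filter,
    f in cycles per year (f = 0 excluded to avoid division by zero)."""
    f = np.asarray(f, dtype=float)
    mean_resp = np.abs(np.sin(np.pi * f) / (12 * np.sin(np.pi * f / 12)))
    diff_resp = 2 * np.abs(np.sin(np.pi * f))
    return mean_resp * diff_resp

periods_years = np.array([30, 15, 7, 5, 3, 2, 1.5, 1])
for p, r in zip(periods_years, combined_response(1.0 / periods_years)):
    print(f"{p:5.1f} yr period: response {r:.2f}")
```

The product is essentially zero at whole cycles per year, well below 1 at 15 and 30 year periods, and above 1 in a band of a few years spanning typical ENSO periods -- amplification, not noise removal.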

Summary
So what do we really have? It isn't a correlation between SOI and global mean temperatures; both were heavily filtered. What the authors actually compute is the correlation between the SOI time series and global mean temperature if you over-weight both series (the response function is greater than 1, so it is an over-weighting) towards what is happening at the ENSO periods. The conclusion should really be "If you look only in the ENSO window, you see that ENSO accounts for a lot of variation in global mean temperature." One problem is, that isn't a new result: we already knew that ENSO was important at the ENSO periods. More important for the paper, in so doing the authors cannot make any conclusion about explaining "most of the temperature variation". They've filtered out much of it, and never examined either the response function or the effects of their filter on the inputs.

If what was desired was an analysis of the global mean temperature response to SOI at ENSO periods, then the authors should have been clear that this was their window, and they should have used a more suitable filtering process. When one goes back to the paper, it's also clear that no justification was ever given for using either filter, much less both. The filters were arbitrary, and as I've mentioned, we prefer to avoid arbitrary decisions in our papers. If no objective basis for setting up the filters could be found, the authors should have demonstrated that alternate choices did not affect their conclusions.

So, some 'weeding sources', or 'scientific specificity' signs:
* When a paper makes a conclusion about the correlation between A and B, verify that it is A and B that they are correlating.
* If a filter is applied, look for the authors to discuss a) why a filter is being applied at all, and b) why the particular filter they chose was used.




As is my custom, I've sent an email to one of the authors (de Freitas, the only one whose email was given in the paper) about this comment.

Some of the following blog posts will talk about the peer-review aspects that let this paper through. For now, see my old article on peer review. One of the other notes (no idea when) will be about how the process continues after a bad paper gets through the peer review process. That is the comment and reply process, and I'll be writing Tamino about that (he's said in his comments that he's preparing a comment for the journal).

10 comments:

Jesús R. said...

A very good analysis and exposition, Bob, thank you very much. I especially liked the part about the response function; it's very enlightening and surprising (for me). Putting aside the shameful hype from its authors, the paper isn't even good at the only thing it would be useful for!

Thanks again for your view, I'll keep on reading.

Anonymous said...

You make the claim that dropping out time periods does not impact the time series analysis. I'm no expert on such things, but how do you "drop out" any chunk of data and not impact the frequency content of the remaining data? At a bare minimum, you would need to drop the data out in some graceful manner, akin to windowing in Fourier analysis, and even this impacts the results.

Steve Bloom said...

James Annan spotted yet another error.

Bob, you mention the comment and reply option, but isn't it the case that the editors can withdraw the paper? Other than a case of plagiarism, it's hard to imagine a more suitable circumstance for doing so.

Philip H. said...

Steve, Other than plagiarism, I don't recall the editors of a major journal doing that anytime in the last 5 to 10 years. At least not in any of the journals that come into our office . . .

Bob, what's that old saw about being able to draw a straight line between any two points . . . good work here on your part though.

Horatio Algeranon said...

Nicely penned...and pinned (and opined?) Penguin.

Horatio has nothing to add to the mix ... except a twist of rhyme

Robert Grumbine said...

Thanks folks.

Anon: certainly dropping out some periods affects the time series analysis! The thing is, that can be exactly why you're doing the time series analysis. For instance, if I'm interested in climate, then I'll do something to suppress the weather signals. If I were listening for the tuba in an orchestra, I'd filter out the flutes. One failing of the paper is that, after filtering their time series, the authors made claims about the unfiltered data sets.

To do the job right, you're correct that we need to do our filtering in some graceful manner. It's also true that there are no perfect filters. But we can usually construct one that is good enough, and has demonstrably little effect except where we want it to. Little isn't zero, but it can be close enough.

Steve: I'm sure there are quite a few other errors in the paper. This was more a case of showing one that was severe enough, and understandable enough, that the paper should not have been published (in my not so humble opinion -- obviously the editor disagreed).

I don't think JGR has ever withdrawn a paper. Comment and reply is the most I've seen them do. Perhaps a new precedent will be set with this one. I certainly hope some emails and phone calls among the editorial board have been happening.

Horatio: Nice bit of poesy.

Hank Roberts said...

Some similar situations at other journals have led to resignations;
Singer
http://wah-realitycheck.blogspot.com/2009/07/singer-refrains-from-environmental.html

Von Storch
http://www.cfa.harvard.edu/~wsoon/1000yrclimatehistory-d/Sep5-CHEarticle.txt

Horatio Algeranon said...

While the method used by Carter et al to find the correlation between SOI and temperature may seem strange to some, I suspect most spectroscopists would recognize it as "first derivative correlation".

"First derivative correlation" is actually a standard method used in spectroscopy for "spectral searching/matching” -- ie, for finding the spectrum in a "library of spectra" that comes “closest” to an unknown "test" spectrum, as described here

The primary motivation for using the “first derivative” of the spectra rather than the spectra themselves is removing the effect of the “baseline” on the correlation -- removing the impact of “humps", "ramps", etc, effectively leaving only the “peaks” that were formerly "riding" on that baseline.

In many such (spectroscopic) cases, baseline removal is very desirable -- hence its widespread application. That is particularly true if one is primarily interested in knowing whether two spectra have peaks at the same (or nearly the same) locations, since the method tends to emphasize peak locations over peak intensities.

It stands to reason that finding the time "lag" between SOI and temperature might be a reasonable application of the first derivative correlation method* since that would merely involve shifting the two “spectra” until one got the best “match” -- aligning the corresponding peaks in the SOI and temperature "spectra" (eg, for the El Nino of 97/98).

But there are other applications which are not quite so reasonable and, in fact, for which baseline removal is actually very undesirable.

This is particularly true when the baseline -- an upward-sloping “ramp”, for example -- is an important part of the signal, as is the case with global temperature over the past few decades. (the ramp is due to greenhouse gases and removing it from the correlation greatly inflates the correlation)

The primary drawback of the first derivative correlation method (the potential gotcha) is that it can give a (false) positive "match" (high correlation) for spectra that have peaks in the same locations but with very different intensities (heights)

Unfortunately for Carter et al, the "gotcha" is more than just "potential" for the case of "matching" (correlating) the (smoothed) SOI “spectrum" with the (smoothed) temperature “spectrum". Removing the temperature "baseline" effectively means “flattening” the upward-sloping ramp due to greenhouse gases, so a "match" (high correlation) between “first derivative of SOI” and “first derivative of temperature” does not mean a match (high correlation) between SOI and temperature.

This actually demonstrates the potential problem that can arise if one simply takes a procedure that was developed for one realm (spectroscopy, where the baseline is often not important and in fact, something to be eliminated) and applies it in another where the same assumptions may not hold (in this case, where the baseline is important).

*Note: while Carter et al claim here that the derivative method was used "purely" to establish the time lag, the very high correlation they claim (in their published paper) allegedly between SOI and temperature indicates otherwise (indicates that "derivative correlation" was used there as well).

Tamino makes the latter point here.

Jesús R. said...

I don't have any training on statistics, so I may be asking something silly or decontextualized (don't bother to answer if it is so).

I understand that, by using the difference, you get a flat trend, but, if there is a warming trend, shouldn't that flat line be above the horizontal axis? More specifically, shouldn't that flat line be precisely at the trend's height? I mean, with differences, the trend becomes a constant, but just where the flat trend would be: Given a function of time

f(t) = at + b

then the differences would be a constant, y = a (pointing at the slope of the trend).

If that's true, when using differences, wouldn't it be possible to compare trends by comparing the height at which the time series are located?

Thanks.

PS. Horatio Algeranon, the final link is broken (I suppose it should be here).
