23 July 2015

Data Horrors

"The great tragedy of science -- the slaying of a beautiful hypothesis by an ugly fact."  Thomas H. Huxley.

Sometimes, though, you have to pay attention to just how ugly the observation (fact) is.  And even more to how ugly a collection of observations is.  At a science fair I judged a couple of years ago, one student described his methods for keeping the experiment, which had to stay untouched while it was running, out of reach of his young brother.  This student had a firm grasp of the ugliness of data and of trying to collect it.  I gave him high marks.

I also mentioned a story or two I knew about data collection challenges.  I'll share them and some others here, and invite you to add your own.

One family of ocean data comes from buoys floating on top of the ocean.  A lot of the ocean is far from land, therefore far from perches for birds.  Sea gulls and other birds are often grateful for the lovely perches we're putting out for them.  Unfortunately, it does not help the accuracy of your wind speed measurements to have a bird sitting on your gauge.  Birds sitting on the solar panel reduce the energy available and the recharge rate, which can lead to data outages while the battery recharges.  Guano is great for fertilizer, but wreaks havoc on the accuracy of your temperature, pressure, and moisture readings.

Walruses don't mind taking a rest every now and then either.  They're not normally a threat to wind speed measurement (which is at the top of the buoy).  But we also want to get wave measurements -- how high are they, how fast are they, what direction are they going.  Having a walrus or two on your buoy slows its ability to respond, and may suppress the peaks of the measured waves.

On land, your instrument enclosures (the Stevenson Screen for instance) provide a nice place for bees, wasps, and small birds to nest.  Squirrels like to play with them too.  A beehive next to your thermometer does not help its accuracy.

Back at sea, I once got a call about a problem buoy.  It was reporting extremely high temperatures near noon because the paint had been stripped during a storm, and the now-bare metal was reflecting sunlight onto the marine thermometer.

That should get you started for remembering your own horror stories about data collection.

I recently saw someone on the web taking the line that if data wasn't perfect, you should throw out everything from that instrument or site.  Well, no.  If you did that, you'd never have any data to work with.  For my examples, you mostly just ignore the data during the period when you've got a walrus infestation.  But there are other kinds of things which affect your observing, and which you might be able to compensate for.
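
For the "just ignore the walrus period" approach, here is a minimal sketch in Python.  Everything in it is made up for illustration: the function name, the wave heights, and the dates of the bad period are my assumptions, not anything from a real buoy.  The point is only that the bad stretch gets blanked out while the rest of the record stays usable.

```python
from datetime import datetime


def mask_bad_periods(observations, bad_periods):
    """Blank out values that fall inside a known-bad period.

    observations: list of (time, value) pairs.
    bad_periods:  list of (start, end) pairs, both ends inclusive.
    Returns a new list of the same length, with None in place of
    any value observed during a bad period.
    """
    masked = []
    for time, value in observations:
        in_bad_period = any(start <= time <= end for start, end in bad_periods)
        masked.append((time, None if in_bad_period else value))
    return masked


# Hypothetical hourly wave heights in metres (invented numbers).
observations = [
    (datetime(2015, 7, 1, 0), 1.8),
    (datetime(2015, 7, 1, 1), 1.9),
    (datetime(2015, 7, 1, 2), 0.6),  # walrus aboard, peaks suppressed
    (datetime(2015, 7, 1, 3), 0.5),  # walrus still aboard
    (datetime(2015, 7, 1, 4), 1.7),
]

# Hypothetical period during which the buoy is known to be occupied.
bad_periods = [(datetime(2015, 7, 1, 2), datetime(2015, 7, 1, 3))]

cleaned = mask_bad_periods(observations, bad_periods)
good_values = [value for _, value in cleaned if value is not None]
```

Keeping a `None` placeholder rather than deleting the rows means the time axis stays intact, so anyone using the record later can see that there was a gap and how long it lasted.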