I've been promising for a long, long time to write a series of diaries here about how lay readers can read, evaluate, and even understand peer-reviewed science papers and third-party articles about them, even when they lack a background in the field in question. Unfortunately, real life has been doing the sorts of unpleasant things that real life does, and I haven't accomplished that. Or much of anything else here except drive-by commenting when I have a moment.
I still don't have that diary series started. But here, I'll continue the process of taking apart a really bad (pdf link) bit of science writing that was used to justify extraordinary claims in a discussion here a few days ago -- and that sadly has more than a little currency in certain dimly-lit corners of the wider internet.
Many of the things that are wrong with this paper are wrong with the sorts of peer-reviewed papers used by climate change denialists, the anti-vaccine movement, and no shortage of sham medical claims. Below the swirly thingy, we'll take a break from reading the paper itself and talk about how to use the last diary's three big questions.
Part I looked at the publisher, author, and the organization they are involved with.
Part II identified the three big questions a reader should ask about any science paper.
Last time, we asked three big questions about the paper: What are the researchers trying to find out? What did they do to find that out? What did they actually find out? And we dug into the abstract, foreword, and conclusion to more or less determine how the author intended those questions to be answered. So we're ready to start looking at the science, right?
Wait! Before you flip your pdf open to the second column on page 3, and before we start trying to judge the quality of the science, we need to determine how to go about doing that.
The three big questions from the last diary have their answers in the paper. Now, we're going to ask more questions; this time, we're going to try to answer them ourselves. These questions are harder, and they can be tricky. We'll look at them in a little more detail before moving on.
- Did the research provide useful information about the premise of the paper?
- Did the results of the research support the conclusion claimed in the paper?
- Is there any reason to distrust what the paper says about the research?
Did the research provide useful information about the premise of the paper?
We learned last time that this paper, like pretty much all papers, asks a question. There's some reason the research was done and a paper was published, after all! So one essential follow-up question has to be: did the work they did actually help answer that question? That's harder to determine than it sounds.
Everyone probably remembers the idea of a controlled experiment from high school science class. The idea is simple: determine what we're testing, and keep everything else the same. Sometimes that's hard to do. If we're testing whether crops grow better when watered with Brawndo sports drink instead of water, we should have a control where we take identical plants in an identical environment and water them normally.
But that "identical environment" thing is sometimes a challenge. And that is true a hundredfold with human studies. For one thing, there are no "identical plants" when it comes to people -- we're all different. And there's rarely an "identical environment", either. Plus there are ethical issues: if our researchers are testing the effects of a potentially harmful substance on health, like in this study, they can't go around injecting people with it and seeing what happens. Well, not anymore, anyway. So, with people especially, they can't ever get rid of all the variables. But they can try to minimize them as best as possible. With medical research, ideally subjects should be of similar age and health status. Often, they may want to look at each gender separately because some health issues effect men and women very unequally (just as an example, stroke is about 25% more common in men, but women are more likely to die from a stroke). Even things like socioeconomic status are important to consider: because of workplace conditions, diet, and access to health care, poor income and education are risk factors for conditions as varied as arthritis, respiratory infection, and coronary heart disease.
There is always some way that a human study will have failed to account for the differences between people. Scientists have to balance the need to control extraneous variables against the need to have enough people in the study pool to produce meaningful data. But if it stands to reason that something would make a big difference to the results, it's a matter of concern if the researchers ignored it, especially if they don't mention their decision to do so. Outside of human studies, of course, there's a lot more latitude to control the conditions of the experiment. In general, if it would have been easy to exclude a confounding variable and the researchers didn't, they ought to have a reason.
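For readers who like to see that idea concretely, here's a minimal toy sketch (in Python, with every number invented for the illustration -- nothing here comes from the paper we're discussing) of how an uncontrolled variable can manufacture a result out of thin air. In this made-up version of the Brawndo experiment, growth depends only on sunlight and the watering liquid does nothing at all; but if the Brawndo plants happen to sit in the sunnier window, the sloppy comparison "shows" a benefit anyway, while the controlled comparison does not:

```python
# Toy illustration of a confounding variable; all numbers are invented.
import random

random.seed(42)

def grow(hours_of_sun):
    """Toy growth model: growth depends on sunlight, not on what we watered with."""
    return 2.0 * hours_of_sun + random.gauss(0, 1)

# Badly designed study: the Brawndo plants happen to sit in a sunnier window.
brawndo_sunny = [grow(hours_of_sun=8) for _ in range(20)]
water_shady   = [grow(hours_of_sun=5) for _ in range(20)]

# Properly controlled study: both groups get the same amount of light.
brawndo_controlled = [grow(hours_of_sun=6) for _ in range(20)]
water_controlled   = [grow(hours_of_sun=6) for _ in range(20)]

mean = lambda xs: sum(xs) / len(xs)
print("Uncontrolled difference:", mean(brawndo_sunny) - mean(water_shady))              # large
print("Controlled difference:  ", mean(brawndo_controlled) - mean(water_controlled))    # near zero
```

The point isn't the code; it's that when a variable the researchers didn't control happens to line up with their groups, the "effect" they report may belong entirely to that variable.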
Of course, sometimes controls work a bit differently. In an experiment testing whether a new diagnostic is better at detecting cancer, the new test isn't compared with not testing at all -- it's compared with existing tests that have established detection rates, in what's called a positive control. But the bottom line is, it's not enough to do something in a (metaphorical) vacuum, look at the results, and claim that A causes B. There are also papers that are case studies -- where there isn't really any experimentation or analysis, only the reporting of observed data -- although those are much less common these days, and the paper we're taking apart isn't one of them.
Thinking about controlled research applies even to many papers that don't perform experimental science themselves. Review papers combine data from many other experiments in an effort to produce more informative, more certain results than any one individual study. Often, these reviews will set conditions for which papers on the topic are included, and when those conditions are set poorly, the end result suffers. That's the "cherry picking" behind several of the well-known global warming denialist papers.
Did the results of the research support the conclusion claimed in the paper?
Basically, did the experiment show the result the researchers say it did?
Sometimes, this is hard to determine. When the conclusion of the paper comes down to the nuances of statistical analysis, the lay reader isn't really going to bust out Student's t-test. Frankly, none of us will. But there are some signs that a paper is hiding a weakness in the connection between its research and its results.
The most obvious way you can answer "no" to this question is when the paper just discards the research and proclaims that everything works the way they said it did regardless. No good paper will ever do this, but you'll see it quite often in apologia for alternative medicine. The anti-vaccine movement has a field day with this technique, too ("Thiomersal causes autism. What, you removed the thiomersal from vaccines, and there are still autistic children? Um, well, thiomersal caused that autism; this autism is caused by something else bad in the vaccine.").
Otherwise, consider how the authors tell you they got from point A (the research) to point B (the conclusion). Most papers won't waste time on processes that are familiar to other researchers in the field, but do there seem to be big gaps? Most statistical processes in common use have names; do the authors hand-wave what they did without being specific ("Then, we applied statistics!")?
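As a purely illustrative aside (and assuming nothing about what this particular paper's authors did), those named statistical processes are just specific, checkable recipes. Here's roughly what one of the most common ones, a two-sample t-test, looks like in practice, with measurements made up for the sketch:

```python
# Toy example of a named statistical test; the measurements are invented
# and have nothing to do with the paper under discussion.
from scipy import stats

treated = [5.1, 4.9, 5.6, 5.8, 5.3, 5.5, 4.8, 5.7]
control = [4.6, 4.9, 4.4, 5.0, 4.7, 4.5, 4.8, 4.6]

# Welch's t-test: a standard, named way to compare the means of two groups
# without assuming the groups have equal variance.
t_stat, p_value = stats.ttest_ind(treated, control, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

When a paper says only "we applied statistics," the reader can't tell whether a recipe this ordinary was used -- or something far shakier.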
Is there any reason to distrust what the paper says about the research?
Unfortunately, sometimes funny business occurs. Sometimes that takes the form of fraud that is more or less impossible for the lay reader to detect. Andrew Wakefield's infamous paper "connecting" the MMR vaccine with bowel disease and autism made it into the esteemed medical journal The Lancet. Only after four years of other scientists failing to reproduce his results and the revelation of undisclosed financial conflicts of interest did the wheels start to fall off for Wakefield. No lay reader could have detected that funny business; indeed, no one in the field did.
But some hanky-panky is easier to see. Take a look at images and tables. Do they make sense? Think about movie settings as an analogy: if there are skyscrapers in the background of that action scene, you know it wasn't really filmed in Washington, DC, no matter what they're telling you. Amazingly, in no small percentage of the papers that are eventually retracted, undisclosed image manipulation is the cited reason. I've even seen sloppy papers "illustrate" two different things with the same photograph! Do the numbers in the tables match what is being talked about in the text? Obvious typos should be caught in editing but do occasionally slip past; sometimes, though, the data in one part of a really bad paper simply doesn't match the data in another part.
And consider whether there is anywhere the authors could have added subjective wiggle room without letting the reader know about it. In the infamous Soon and Baliunas paper that attempted to disprove global warming, one of the most obvious flaws was that it talked about historical wet and dry periods -- but never gave any sort of standard for "wetness" or "dryness" (the other was that it pretended temperature trends in any one region were always the same as temperature trends worldwide)!
Next time, we'll go back to the paper itself, armed with these questions as tools. And we'll crack open the first of the experiments the author describes and see how well it stands up to these questions (spoiler: poorly).