What is Epidemiology? (Thursday Night Health Series)

by DrSteveB

(This content is not subject to review by Daily Kos staff prior to publication.)

Thursday, Jul. 10, 2008 at 3:44:27pm PDT

THURSDAY NIGHT IS HEALTH CARE CHANGE NIGHT, a weekly Daily Kos Health Care Series.

Today, in follow-up to my earlier diary "What is Public Health", we ask "What is Epidemiology"?

Epidemiology (literally: "the study of what is upon the people") is the study of how often disease and injury occur in different groups of people and why. We then use this information to plan and evaluate strategies to prevent illness and injury in society, and as a guide to the management of patients in whom disease has already developed.

The occurrence of any particular health problem is not distributed equally within a population: You may have noticed that within any group of people, be it your group of friends or your country, some folks get sick and some do not. Why do some people get heart disease and others do not? Why do some people die in car accidents and others do not? How does AIDS spread through a population?

Because disease and injuries do not occur at random, they have causal and preventive factors that can be identified through systematic investigation of different populations, or of subgroups of individuals within a population, in different places or at different times.

Basic Tools of Epidemiology:

The types of study methods listed below are in order from quickest/cheapest/easiest/weakest (cannot say that association means causation) to hardest/most expensive/strongest (association probably does mean causation).

Ecologic Study:

This is the weakest type of study, and is associated with the term "ecologic fallacy". This is also known as an Aggregate Study, since unlike in a survey no information is known about individuals. For example, we may notice from death certificates that there is more brain cancer in some cities than in others. And we may know from a separate industry survey that those cities also tend to have more cell phone users.

[Figure: ecologic study design]
(N1 and N2 are two different groups of people, the boxes with D and E refer to with and without Disease or Exposure)

However, from these two separate data sets we do not know if the individuals with brain cancer were the same individuals who had cell phones. Because all data are aggregated at the group level, relationships at the individual level cannot be empirically determined but are rather inferred from the group level. Thus, because of the likelihood of an ecologic fallacy, this type of study is at best hypothesis generating.
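Here is a minimal sketch of how an ecologic fallacy can arise. The three cities and all of the counts are made up for illustration; none of this is real data. At the city level, phone use and brain cancer rise together, yet inside each city users and non-users have exactly the same risk. An ecologic study only ever sees the city-level columns.

```python
# Hypothetical counts per city: (users, user_cases, nonusers, nonuser_cases).
# These numbers are invented to illustrate the idea, not taken from any study.
cities = {
    "A": (20, 2, 80, 8),
    "B": (50, 10, 50, 10),
    "C": (80, 24, 20, 6),
}

def city_summary(users, user_cases, nonusers, nonuser_cases):
    """Return the aggregate view plus the individual-level risk ratio."""
    total = users + nonusers
    return {
        # What an ecologic study can see: two group-level numbers per city.
        "pct_users": users / total,
        "disease_rate": (user_cases + nonuser_cases) / total,
        # What it cannot see: risk in users relative to non-users.
        "risk_ratio": (user_cases / users) / (nonuser_cases / nonusers),
    }

summaries = {name: city_summary(*counts) for name, counts in cities.items()}
for name, summary in summaries.items():
    print(name, summary)
```

In this toy data the city-level trend is entirely due to something else that differs between the cities, which is exactly why the aggregate pattern can only suggest a hypothesis.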

Cross-Sectional Surveys:
The picture below is meant to provide a schematic of the study design

[Figure: cross-sectional study or survey design]
(N=people, the boxes with D and E refer to with and without Disease or Exposure)

We inquire about the health and risk exposure of groups at a single moment in time and assess differences. The cross-sectional study implicitly assumes that the study population has been exposed for a long time and will continue to be exposed if nothing intervenes. It is a particularly easy study to conduct, and identifies possible associations and worthwhile case-control or cohort studies for follow up, but the study may not confirm causes. These are essentially the same thing as political polls. First, we can see what the prevalence of a disease is. How many people have it? What is the rate at which it occurs? Is it going up or down over time? Epidemiologists like to talk about this in terms of:

Person: Who is getting it?
Place: Where is it occurring?
Time: When did it start, is the frequency going up or down?

By looking at the cross-tabs, let us say by race, sex, income, etc., we can begin to see some of the associated (not necessarily causal) risk factors.
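As a rough sketch of what prevalence and cross-tabs look like in practice, here is a tiny survey with eight invented respondents. The field names (`sex`, `smoker`, `has_condition`) and the helper functions are hypothetical, chosen only for this example.

```python
from collections import defaultdict

# Hypothetical survey responses, invented purely for illustration.
survey = [
    {"sex": "F", "smoker": True,  "has_condition": True},
    {"sex": "F", "smoker": False, "has_condition": False},
    {"sex": "F", "smoker": False, "has_condition": False},
    {"sex": "F", "smoker": True,  "has_condition": False},
    {"sex": "M", "smoker": True,  "has_condition": True},
    {"sex": "M", "smoker": True,  "has_condition": True},
    {"sex": "M", "smoker": False, "has_condition": False},
    {"sex": "M", "smoker": False, "has_condition": True},
]

def prevalence(records):
    """Share of respondents who have the condition at the time of the survey."""
    return sum(r["has_condition"] for r in records) / len(records)

def prevalence_by(records, key):
    """Cross-tab: prevalence within each level of one characteristic."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r)
    return {level: prevalence(members) for level, members in groups.items()}

print("overall:", prevalence(survey))
print("by sex:", prevalence_by(survey, "sex"))
print("by smoking:", prevalence_by(survey, "smoker"))
```

The cross-tabs show which characteristics travel with the condition in this snapshot. They say nothing about which came first, which is why a survey like this only points to associations.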

Case-Control Studies:
Now these are fun and are, along with Cohort studies, the true invention of Epidemiology. Here, we select a group of people who already have the condition we are investigating. Let's say lung cancer. And then we compare them to a comparison (control) group of people who we know do not have the disease, let us say their healthy next-door neighbors. So we have two groups of people based on whether or not they have the disease. Now let's ask them all about possible risk factors, including whether they smoked tobacco.

[Figure: case-control study design]

You can think of this as the "Why Me?" study.

The main advantage is that it enables study of any disease, even a rare one, without having to follow thousands of people, making it generally quicker, cheaper, and easier than the cohort study. Primary disadvantages: There is a greater potential for bias, since we know the health status before the exposure is determined. For example, recall bias is likely to occur in cross-sectional or case-control studies where subjects are asked to recall exposure to risk factors. Subjects with the relevant condition (e.g. breast cancer) may be more likely to recall the relevant exposures that they had undergone (e.g. hormone replacement therapy) than subjects who don't have the condition. The same sort of bias can apply to a researcher pushing an agenda. Also, case-control studies do not allow for broader-based health assessments because we select only one type of disease for study.

Another often-neglected issue is that the generalizability of the study is completely dependent on the Control group: I have a saying that "Cases are easy, Controls are hard." In the lung cancer example above, the neighborhood controls are probably pretty good. But what if, because it was cheaper, easier and more convenient, we had chosen our controls from other patients at a hospital? Well, those folks are at the hospital because they are sick, maybe with heart disease or emphysema or chronic bronchitis. And so they would be more likely to ALSO be smokers, and the effect of smoking on lung cancer might be missed. A blatant example has been many studies that only look at men or only look at whites.
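The usual summary measure for a case-control study, not named above, is the odds ratio: the odds of exposure among cases divided by the odds of exposure among controls. The sketch below uses invented counts to show how the choice of controls can shrink the apparent effect of smoking. The specific numbers are assumptions made up for the example.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

# Hypothetical counts, invented for illustration only.
cases = {"smokers": 80, "nonsmokers": 20}                  # lung cancer patients
neighborhood_controls = {"smokers": 40, "nonsmokers": 60}  # healthy neighbors
hospital_controls = {"smokers": 70, "nonsmokers": 30}      # other patients, who smoke more

or_neighborhood = odds_ratio(
    cases["smokers"], cases["nonsmokers"],
    neighborhood_controls["smokers"], neighborhood_controls["nonsmokers"],
)
or_hospital = odds_ratio(
    cases["smokers"], cases["nonsmokers"],
    hospital_controls["smokers"], hospital_controls["nonsmokers"],
)

print("neighborhood controls:", round(or_neighborhood, 2))
print("hospital controls:", round(or_hospital, 2))
```

Same cases, different controls, and the association looks much weaker with the hospital group. That is the "Controls are hard" problem in numbers.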

Cohort Studies:

Here we select a group of healthy people, the cohort. We first measure all sorts of possible exposures and risk factors that they may have. We then assess what happens to their health over time.

[Figure: cohort study design]

Think of it as the "What will happen to me?" study.

The most famous cohort study is the Framingham Heart Study, which began in 1948 with 5,209 adult subjects from Framingham, Massachusetts, and is now on its third generation of participants. They took then-young, healthy people and collected blood, asked questions about lifestyle and behavior, and took many measurements like height, weight and blood pressure. Then they waited for years to see who went on to get heart disease. Much of the now-common knowledge concerning heart disease, such as the effects of diet, exercise, and common medications such as aspirin, is based on this longitudinal study.

The design is less subject to bias because it measures exposure before scientists learn the health outcome. A cohort study is expensive, time-consuming and logistically difficult, making it most useful for relatively common diseases, where sample sizes do not have to be too large. Cohort studies are susceptible to other kinds of error such as confounding, differential loss to follow-up and other issues.
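Because a cohort study follows people forward from exposure to outcome, it can measure risk directly. Two standard summaries, not named above, are the relative risk (risk ratio) and the risk difference. A minimal sketch with invented counts for a hypothetical cohort:

```python
def risk(events, people):
    """Proportion of a group that developed the outcome during follow-up."""
    return events / people

# Hypothetical cohort, invented for illustration only.
exposed = {"people": 1000, "events": 90}
unexposed = {"people": 2000, "events": 60}

risk_exposed = risk(exposed["events"], exposed["people"])
risk_unexposed = risk(unexposed["events"], unexposed["people"])

# How many times more likely the outcome is in the exposed group.
relative_risk = risk_exposed / risk_unexposed
# How much extra risk the exposed group carries in absolute terms.
risk_difference = risk_exposed - risk_unexposed

print("relative risk:", round(relative_risk, 2))
print("risk difference:", round(risk_difference, 3))
```

This ignores differential loss to follow-up and confounding, both of which would need to be handled before reading anything causal into the numbers.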

Randomized Controlled Trials:
The above methods are all observational. What is generally considered the best study design is one that is interventional, such as the individual double-blind randomized controlled trial. This is the sort of study they use for testing new drugs. A cohort of individuals is enrolled into the study. They are randomly assigned to get the drug or a placebo. If the group is large enough and randomization is done properly, then the presumption is that any differences in outcomes will be due to the drug.

[Figure: randomized controlled trial design]

Stuff can still go wrong, even in a randomized controlled trial. Differences between the intervention and placebo groups can sneak in. People may drop out of one group or the other differentially.
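The random assignment step itself is simple. Here is a sketch of simple randomization into two arms; the participant IDs, the seed, and the `randomize` helper are all hypothetical, and real trials often use more elaborate schemes (blocking, stratification) that are not shown.

```python
import random

def randomize(participant_ids, seed):
    """Shuffle participants and split them into two arms.

    If the count is odd, the placebo arm gets the extra person.
    """
    rng = random.Random(seed)  # fixed seed so the allocation can be reproduced and audited
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"drug": shuffled[:half], "placebo": shuffled[half:]}

# Hypothetical participant IDs, P001 through P020.
participants = [f"P{n:03d}" for n in range(1, 21)]
arms = randomize(participants, seed=2008)

print("drug:", sorted(arms["drug"]))
print("placebo:", sorted(arms["placebo"]))
```

Nobody, including the researcher, gets to choose who lands in which arm. That is what lets us presume the two groups are alike in everything except the drug.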

Error

Random error is due to just probability, "by chance alone." This is the sort of error that you can reduce just by increasing the sample size of the survey or poll, or by other sampling techniques.
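To see how sample size tames random error, here is a sketch using the textbook standard error of a proportion. It assumes a simple random sample, and the 95% multiplier of 1.96 is the usual normal approximation; the sample sizes are just examples.

```python
import math

def standard_error(p, n):
    """Standard error of a proportion p estimated from a simple random sample of size n."""
    return math.sqrt(p * (1 - p) / n)

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error (normal approximation)."""
    return z * standard_error(p, n)

for n in (100, 400, 1600):
    print(n, round(margin_of_error(0.5, n), 4))
```

Each fourfold increase in sample size cuts the margin of error in half. Note that this only helps with random error; a bigger sample does nothing for the systematic errors described next.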

Systematic Error is error due to your methods.

For example, you can have Selection bias, where the group you are studying is not representative of the population you would like to generalize to. The famous "Dewey Defeats Truman" polling is an example where the polling sample was not representative of the actual voting public.

There can be Misclassification or Information bias, for example the person with the disease who recalls every possible exposure better than the healthy control person.

Then there is Confounding. That is when the apparent association between the outcome and the risk factor is not really causal, because in truth both are related to some other factor. For example, let's go back to lung cancer. Let's say I asked not about smoking, but about whether people carried matches or lighters, as shown in this diagram:

[Figure: confounding]

In this study, I would indeed find that carrying matches is associated with lung cancer. Of course, just carrying matches does NOT cause lung cancer. The Confounder is Smoking, which itself is the true cause of both Match Carrying & Lung Cancer. Smoking is the third factor that is truly causal for the other two. It is false that match carrying causes lung cancer, so this is confounding.
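One common way to check for a confounder is to split the data by the suspected third factor and see whether the association survives. This sketch uses invented counts for the match-carrying example; the numbers and the `risk_ratio` helper are assumptions for illustration.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)

# Hypothetical counts per stratum:
# (carrier_cases, carriers, noncarrier_cases, noncarriers)
strata = {
    "smokers": (80, 800, 20, 200),
    "nonsmokers": (1, 100, 9, 900),
}

# Crude analysis: ignore smoking and add the strata together column by column.
crude_counts = [sum(column) for column in zip(*strata.values())]
crude_rr = risk_ratio(*crude_counts)

# Stratified analysis: look at match carrying separately within each smoking group.
stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

print("crude risk ratio:", round(crude_rr, 2))
print("within strata:", stratum_rr)
```

The crude comparison makes match carrying look dangerous. Once smokers are compared only with smokers and non-smokers only with non-smokers, the association vanishes, which is the signature of confounding.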

Okay... now it gets complex! And fun!

Effect Modification: This is when there is a third factor which is antecedent to the cause. It modifies the magnitude of the effect between exposure & outcome. It is not the same as confounding, since the cause-to-effect relationship is true. For example, age is an effect modifier for many conditions. Another example is shown below:

[Figure: effect modification]

The amount of cigarette smoking is an effect modifier; it modifies the magnitude of the effect of oral contraceptive use on the risk for heart attacks.
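In numbers, effect modification shows up as stratum-specific effects that are real but different from one another. A sketch with invented counts for oral contraceptive (OC) use and heart attack, split by smoking level; all counts are hypothetical.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)

# Hypothetical counts per smoking level:
# (oc_user_cases, oc_users, nonuser_cases, nonusers)
by_smoking = {
    "non-smoker": (10, 10000, 5, 10000),
    "heavy smoker": (80, 10000, 20, 10000),
}

stratum_rr = {level: risk_ratio(*counts) for level, counts in by_smoking.items()}
for level, rr in stratum_rr.items():
    print(level, round(rr, 2))
```

Unlike the confounding case, the effect does not disappear within strata. It is present in both groups and simply larger among heavy smokers, so the honest report is one estimate per stratum rather than a single pooled number.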

In the real world, most things are complex. Most diseases have multiple causes. Nature and Nurture, genetics and environmental exposure not only all play a role, but there are many different genes and different exposures, and they all interact with each other. Below are two examples:

This shows contingent or intermediate variables:
[Figure: contingent or intermediate variables]
Serum Lipids are a contingent or intermediate variable between Diet & Heart Disease. Diet has a true causal effect on serum lipids & serum lipids have a true causal effect on heart disease. And there are also many other causal interactions going on.

Here is another example of Complex interactions:

Poverty is truly causal for both exposure to lead and child development problems (via other mechanisms such as diet, education, exposure to violence, etc.). Lead exposure is truly directly causal for child development problems. Child development problems are truly causal for lead exposure (pica behavior)... and lead exposure can cause poverty & child development problems can cause poverty!

Association and Causation

In epidemiology, public health and medicine, the so called Hill criteria have become the standard simplified starting points for the discussion of how one gets from association and correlation to believing causation:

Temporal relationship: Cause (exposure) must come before outcome (disease). Exposure always precedes the outcome. If factor "A" is believed to cause a disease, then it is clear that factor "A" must necessarily always precede the occurrence of the disease. This is the only absolutely essential criterion.

Strength of association: This is defined by the size or magnitude of the effect. We tend to believe that the stronger the association, the more likely it is that the relation is causal. Smokers' risk for lung cancer is about 11 times (1100% of) that of non-smokers.

Dose-Response Relationship: An increasing amount of exposure increases the risk. If a dose-response relationship is present, it is strong evidence for a causal relationship. However, as with specificity, the absence of a dose-response relationship does not rule out a causal relationship. A threshold may exist above which a relationship may develop. At the same time, if a specific factor is the cause of a disease, the incidence of the disease should decline when exposure to the factor is reduced or eliminated. More smoking results in more lung cancer. If increasing levels of CO2 in the atmosphere are the cause of increasing global temperatures, then "other things being equal", we should see both a commensurate increase and a commensurate decrease in global temperatures following an increase or decrease respectively in CO2 levels in the atmosphere.

Consistency: The association is consistent when results are replicated in studies in different settings using different methods. That is, if a relationship is causal, we would expect to find it consistently in different studies, using different methods, by different researchers, and in different populations.

Plausibility: The association agrees with currently accepted understanding of pathological processes. However, studies that disagree with established understanding of biological processes may force a reevaluation of accepted beliefs. In other words, there needs to be some theoretical basis for making an association between a vector and disease, or one social phenomenon and another.

Consideration of Alternate Explanations: In judging whether a reported association is causal, it is good to determine the extent to which researchers have taken other possible explanations into account and have effectively ruled out such alternate explanations.

Experiment: The condition can be altered (prevented or ameliorated) by an appropriate experimental regimen. If you take away the exposure, all else being equal, then the outcome does not occur. Of course, some things may have multiple causes. Some non-smokers get lung cancer (especially if they live in a high radon environment).

Specificity: This is established when a single putative cause produces a specific effect. This is considered by some to be the weakest of all the criteria. It goes back to infectious diseases, where for example, measles virus causes measles. Other viruses may cause other diseases. Immunize against measles and you do not get measles, though you may get other diseases. The diseases attributed to cigarette smoking, for example, do not meet this criterion. When specificity of an association is found, it provides additional support for a causal relationship. However, absence of specificity in no way negates a causal relationship. Because outcomes (be they the spread of a disease, the incidence of a specific human social behavior or changes in global temperature) are likely to have multiple factors influencing them, it is highly unlikely that we will find a strict one-to-one cause-effect relationship between two phenomena.

Coherence: The association should be compatible with existing theory and knowledge. This is related to biological plausibility above. In other words, it is necessary to evaluate claims of causality within the context of the current state of knowledge within a given field. What do we have to sacrifice about what we currently know in a given area in order to accept a particular claim of causality? However, as with the issue of plausibility, research that disagrees with established theory and knowledge is not automatically false. It may, in fact, force a reconsideration of accepted beliefs and principles. But as we say, big claims require big evidence.

Analogy: This one sucks even more than specificity, and is loosely related to coherence. In some circumstances it would be fair to judge by analogy, that is, to consider the effect of similar factors. For example, since we know that thalidomide and rubella can have an effect on fetal development, we would be ready to accept slighter but similar evidence with another drug or another viral disease in pregnancy.
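Two of the criteria above, strength of association and dose-response, lend themselves to a quick numeric check. This sketch uses invented lung cancer rates by smoking level; the rates, the category labels, and the helper function are all hypothetical and chosen only to illustrate the idea.

```python
# Hypothetical lung cancer rates per 100,000 person-years, by smoking level.
# Invented for illustration only.
rates = [
    ("none", 10),
    ("under 1 pack/day", 50),
    ("1-2 packs/day", 110),
    ("over 2 packs/day", 200),
]

# Strength of association: each level's rate relative to the unexposed baseline.
baseline = rates[0][1]
relative_risks = [(level, rate / baseline) for level, rate in rates]

def is_monotonic_increasing(values):
    """True if every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))

# Dose-response: does risk climb steadily as exposure climbs?
dose_response = is_monotonic_increasing([rr for _, rr in relative_risks])

for level, rr in relative_risks:
    print(level, rr)
print("dose-response present:", dose_response)
```

A steady climb like this supports a causal reading but does not prove it. As noted above, a threshold effect could produce a causal relationship with no smooth gradient at all.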

I am not going to go into this further except to say that we do have more sophisticated and less reductionist approaches supplementing this.

Criticism of Epidemiology:

Several folks have criticized epidemiology as a field. Within the field, this has included Alvan Feinstein. The most popularizing critic has been the journalist Gary Taubes. Mostly this comes down to "association is not causation" (duh... epidemiology invented and is all about that issue) and "stop scaring people."

On the one hand: It is true, that like any other bio-social science endeavor, the tools of epidemiology can be put to good use or to bad. It can be done well, or poorly. Different groups may come in predisposed to generating the findings they want. Results can be blown out of proportion... or ignored. This is true for any field.

However, not surprisingly, some of these critics such as Steve Milloy are well funded by corporate money, including from tobacco companies. It is part of the phony "anti-Junk Science" astroturf corporate movement. The same folks who do global climate change denial. For more on this see such books as David Michaels' Doubt is Their Product: How Industry's Assault on Science Threatens Your Health and Trust Us, We're Experts: How Industry Manipulates Science and Gambles with Your Future.

As a result of the possibly botched current Salmonella outbreak investigation (tomatoes? jalapeños?) I expect to see a lot more criticism. Some of it the usual Reagan-Bush era anti-government circularity: First they under-regulate it, under-inspect it, and under-fund it, and then when it does not work they blame government.

Learn More:

The American College of Epidemiology has a nice online explanatory show, "About Epidemiology".

The Centers for Disease Control has some fun online modules teaching epidemiology:

Epidemiology in the Classroom is part of the EXCITE program, developed by the CDC to teach students about the causes and prevention of disease and injury while improving their research and analytic skills. Students learn the scientific method employed by epidemiologists and use what they have learned to solve real disease outbreaks on their own.

BAM! Body and Mind Teacher's Corner: Infectious Disease Epidemiology is a series of lessons introducing students to epidemiology through infectious diseases and the scientific methods epidemiologists use to investigate.

The National Academy of Sciences has developed a series of standards-based lessons on infectious diseases for grades 6-12.

Robert Wood Johnson Foundation and the College Board have a whole series of Epidemiology teaching units covering many different topics, many very social and political.

Play Outbreak at Watersedge, an interactive game meant to introduce you to the world of public health as you help discover the source of the outbreak that has hit the small community of Watersedge and stop it before more residents get sick.