Epidemiology…that’s skin, right?
Nope, that’s dermatology. So, what is an epidemiologist? Epidemiologists are scientists who conduct studies to answer questions or generate hypotheses about health. They compare groups of people to see whether the groups are the same or different. Epidemiology studies look for patterns in populations, and they often aim to figure out how common a specific disease is and what causes it.
I became an epidemiologist because I volunteered as a Spanish interpreter at a free medical clinic when I was in college. I learned so much about the patients through observing and participating in their interactions with the healthcare system. I loved the way these observations could reveal patterns and that these patterns could help us to understand how to improve human health; it was clear that epidemiology was my path forward.
Patients often arrived at the clinic seemingly nervous and uncomfortable, perhaps experiencing one of their worst days. Speaking in a patient’s native language introduced some familiarity to an otherwise uncomfortable environment. It became clear to me that treating the patient was not just about the medicine; it was also about communication, and the best way to communicate was to make sure that all participants were speaking the same language. Just as a native Spanish speaker is more at ease when addressed in Spanish, readers of epidemiology can more easily understand studies when we speak the “language” of epidemiology.
In less time than it takes to say “these unprecedented times,” “epidemiologist” has become a household word. Due to COVID-19, more people than ever are reading epidemiology and other scientific articles, and with this comes the responsibility of being a careful reader. How many times have you read just the abstract or conclusion paragraph of an article, or worse, just the title? I admit I am guilty of this as well. If there’s anything we’ve learned during the pandemic, it’s that science has little value if it is not communicated effectively.
All scientific studies are flawed, and epidemiology studies are no exception. If a study were able to perfectly replicate reality, we would already know the answer to our question, and we wouldn’t need to do a study in the first place. It’s expected that a study will have limitations and it is okay if it does. The important thing is that we, as readers of epidemiology, think critically about how these limitations might impact how we understand and apply the results.
Designing a study is a lot like following a recipe: a change to just one step can sometimes drastically affect the finished product. In the same way a baker decides which ingredients to use and in what order, epidemiologists make a lot of decisions when designing and conducting a study. They decide on things like 1) what data they need to collect to figure out if an exposure leads to a disease, and how they are going to collect them, 2) what is being compared when looking for “increased risk”, and 3) how much uncertainty exists in reporting “statistically significant” results. Each of these decisions helps us to figure out how to appropriately apply what we learned from the study. By asking questions about these three things you can better understand the results of any epidemiology study.
Let’s walk through an example.
You may have heard in the news that drinking coffee makes you live longer. What you might not have realized was that the information behind these reports likely came from epidemiology studies. One such example is a 2018 study by Loftfield et al., which looked at coffee drinking and mortality in a UK population (Loftfield et al., 2018). As a daily coffee drinker myself, I was enthusiastic about this news; I wanted to know if the results of this study applied to me.
How do epidemiologists figure out if an exposure leads to a disease? Epidemiologists collect data to describe a population; they categorize individuals according to whether they have the disease and/or exposure, and then they compare the groups to see if they are different. In the Loftfield coffee study, for example, the authors collected data to describe coffee drinking behavior; this is the exposure in the study. Then they collected data on mortality and compared it among the different exposure groups. Sometimes it is not possible to measure an exposure or a disease directly, so epidemiologists might use an indirect measurement instead. In the Loftfield study, for example, coffee consumption was measured by asking participants how much coffee they drank. Indirect measurements are accepted and often necessary in epidemiology studies, but it is important for the reader to know that the resulting data are an estimate rather than a direct measurement.
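To make the "categorize, then compare" step concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the participant records, the cut points for the exposure groups, and the function names are invented for illustration and are not taken from the Loftfield study.

```python
from collections import defaultdict

# Hypothetical records: (self-reported cups per day, died during follow-up).
# These values are made up for illustration only.
participants = [
    (0, True), (0, False), (0, False), (0, True),
    (1, False), (1, True), (1, False),
    (3, False), (3, False), (3, True), (3, False),
    (5, False), (5, False), (5, False),
]

def exposure_group(cups_per_day):
    """Assign a participant to an exposure category (illustrative cut points)."""
    if cups_per_day == 0:
        return "none"
    if cups_per_day <= 2:
        return "1-2 cups"
    return "3+ cups"

def tabulate(records):
    """Count participants and deaths within each exposure group."""
    counts = defaultdict(lambda: {"n": 0, "deaths": 0})
    for cups, died in records:
        group = exposure_group(cups)
        counts[group]["n"] += 1
        counts[group]["deaths"] += int(died)
    return dict(counts)

table = tabulate(participants)
for group, c in table.items():
    risk = c["deaths"] / c["n"]
    print(f"{group}: {c['deaths']} deaths among {c['n']} participants (risk = {risk:.2f})")
```

Note that the exposure here is whatever each participant reported, so any error in recall carries straight through into the groups being compared.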
What does “increased risk” really mean? Conclusions from epidemiology studies often refer to increased or decreased risk, odds, or hazard. They might also say that a certain exposure makes you more likely to develop a disease. These statements mean nothing if you don’t know the answer to one simple question: “compared to what?” In the coffee study, the authors compared people who drank coffee to those who didn’t drink any coffee at all (Loftfield et al., 2018). The authors could have chosen, however, to define the comparison group differently. For example, they could have compared people who drank fewer than 2 cups a day with those who drank 2 or more. Now that we know the comparison group, we are able to interpret the results.
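The effect of the comparison group can be sketched with a few lines of Python. The counts below are invented, and the risk ratio is used only as a simple stand-in for whatever measure a given study reports. The point is that the same data give a different answer depending on who is being compared to whom.

```python
# Hypothetical counts by self-reported cups per day: (participants, deaths).
# These numbers are made up for illustration only.
counts = {
    0: (1000, 120),
    1: (1000, 90),
    2: (1000, 90),
    3: (1000, 90),
}

def risk(cup_levels):
    """Pooled risk of death across the given cups-per-day levels."""
    n = sum(counts[c][0] for c in cup_levels)
    deaths = sum(counts[c][1] for c in cup_levels)
    return deaths / n

def risk_ratio(exposed_levels, reference_levels):
    """Risk in the exposed group divided by risk in the comparison group."""
    return risk(exposed_levels) / risk(reference_levels)

# Comparison A: any coffee versus no coffee at all.
rr_vs_none = risk_ratio([1, 2, 3], [0])

# Comparison B: 2 or more cups versus fewer than 2 cups.
rr_vs_low = risk_ratio([2, 3], [0, 1])

print(f"Any coffee vs. none: risk ratio = {rr_vs_none:.2f}")
print(f"2+ cups vs. fewer than 2: risk ratio = {rr_vs_low:.2f}")
```

In this made-up example the apparent benefit looks smaller in comparison B, because the light coffee drinkers have been moved into the comparison group.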
What does “statistically significant” mean? In epidemiology studies, results are often described as either “statistically significant” or “not statistically significant.” Statistical significance is a way of gauging whether a study’s results could plausibly be explained by random chance alone. It is often represented in studies by a p-value or a confidence interval, as in the coffee study. Whether a result is considered “significant” depends on a threshold chosen by the investigator. Most epidemiologists choose a standard value, but they are free to choose a different one if they wish. For example, an epidemiologist might choose a less strict threshold if a study is exploratory and a greater level of uncertainty is acceptable. When an epidemiologist describes a result as “statistically significant,” it means that a result at least this strong would be unlikely to arise by chance alone if there were truly no association, but what counts as “unlikely” is defined by the epidemiologist. Loftfield et al. reported reduced mortality among participants who drank 2-3 cups of coffee per day compared to those who drank none. As a three-a-day consumer, I feel optimistic about these results.
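To see how much the threshold matters, consider this rough sketch. It assumes made-up counts and a simple large-sample (normal approximation) calculation for a risk ratio; it is not the method used by Loftfield et al., and the function names are hypothetical. The thresholds 0.05 and 0.01 are common conventions, used here only as examples.

```python
import math
from statistics import NormalDist

def risk_ratio_summary(deaths_exp, n_exp, deaths_ref, n_ref, confidence=0.95):
    """Risk ratio with an approximate two-sided p-value and confidence interval."""
    rr = (deaths_exp / n_exp) / (deaths_ref / n_ref)
    # Large-sample standard error of log(risk ratio).
    se = math.sqrt(1 / deaths_exp - 1 / n_exp + 1 / deaths_ref - 1 / n_ref)
    z = math.log(rr) / se
    # Two-sided p-value under the assumption of no association (risk ratio = 1).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    ci = (math.exp(math.log(rr) - z_crit * se),
          math.exp(math.log(rr) + z_crit * se))
    return rr, p_value, ci

def is_significant(p_value, threshold):
    """A result is called 'significant' only relative to a chosen threshold."""
    return p_value < threshold

# Hypothetical counts: 90 deaths among 1000 coffee drinkers,
# 120 deaths among 1000 non-drinkers.
rr, p, ci95 = risk_ratio_summary(90, 1000, 120, 1000, confidence=0.95)
_, _, ci99 = risk_ratio_summary(90, 1000, 120, 1000, confidence=0.99)

print(f"risk ratio = {rr:.2f}, p = {p:.3f}")
print(f"95% CI: {ci95[0]:.2f} to {ci95[1]:.2f}")
print(f"99% CI: {ci99[0]:.2f} to {ci99[1]:.2f}")
print("significant at 0.05?", is_significant(p, 0.05))
print("significant at 0.01?", is_significant(p, 0.01))
```

With these invented numbers, the same result is "significant" at the 0.05 threshold but not at the stricter 0.01 threshold. Likewise, the 95% confidence interval excludes a risk ratio of 1 while the 99% interval includes it. Nothing about the data changed, only the investigator's choice of threshold.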
It’s important for us as epidemiology users to invest the time required to read, understand, and inquire. Asking questions about how variables are defined and measured, what is being compared, and the certainty of results is a good place to start. I hope that this article has helped you to understand the ingredients of an epidemiology study, and to speak the language of epidemiology. Keep in mind that results from epidemiology studies add to a body of evidence; they are not necessarily intended to provide a definitive answer to a research question. Their findings should be used in concert with findings from other epidemiologic and non-epidemiologic studies. With this approach we can pave the way for a more active public engagement in the literature we consume.
References
Loftfield E., Cornelis M.C., Caporaso N., Yu K., Sinha R., & Freedman N. (2018). Association of Coffee Drinking With Mortality by Genetic Variation in Caffeine Metabolism: Findings From the UK Biobank. JAMA Internal Medicine, 178(8), 1086–1097. doi:10.1001/jamainternmed.2018.2425