<div> </div> <div> </div> <div> <em><a href="http://en.wikipedia.org/wiki/Emily_Oster">Emily Oster</a> is an Associate Professor of Economics at Brown University. She is a frequent and highly respected <a href="http://fivethirtyeight.com/contributors/emily-oster/">contributor to 538</a>, where she brings clarity to areas of interest to parents, pregnant women, and the general public where empirical research is conflicting or difficult to interpret. She is also the author of the popular new book about pregnancy: <a href="http://www.amazon.com/Expecting-Better-Conventional-Pregnancy-Wrong/dp/0143125702">Expecting Better: Why the Conventional Pregnancy Wisdom Is Wrong--and What You Really Need to Know</a><b>. </b>We interviewed Emily as part of our <a href="http://simplystatistics.org/interviews/">ongoing interview series</a> with exciting empirical data scientists. </em> </div> <div> <em> </em> </div> <div> </div> <div> <b>SS: Do you consider yourself an economist, econometrician, statistician, data scientist or something else?</b> </div> <div> </div> </div>
EO: I consider myself an empirical economist. I think my econometrics colleagues would have a hearty laugh at the idea that I'm an econometrician! The questions I'm most interested in tend to have a very heavy empirical component - I really want to understand what we can learn from data. In this sense, there is a lot of overlap with statistics. But at the end of the day, the motivating questions and the theories of behavior I want to test come straight out of economics.
<div> <b>SS: You are a frequent contributor to 538. Many of your pieces are attempts to demystify often conflicting sets of empirical research (about concussions and suicide, or the dangers of water fluoridation). What would you say are the issues that make empirical research about these topics most difficult?</b> </div> <div> <b> </b> </div> </div> </div>
EO: In nearly all the cases, I'd summarize the problem as: "The data isn't good enough." Sometimes this is because we only see observational data, not anything randomized. A large share of studies using observational data that I discuss have serious problems with either omitted variables or reverse causality (or both). This means that the results are suggestive, but really not conclusive. A second issue is that even when we do have some randomized data, it's usually on a particular population, or a small group, or in the wrong time period. In the fluoride case, the studies which come closest to being "randomized" are from 50 years ago. How do we know they still apply now? This makes even these studies challenging to interpret.
<div> <b>SS: Your recent book "Expecting Better: Why the Conventional Pregnancy Wisdom Is Wrong--and What You Really Need to Know" takes a similar approach to pregnancy. Why do you think there are so many conflicting studies about pregnancy? Is it because it is so hard to perform randomized studies?</b> </div> <div> <b> </b> </div> </div> </div>
EO: I think the inability to run randomized studies is a big part of this, yes. One area of pregnancy where the data is actually quite good is labor and delivery. If you want to know the benefits and consequences of pain medication in labor, for example, it is possible to point you to some reasonably sized randomized trials. For various reasons, there has been more willingness to run randomized studies in this area. When pregnant women want answers to less medical questions (like, "Can I have a cup of coffee?") there is typically no randomized data to rely on. Because the possible benefits of drinking coffee while pregnant are pretty much nil, it is difficult to conceptualize a randomized study of this type of thing.
<div> </div> <div> Another big issue I found in writing the book was that even in cases where the data was quite good, data often diverges from practice. This was eye-opening for me and convinced me that in pregnancy (and probably in other areas of health) people really do need to be their own advocates and know the data for themselves. </div> </div> </div>
<div> <b>SS: Have you been surprised about the backlash to your book for your discussion of the zero-alcohol policy during pregnancy? </b> </div> <div> <b> </b> </div> </div> </div>
EO: A little bit, yes. This backlash has died down a lot as pregnant women actually read the book and use it. As it turns out, the discussion of alcohol makes up a tiny fraction of the book and most pregnant women are more interested in the rest of it! But certainly when the book came out this got a lot of focus. I suspected it would be somewhat controversial, although the truth is that every OB I actually talked to told me they thought it was fine. So I was surprised that the reaction was as sharp as it was. I think in the end a number of people felt that even if the data were supportive of this view, it was important not to say it because of the concern that some women would over-react. I am not convinced by this argument.
<div> <b>SS: What are the three most important statistical concepts for new mothers to know? </b> </div> <div> <b> </b> </div> </div> </div>
EO: I really only have two!
<div> </div> <div> I think the biggest thing is to understand the difference between randomized and non-randomized data and to have some sense of the pitfalls of non-randomized data. I reviewed studies of alcohol where the drinkers were twice as likely as non-drinkers to use cocaine. I think people (pregnant or not) should be able to understand why one is going to struggle to draw conclusions about alcohol from these data. </div> <div> </div> <div> A second issue is the concept of probability. It is easy to say, "There is a 10% chance of the following" but do we really understand that? If someone quotes you a 1 in 100 risk from a procedure, it is important to understand the difference between 1 in 100 and 1 in 400. For most of us, those seem basically the same - they are both small. But they are not, and people need to think of ways to structure decision-making that acknowledge these differences. </div> </div> </div>
<div> <b>SS: What computer programming language is most commonly taught for data analysis in economics? </b> </div> <div> <b> </b> </div> </div> </div>
EO: So, I think the majority of empirical economists use Stata. I have been seeing more R, as well as a variety of other things, but more commonly among people who work in more computationally heavy fields.
<div> <b>SS: Do you have any advice for young economists/statisticians who are interested in empirical research? </b> </div> </div> </div>
<div> EO: </div> <div> 1. Work on topics that interest you. As an academic you will ultimately have to motivate yourself to work. If you aren't interested in your topic (at least initially!), you'll never succeed. </div> <div> 2. One project which is 100% done is way better than five projects at 80%. You need to actually finish things, something which many of us struggle with. </div> <div> 3. Presentation matters. Yes, the substance is the most important thing, but don't discount the importance of conveying your ideas well. </div> </div>