**Joe Blitzstein**

Joe Blitzstein is

Professor of the Practice in Statistics at Harvard University and co-director of the graduate program. He moved to Harvard after obtaining his Ph.D. with Persi Diaconis at Stanford University. Since joining the faculty at Harvard, he has been immortalized in Youtube prank videos, been awarded a “favorite professor” distinction four times, and performed interesting research on the statistical analysis of social networks. Joe was also the first person to discover our blog on Twitter. You can find more information about him on his

personal website. Or check out his Stat 110 class, now available

from iTunes!

**Which term applies to you: data scientist/statistician/****analyst?**
Statistician, but that should and does include working with data! I

think statistics at its best interweaves modeling, inference,

prediction, computing, exploratory data analysis (including

visualization), and mathematical and scientific thinking. I don’t

think “data science” should be a separate field, and I’m concerned

about people working with data without having studied much statistics

and conversely, statisticians who don’t consider it important ever to

look at real data. I enjoyed the discussions by Drew Conway and on

your blog (at http://www.drewconway.com/zia/?p=2378 and

http://simplystatistics.tumblr.com/post/11271228367/datascientist )

and think the relationships between statistics, machine learning, data

science, and analytics need to be clarified.

**How did you get into statistics/data science (e.g. your history)?**
I always enjoyed math and science, and became a math major as an

undergrad Caltech partly because I love logic and probability and

partly because I couldn’t decide which science to specialize in. One

of my favorite things about being a math major was that it felt so

connected to everything else: I could often help my friends who were

doing astronomy, biology, economics, etc. with problems, once they had

explained enough so that I could see the essential pattern/structure

of the problem. At the graduate level, there is a tendency for math to

become more and more disconnected from the rest of science, so I was

very happy to discover that statistics let me regain this, and have

the best of both worlds: you can apply statistical thinking and tools

to almost anything, and there are so many opportunities to do things

that are both beautiful and useful.

**Who were really good mentors to you? What were the qualities that really****helped you?**
I’ve been extremely lucky that I have had so many inspiring

colleagues, teachers, and students (far too numerous to list), so I

will just mention three. My mother, Steffi, taught me at an early age

to love reading and knowledge, and to ask a lot of “what if?”

questions. My PhD advisor, Persi Diaconis, taught me many beautiful

ideas in probability and combinatorics, about the importance of

starting with a simple nontrivial example, and to ask a lot of “who

cares?” questions. My colleague Carl Morris taught me a lot about how

to think inferentially (Brad Efron called Carl a “natural”

statistician in his interview at

http://www-stat.stanford.edu/~ckirby/brad/other/2010Significance.pdf ,

by which I think he meant that valid inferential thinking does not

come naturally to most people), about parametric and hierarchical

modeling, and to ask a lot of “does that assumption make sense in the

real world?” questions.

**How do you get students fired up about statistics in your classes?**
Statisticians know that their field is both incredibly useful in the

real world and exquisitely beautiful aesthetically. So why isn’t that

always conveyed successfully in courses? Statistics is often

misconstrued as a messy menagerie of formulas and tests, rather than a

coherent approach to scientific reasoning based on a few fundamental

principles. So I emphasize thinking and understanding rather than

memorization, and try to make sure everything is well-motivated and

makes sense both mathematically and intuitively. I talk a lot about

paradoxes and results which at first seem counterintuitive, since

they’re fun to think about and insightful once you figure out what’s

going on.

And I emphasize what I call “stories,” by which I mean an

application/interpretation that does not lose generality. As a simple

example, if X is Binomial(m,p) and Y is Binomial(n,p) independently,

then X+Y is Binomial(m+n,p). A story proof would be to interpret X as

the number of successes in m Bernoulli trials and Y as the number of

successes in n different Bernoulli trials, so X+Y is the number of

successes in the m+n trials. Once you’ve thought of it this way,

you’ll always understand this result and never forget it. A

misconception is that this kind of proof is somehow less rigorous than

an algebraic proof; actually, rigor is determined by the logic of the

argument, not by how many fancy symbols and equations one writes out.

My undergraduate probability course, Stat 110, is now worldwide

viewable for free on iTunes U at

http://itunes.apple.com/WebObjects/MZStore.woa/wa/viewPodcast?id=495213607

with 34 lecture videos and about 250 practice problems with solutions.

I hope that will be a useful resource, but in any case looking through

those materials says more about my teaching style than anything I can

write here does.

**What are your main research interests these days?**

I’m especially interested in the statistics of networks, with

applications to social network analysis and in public health. There is

a tremendous amount of interest in networks these days, coming from so

many different fields of study, which is wonderful but I think there

needs to be much more attention devoted to the statistical issues.

Computationally, most network models are difficult to work with since

the space of all networks is so vast, and so techniques like Markov

chain Monte Carlo and sequential importance sampling become crucial;

but there remains much to do in making these algorithms more efficient

and in figuring out whether one has run them long enough (usually the

answer is “no” to the question of whether one has run them long

enough). Inferentially, I am especially interested in how to make

valid conclusions when, as is typically the case, it is not feasible

to observe the full network. For example, respondent-driven sampling

is a link-tracing scheme being used all over the world these days to

study so-called “hard-to-reach” populations, but much remains to be

done to know how best to analyze such data; I’m working on this with

my student Sergiy Nesterko. With other students and collaborators I’m

working on various other network-related problems. Meanwhile, I’m also

finishing up a graduate probability book with Carl Morris,

“Probability for Statistical Science,” which has quite a few new

proofs and perspectives on the parts of probability theory that are

most useful in statistics.

**You have been immortalized in several Youtube videos. Do you think this****helped make your class more “approachable”?**
There were a couple strange and funny pranks that occurred in my first

year at Harvard. I’m used to pranks since Caltech has a long history

and culture of pranks, commemorated in several “Legends of Caltech”

volumes (there’s even a movie in development about this), but pranks

are quite rare at Harvard. I try to make the class approachable

through the lectures and by making sure there is plenty of support,

help, and encouragement is available from the teaching assistants and

me, not through YouTube, but it’s fun having a few interesting

occasions from the history of the class commemorated there.