06
Mar

## The importance of simulating the extremes

Statisticians and data analysts commonly use simulation to (1) estimate variability or improve predictors, (2) evaluate the space of potential outcomes, and (3) evaluate the properties of new algorithms or procedures. Over the last couple of days, discussions of simulation have popped up in a couple of different places.

First, the reviewers of a paper that my student is working on asked a question about the behavior of the method in different conditions. I mentioned in passing that I thought it was a good idea to simulate some cases where our method will definitely break down.

I also saw this post by John Cook about simple/complex models. He raises the really important point that increasingly complex models built on a small, canonical data set can fool you. You can make the model more and more complicated - but in other data sets the assumptions might not hold and the model won't generalize. Of course, simple models can have the same problems, but in my experience simple models generally fail on small data sets in the same way they would fail on larger data sets - either they work or they don't.

These two ideas got me thinking about why I like simulation. Some statisticians, particularly applied statisticians, aren't fond of simulation for evaluating methods. I think the reason is that you can always simulate a situation that meets all of your assumptions and make your approach look good. Real data rarely conform to model assumptions and so are harder to "trick". On the other hand, I really like simulation: it can reveal a lot about how and when a method will work well, and it allows you to explore scenarios - particularly for new or difficult-to-obtain data.

Here are the simulations I like to see:

1. Simulation where the assumptions are true. There are a surprising number of proposed methods/analysis procedures that fail or perform poorly even when the model assumptions hold. This could be because the methods overfit, have a bug, are computationally unstable, sit at the wrong place on the bias/variance tradeoff curve, etc. I always do at least one simulation for every method where the answer should be easy to get, because I know that if I don't get the right answer, it is back to the drawing board.
2. Simulation where things should definitely fail. I like to try out a few realistic scenarios where I'm pretty sure my model assumptions won't hold and the method should fail. This kind of simulation is good for two reasons: (1) sometimes I'm pleasantly surprised and the model holds up, and (2) (the more common scenario) I can find out where the model assumption boundaries are, so that I can give concrete guidance to users about when/where the method will work and when/where it will fail.

The first type of simulation is easy to come up with - generally you can just simulate from the model. The second type is much harder. You have to think creatively about reasonable ways that your model can fail. I've found that using real data for simulations can be the best way to start coming up with ideas to try - but I usually find that it is worth building on those ideas to imagine even more extreme circumstances. Playing the evil demon for my own methods often leads me to new ideas/improvements I hadn't thought of before. It also helps me to evaluate the work of other people - since I've tried to explore the contexts where methods likely fail.
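
As a rough illustration of the two kinds of simulation, here is a minimal Python sketch. Everything in it is an assumption made for the example rather than anything from a specific paper: the "method" is the textbook normal-theory 95% interval for a mean, the easy case draws normal data, and the extreme case draws heavily skewed lognormal data. The helper name `coverage` and all parameter values are hypothetical.

```python
import numpy as np


def coverage(draw_sample, true_mean, n=50, n_sims=2000, seed=0):
    """Fraction of simulated data sets whose nominal 95% interval
    (sample mean +/- 1.96 standard errors) contains the true mean."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        x = draw_sample(rng, n)
        half_width = 1.96 * x.std(ddof=1) / np.sqrt(n)
        hits += int(abs(x.mean() - true_mean) <= half_width)
    return hits / n_sims


# Type 1: the assumptions are true (normal data, true mean 0).
easy = coverage(lambda rng, n: rng.normal(0.0, 1.0, n), true_mean=0.0)

# Type 2: an extreme case (lognormal data with sigma = 2).
# The true mean of a lognormal(0, sigma) variable is exp(sigma**2 / 2).
sigma = 2.0
hard = coverage(
    lambda rng, n: rng.lognormal(0.0, sigma, n),
    true_mean=np.exp(sigma**2 / 2),
)

print(f"coverage, normal data:    {easy:.3f}")
print(f"coverage, lognormal data: {hard:.3f}")
```

If the interval behaves as expected, the first number should sit close to the nominal 95% and the second should fall well short of it. Varying `sigma` or `n` is one way to look for the boundary where the method stops being trustworthy.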

In any case, if you haven't simulated the extremes, I don't think you really know how your methods/analysis procedures are working.

05
Mar

## Characteristics of my favorite statistics talks

I’ve been going to/giving statistics talks for a few years now. I think everyone in our field has an opinion on the best structure/content/delivery of a talk. I am one of those people who has a pretty specific idea of what makes an amazing talk. Here are a few of the things I think are key. I try to do them myself, and I have learned many of them from other people I’ve seen speak. I’d love to hear what other people think.

Structure

1. I don’t like outline slides. I think they take up space but don’t add to most talks. Instead I love it when talks start with a specific, concrete, unsolved problem. In my favorite talks, this problem is usually scientific/applied, although I have also seen great theoretical talks where a person starts with a key unsolved theoretical problem.
2. I like it when the statistical model is defined at the beginning as a way to solve the problem, so it is easy to see the connection between the model and its purpose.
3. I love it when talks end by showing how they solved the problem they described at the very beginning of the talk.

Content

1. I like it when people assume I’m pretty ignorant about their problem (I usually am) and explain everything in very simple language. I think some people worry about their research looking too trivial. I have almost never come away from a talk thinking that, but I frequently leave talks confused because the background material wasn’t clear.
2. I like it when talks cover enough technical detail so I can follow the basic algorithm, but not so much that I get lost in notation. I also struggle when talks go off on tangents, describing too many subproblems, rather than focusing on the main problem in the talk and just mentioning subproblems succinctly.
3. I like it when proposed methods are compared to the obvious straw man and one legitimate competitor (if it exists) on a realistic simulation/data set where the answer is known.
4. I love it when people give talks on work that isn’t totally finished. This type of talk is scary for two reasons: (1) you can be scooped and (2) you might not have all the answers. But I find that unfinished work leads to way more discussion/ideas than a talk about work that has been published and is “complete”.

Delivery

1. I like it when a talk runs short. I have never been disappointed when a talk ended 10-15 min early. On the other hand, when a talk is long, I almost always lose focus and don’t follow the last part. I’d love it if we moved to 30 minute seminars with more questions.
2. I like it when speakers have prepared their slides so that the talk has a clear flow and doesn’t get bogged down in transitions. For this reason, I don’t mind it when people give the same talk in a bunch of places. I usually find that the talk is very polished.

13
Jan

## In the era of data what is a fact?

> I’m looking for reader input on whether and when New York Times news reporters should challenge “facts” that are asserted by newsmakers they write about.

He goes on to give a couple of examples of qualitative facts that reporters have used in stories without questioning the veracity of the claims. As many people pointed out in the comments, this is completely absurd. Of course reporters should check facts and report when the facts in their stories, or stated by candidates, are not correct. That is the purpose of news reporting.

But I think the question is a little more subtle when it comes to quantitative facts and statistics. Depending on what subsets of data you look at, what summary statistics you pick, and the way you present information - you can say a lot of different things with the same data. As long as you report what you calculated, you are technically reporting a fact - but it may be deceptive. The classic example is calculating median vs. mean home prices. If Bill Gates is in your neighborhood, no matter what the other houses cost, the mean price is going to be pretty high!
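
To make the mean vs. median point concrete, here is a small Python sketch. The prices are invented for illustration (nine ordinary homes plus one very expensive one) and do not come from any real neighborhood.

```python
from statistics import mean, median

# Hypothetical sale prices in USD: nine ordinary homes and one mansion.
prices = [
    250_000, 280_000, 300_000, 310_000, 320_000,
    350_000, 360_000, 400_000, 450_000, 120_000_000,
]

mean_price = mean(prices)
median_price = median(prices)

print(f"mean:   ${mean_price:,.0f}")    # mean:   $12,302,000
print(f"median: ${median_price:,.0f}")  # median: $335,000
```

Both numbers are "facts" about the same ten houses, but only the median says anything useful about what a typical home on the street costs.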

Two concrete things can be done to deal with the malleability of facts in the data age.

First, we need to require that our reporters, policy makers, politicians, and decision makers report the context of numbers they state. It is tempting to use statistics as blunt instruments, punctuating claims. Instead, we should demand that people using statistics to make a point embed them in the broader context. For example, in the case of housing prices, if a politician reports the mean home price in a neighborhood, they should be required to state that potential outliers may be driving that number up. How do we make this demand? By not believing any isolated statistics - statistics will only be believed when the source is quoted and the statistic is described.

But this isn’t enough, since the context and statistics will be meaningless without raising overall statisteracy (statistical literacy, not to be confused with numeracy). In the U.S., literacy campaigns have been promoted by library systems. Statisteracy is becoming just as critical; the same level of social pressure and assistance should be applied to individuals who don’t know basic statistics as to those who don’t have basic reading skills. Statistical organizations, academic departments, and companies interested in analytics/data science/statistics all have a vested interest in raising the population’s statisteracy. Maybe a website dedicated to understanding the consequences of basic statistical concepts, rather than the concepts themselves?

And don’t forget to keep rating health news stories!