Best known for accurate election predictions, statistician Nate Silver is also the author of The Signal and the Noise: Why So Many Predictions Fail—But Some Don’t. Heather Bell, Managing Editor of Journal of Indexes, recently spoke with Silver.
IU: What do you see as the common theme among bad predictions? What most often leads people astray?
Silver: A lot of it is overconfidence. People tend to underestimate what the uncertainty that is intrinsic to a problem actually is. If you have someone estimate what they think a confidence interval is that’s supposed to cover 90 percent of all outcomes, it usually only covers 50 percent. You have upside outcomes and downside outcomes in the market certainly more often than people realize.
There are a variety of reasons for this. Part of it is that we can sometimes get stuck in the recent past and examples that are most familiar to us, kind of what Daniel Kahneman called “the availability heuristic,” where we assume that the current trend will always perpetuate itself, when actually it can be an anomaly or a fluke, or where we always think that the period we’re living through is the “signal,” so to speak. That’s often not true—sometimes you’re living in the outlier period, like when you have a housing bubble period that you haven’t historically had before.
Overconfidence is the core linkage between most of the failures of predictions that we’ve looked at. Obviously, you can look at that in a more technical sense and see where sometimes people are fitting models where they don’t have as much data as they think, but the root of it comes down to a failure to understand that it’s tough to be objective and that we often come at a problem with different biases and perverse incentives—and if we don’t check those, we tend to get ourselves into trouble.
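Editor's illustration: the calibration gap Silver describes, where an interval meant to cover 90 percent of outcomes covers only about 50 percent, can be reproduced with a small simulation. This is a minimal sketch and not anything from Silver's own work. It assumes normally distributed outcomes, and the function names and the 0.41 figure are hypothetical choices made only so that the overconfident forecaster lands near 50 percent.

```python
import random
import statistics


def coverage(intervals, outcomes):
    """Fraction of outcomes that fall inside their stated interval."""
    hits = sum(lo <= y <= hi for (lo, hi), y in zip(intervals, outcomes))
    return hits / len(outcomes)


def simulate(true_sd, assumed_sd, n=10_000, seed=0):
    """Coverage of a nominal 90% interval built from an assumed spread.

    Assumption: outcomes are drawn from a normal distribution with mean 0
    and standard deviation true_sd, while the forecaster believes the
    standard deviation is assumed_sd.
    """
    rng = random.Random(seed)
    # Two-sided 90% interval: 5% in each tail, so use the 95th percentile.
    z90 = statistics.NormalDist().inv_cdf(0.95)
    half_width = z90 * assumed_sd
    outcomes = [rng.gauss(0.0, true_sd) for _ in range(n)]
    intervals = [(-half_width, half_width)] * n
    return coverage(intervals, outcomes)


# A forecaster who knows the true spread is close to the nominal 90%.
well_calibrated = simulate(true_sd=1.0, assumed_sd=1.0)

# A forecaster who believes the spread is about 0.41 of its true size
# states a "90%" interval that captures only about half of outcomes.
overconfident = simulate(true_sd=1.0, assumed_sd=0.41)

print(f"well calibrated: {well_calibrated:.2f}")
print(f"overconfident:   {overconfident:.2f}")
```

Under these assumptions, underestimating the spread by a factor of roughly 2.4 is enough to turn a nominal 90 percent interval into one that works only about half the time.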
IU: What standards or conditions must be met, in your opinion, for something to be considered “predictable”?
Silver: I tend not to think in terms of black-and-white absolutes. There are two ways to define “predictable,” I’d say. One is by asking, How well are we able to model the system? The other is more of a cosmic predictability: How intrinsically random is something over the long run?
I look at baseball as an example. Even the best teams only win about two-thirds of their games. Even the best hitters only get on base about 40 percent of the time. In that sense, baseball is highly unpredictable. In another sense though, baseball is very easy to measure relative to a lot of other things. It’s easy to set up models for it, and the statistics are of very high quality. A lot of smart people have worked on the problem. As a result, we are able to measure and quantify the uncertainty pretty accurately. We still can’t predict who’s going to win every game, but we are doing a pretty good job with that. Things are predictable in theory, but our capabilities are not nearly as strong.
Predictability is a tricky question, but I always say we almost always have some notion of what’s going to happen next, but it’s just never a perfect notion. The question is more, Where do you sit along that spectrum?