Tetlock’s Superforecasting and political science
What does an accurate prediction technology teach us?
I thought that I had imbibed most of the Tetlockian worldview when I read his Expert Political Judgment, but his latest, entitled Superforecasting, goes much further. While the earlier book mainly showed how experts were not very good predictors (though some were better than others), in the new one he provides a concrete template for becoming a better predictor.
And as good as his advice seemed to me – I will detail it in a moment – I was still left with a question. Reflecting my training in political science, I wanted to know how the tools of superforecasting that he lays out help us to better understand politics or even produce better politics. I’m not sure that they do – less sure on the first point than the second – though this does not detract from his achievement. What the book ultimately tells us, I think, is more how to be a better thinker than how the political world works. And maybe that was his aim in the end.
How to be a superforecaster
I’ll start with a summary of his results. The template for forecasting political events is something like the following (with the caveat that he is concerned with forecasts over fairly short time horizons and doesn’t include black swans - i.e., highly improbable but earth-shaking events):
Unpack the question into components. Distinguish as sharply as you can between the known and unknown and leave no assumption unscrutinized. Adopt the outside view and put the problem into a comparative perspective that downplays its uniqueness and treats it as a special case of a wider class of phenomena. Then adopt the inside view that plays up the uniqueness of the problem. Also explore the similarities and differences between your views and those of others - and pay special attention to prediction markets and other methods of extracting wisdom from crowds. Synthesize all these different views into a single vision as acute as that of a dragonfly. Finally, express your judgment as precisely as you can, using a finely grained scale of probability.
There are some other tricks as well. One should be a supernewsjunkie and constantly integrate new information and update one’s forecasts, something he calls “perpetual beta”. Being part of a well-run team where ideas are discussed openly helps as well. All together this gives one the best chance at a good forecast.
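The outside-view/inside-view sequence above can be made concrete with a toy calculation. The sketch below is my own illustration, not Tetlock’s: it starts from a made-up base rate for the wider class of events and nudges it with Bayes-style updates as case-specific news arrives, ending with the finely grained probability he recommends.

```python
def update(prior, likelihood_ratio):
    """Bayes-style update of a probability given one piece of evidence.

    likelihood_ratio = P(evidence | event) / P(evidence | no event).
    """
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Outside view: base rate for the wider class of events (made-up number).
forecast = 0.15

# Inside view: each piece of case-specific news nudges the estimate.
# The likelihood ratios are illustrative, not drawn from the book.
for lr in [2.0, 0.8, 1.5]:
    forecast = update(forecast, lr)

print(round(forecast, 3))  # a precise probability, not a vague "maybe"
```

The point of the exercise is only to show the shape of the process: anchor on the comparison class, then adjust - by feel, in the case of most superforecasters - as new information arrives.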
These results are all based on data from a series of forecasting tournaments that Tetlock has organized over the years. Tetlock doesn’t just analyze the quantitative results, slicing and dicing them in a variety of ways, but he conducts interviews with the more successful forecasters to determine how exactly they do it. The book is breezily written, but underneath is a mass of experimentation and data-gathering.
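The scoring behind those tournaments is worth making explicit. Tetlock grades forecasters with Brier scores; the sketch below uses the simple one-sided variant (his tournaments use the original two-sided formula, which for binary questions doubles these numbers), with made-up forecasts for illustration.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes.

    0 is perfect; 0.25 is what a constant 50% forecast earns; higher is worse.
    """
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# A confident, accurate forecaster beats a hedger on the same questions.
confident = brier_score([0.9, 0.1, 0.8], [1, 0, 1])
hedger = brier_score([0.5, 0.5, 0.5], [1, 0, 1])
print(round(confident, 3), round(hedger, 3))  # 0.02 0.25
```

A nice property of this rule is that it rewards both calibration and decisiveness, which is why vague hedging cannot win a tournament scored this way.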
If we accept all this and agree that Tetlock has shown us the path to better forecasting (and who am I to challenge it), what follows? There is an assumption throughout the book that better forecasting is a good thing. I’ll grant that it is surely a good thing for any individual. If you can see the future more accurately than others, you can make better investments and succeed in a variety of areas. But how does better forecasting help us as a society? I’ll start with our understanding of politics since the book mainly focuses on predicting political events. Then I’ll think more broadly about public policy.
Superforecasting and theory building
Let’s first consider how superforecasting adds to our knowledge of political science. Many of us would agree that prediction is the ultimate test of a theory. We should prefer theories that make better out-of-sample predictions. But the converse may not be true. The ability to predict well does not mean that one has a good theory. And a close look at the characteristics of a Tetlockian superforecaster indicates that they diverge from good theorizing in a number of respects.
Indeed, in Tetlock’s first book on the subject, he found that scholars who relied on general theories of politics tended to predict less well than those who were more open-minded and knew many little things. In the terms he helped to popularize, the foxes outperformed the hedgehogs. In Superforecasting, the tools used to make good predictions are even more different from those used to make theory. The basic idea is to start with base rates of a phenomenon and then continually collect new information about the case, making adjustments to one’s forecast all the while.
The exercise looks much like what is called data-mining, which political science students are warned away from. Yes, we might come up with better fitting explanations by continually adding new considerations to our model (as we would with a machine learning algorithm), but we would never learn why we got these better forecasts. It wouldn’t lead us to a theory - a cause and effect relationship - which is what political scientists consider knowledge.
Political scientists following Tetlock’s model would end up looking more like seers, producing correct answers without a theory that explains why. Sure, they could reproduce the process that led them to a correct prediction - the base rates and the factors that caused them to adjust it - though a lot of this seems to be done by feel (few of his superforecasters use explicit formulas). But they would rarely have a causal account that explains how factors X, Y, and Z are related to the outcome.
This seems to revisit some of the debate about whether political science failed by not predicting events like the fall of communism or the Arab spring. Here’s how I responded to that criticism in another piece:
...although prediction is an important goal of social science and theories should ultimately be judged by their predictive accuracy, social science should not be held to the standard of making predictions of the form: event X will happen at time Y. Social scientists are not soothsayers. Popper pointed out that even physics makes very few predictions of this type (the exceptions are events like solar eclipses). Instead, social science theories will typically make conditional predictions of the form: if X occurs, then Y is likely to follow.
The second reason [that we failed to predict the fall of communism] is that we have good explanations for why the regime collapse came so suddenly and unpredictably. These are the “tipping point” and “bank run” explanations described in the previous section. Kuran (1991) thus argues that the timing of such revolutions is almost inherently unpredictable.
In short, I am not sure that superforecasting in Tetlock’s portrayal would have much influence on the sort of theory-building that we do in political science, though as I note below, I think it helps us in other ways, for example, by showing us how to be better thinkers.
And come to think of it, maybe Tetlock’s contribution is to show us the errors in thinking that political actors (along with the rest of us) might make - ignoring base rates, discounting alternative perspectives, and not updating, among others. In the process of trying to improve our thinking, Tetlock might be giving us a more realistic view about the ways that decision makers go wrong.
Digression: superforecasting and qualitative research
As an aside, Tetlock’s method seems to me a close cousin to the qualitative model of social science laid out in Goertz and Mahoney’s A Tale of Two Cultures. They argue that there are two ways of thinking about causality in social science, the two cultures of their title.
As opposed to the quantitative model, which is concerned with general relationships, qualitative theorists are often interested in correctly explaining unique, individual cases, say, the causes of World War I. Similarly, Tetlock is interested in correctly predicting individual cases. While one is retrospective and the other prospective, the qualitative model and the forecasting model share a focus on unique cases and getting them right. A difference, however, is that the qualitative model is interested in causality, while the forecasting model is content with a point prediction.
By contrast, the quantitative model of social science focuses on generalizations and understanding relationships on average. The quantitative model may miss individual cases, but on average it gets things right. With a bit of squinting, one could see quantitative reasoning as the process of deriving base rates and qualitative models as making adjustments to get the details right.
Another model of forecasting
Tetlock’s is not the only model of forecasting. In fact, another model, the one pioneered by Bruce Bueno de Mesquita (BDM), might be more likely to have (and in fact already has had) an influence on political science. His method is based on a systematic evaluation of key actors’ preferences, influence, and interest in the issue and their subsequent aggregation in a game theoretic model. This technique has given rise to selectorate theory, which has had substantial influence in political science. And like Tetlock’s superforecasters, BDM seems to have achieved considerable predictive accuracy with his method.
Interestingly, Tetlock is critical of BDM’s work and doesn’t cite it in Superforecasting (neither does BDM cite Tetlock in his book, The Predictioneer’s Game), though each has written on the other’s work. I think there are two areas where Tetlock is skeptical of BDM’s approach, though he puts them in slightly different terms than mine below.
One is the inputs into BDM’s models. These inputs are key to making the model work. They include a quantitative ranking of potential outcomes, an identification of the key actors along with their private preferences, their influence on politics, and the salience they assign to the issue. This is not trivial information, though BDM says that it is often freely available.
As BDM admits, getting these estimates wrong would spoil a forecast. I’m willing to guess that BDM is very good at setting these parameters, but I wonder how replicable his skill is. What sort of consistency would we have if others tried to assign the values? The process here seems similar to the Tetlockian recommendation of being a supernewsjunkie, and I think this is why Tetlock believes that the experts “are doing all the heavy lifting” in BDM’s model. I’m more sympathetic to the contribution of the game theoretic model - which isn’t very visible in BDM’s work and is sometimes referred to in his book as just “the computer” - but it would be worth either determining inter-coder reliability for the inputs to the model or finding some way to automate the process, say, using text-as-data methods for assigning preferences, influence, and salience.
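Inter-coder reliability of this sort has a standard workhorse statistic, Cohen’s kappa, which measures how often two coders agree beyond what chance alone would produce. The sketch below is a generic illustration with hypothetical codings of actors’ salience, not anything from BDM’s actual pipeline.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Agreement between two coders beyond chance: 1 = perfect, 0 = chance level."""
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    labels = set(coder_a) | set(coder_b)
    # Chance agreement: probability both coders pick the same label independently.
    expected = sum((counts_a[l] / n) * (counts_b[l] / n) for l in labels)
    return (observed - expected) / (1 - expected)

# Two hypothetical coders rating the salience an actor assigns to an issue.
a = ["high", "high", "med", "low", "med", "high"]
b = ["high", "med", "med", "low", "low", "high"]
print(round(cohens_kappa(a, b), 2))  # 0.5
```

A kappa this far below 1 would be a warning sign that the model’s outputs depend heavily on who fills in the inputs.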
From my perspective, there is substantial overlap between Tetlock’s and BDM’s approaches. Both depend on deep knowledge to get the details of the players and the interaction right. Both encourage practitioners to look at the problem from all sides, not just one’s own. Both advise users to break problems down into smaller pieces. And both are aware that unpredictable factors can upset forecasts. It is somewhat surprising that Tetlock’s superforecasters don’t plug and chug after gathering information like BDM, but perhaps they are doing something like that under the hood with the help of intuition and experience.
Tetlock’s other concern is about BDM’s success. He is skeptical of BDM’s claims to a very high, perhaps even a 90%, success rate, since many of the predictions are classified or proprietary. BDM rightfully points out that he has put many of his predictions in print. I tend to believe that BDM is a very good predictor, but if we wanted to push harder on Tetlock’s point, it is possible that BDM is choosing cases or issues where his method works and avoiding ones where it does less well.
This is why Tetlock would ask BDM or a practitioner of his method to participate in a forecasting tournament where the cases are chosen by a third party. I would imagine that BDM could perform well in such a tournament focused on cases where there are strategic interactions that fit his model. But some of the prediction questions that Tetlock is interested in are less strategic and involve things like determining who will win an election. BDM’s methods would be less useful for those situations.
Prediction as a way to improve public policy
A second potential gain from superforecasting concerns public policy. Tetlock (and BDM) is optimistic that better prediction would improve politics itself. His thinking seems to be that these tools could be used by US policymakers and would make their interventions more successful.
But this presumes a number of things. First of all, it presumes that US policymakers have good intentions and would use superforecasting to benefit humanity, not just the US (at the expense of others) or their own self-interest. I’ll let readers decide how they think US or other policymakers would use this tool. Since the techniques are mostly public, you could substitute another country for the US in this paragraph.
BDM explicitly defends himself on this score. He argues mainly that more knowledge is a good thing in its own right. And shouldn’t we want the government to work with the best information? However, after describing offers he received to help overthrow Anwar Sadat and keep Mobutu Sese Seko in power, he puts the onus on the individual researcher. “It is the responsibility of each of us as individuals to withhold our expertise when we think its use will make the world a worse place.” This seems like somewhat weak beer for a potentially powerful tool and one which is mostly in the public domain.
The effectiveness of prediction also sometimes presumes that there are not rivals using these same tools. Given his work with DARPA, Tetlock seems to be thinking about defense and foreign policy scenarios where the US would have access to superforecasters but the other side would not.
Things become more complicated if we are competing with others who are using the same tools. That would make it hard to use the bluffs that BDM sometimes recommends. It might, however, help us avoid negative-sum wars - which is to say, most wars. If both sides see the inevitable end result, they might move directly there instead of fighting the war. Work on rationalist theories of war does suggest that lack of information or misinformation is a key cause of war (see Fearon’s classic piece).
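Fearon’s point can be seen in a back-of-the-envelope bargaining calculation. With illustrative numbers of my own choosing: two states divide a prize worth 1, state A wins a war with probability p, and fighting costs each side something. Because war is negative-sum, any division inside the resulting range beats fighting for both sides - and shared, accurate forecasts of p are exactly what would let the parties find it.

```python
# Fearon-style bargaining range (illustrative numbers, not from the text).
p, c_a, c_b = 0.6, 0.1, 0.1   # A's win probability; each side's war costs

# A's expected payoff from war is p - c_a; B's is (1 - p) - c_b.
low = p - c_a    # smallest share of the prize A would accept over fighting
high = p + c_b   # largest share B would concede rather than fight

assert low < high  # the costs of war open a peaceful bargaining range
print(round(low, 2), round(high, 2))  # 0.5 0.7
```

If the two sides hold different estimates of p (or bluff about it), the perceived range can vanish, which is the informational path to war that better forecasting might close off.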
I would need to be a better game theorist to say with more confidence what sort of effect a functional predictive technology would have on international affairs, whether in the hands of just one power or available to everyone. I could imagine negative effects in addition to positive ones. Perhaps accurate predictions of positive outcomes to potential wars that weren’t otherwise considered might encourage aggression in cases where the uncertainty would have deterred policymakers. Probably someone has worked out these issues and I am just not aware of it.
Tetlock’s methods likely work better in contexts without a strategic interaction where the aim is simply to make programs work better and more efficiently. One could imagine things like predicting the spread of Covid or the consequences of particular measures like lockdowns or mask wearing. Such information would certainly be useful for improving policy responses, and the benefits probably lie here more than in international relations. Maybe I am just being dense in downplaying the possibilities. I would have welcomed a section in the book making this case more systematically.
***
As I said at the start, I’m extremely impressed by Tetlock’s research program, not to mention his findings. This is real progress. What I’m not sure of is whether the superforecasting tool kit teaches us something about how politics works and whether it necessarily leads to better politics. Instead, I view it as progress in the study of the psychology of thinking, close in spirit to work by Tversky and Kahneman or behavioral economics, with which Tetlock also feels a kinship. Reading it has made me a better thinker and the techniques he advocates are ones that I will share with my students. And that is more than enough in my book.
Andrew, the great frustration of Expert Political Judgment was -- for me -- that the book was so thin on details about Tetlock's methods and procedures. Tetlock obviously did a great deal of work, but whenever you want to know precisely how he did something, he directs you to the "Technological Appendix" or the "Methodological Appendix." Those appendices are light on detail, and I found them hard to follow. As a result, I've never been able to repose much confidence in the book, even though I'd like to do so. (A related problem is that none of the raw materials are available for inspection. Perhaps it's understandable that the data aren't available -- but even the statistical code that Tetlock used to analyze the data isn't available. That's a shame, as the code would help to make clear how he analyzed the data.)
Is Superforecasting better in this regard?