Two political economists at the University of Colorado recently published an analysis of the 2012 Presidential Election that predicted Romney would win. The model draws on a range of economic variables, and its authors claim it would have “correctly predicted all elections since 1980.”
The NerdWallet Election Model, by contrast, was created by analysts who studied econometric analysis at Princeton and Stanford and predicts that Obama will easily win the election.
Both models claim to be unbiased presentations of statistical fact, free of personal opinion, yet they come to opposite conclusions. How can this be? Put simply, the University of Colorado’s model is deeply flawed. Here’s why:
5 Reasons the University of Colorado Election Model is Dead Wrong
1. The CU model cheats
If I told you what time I showed up for work every day last week would you be able to predict what time I showed up for work every day last week? You would? Great! You’re as good at statistical prediction modeling as the University of Colorado.
Claiming to have created a model that would have “correctly predicted all elections since 1980” is complete nonsense when the model was created using data from all elections since 1980, the very data it claims to predict. In statistics we call this data mining: fitting a model to a data set and then grading it on that same data set. The proper procedure would have been to use a subset of the data to create the model (perhaps the 1980 and 1984 elections) and then run an out-of-sample backtest on data that had not been used to create the model (the 1988 through 2008 elections). Only then do you get to claim your model is any good at making predictions.
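To make the difference concrete, here is a minimal sketch of an out-of-sample backtest in Python. Everything in it is an assumption made for illustration: the eight observations are invented (they are not real election data), and the one-variable cutoff rule is a stand-in for a model, not the CU model or ours.

```python
# Hypothetical observations in time order: (economic index, incumbent won?).
# The numbers are invented for illustration; they are not real election data.
observations = [
    (1.0, False), (3.0, True), (2.0, False), (2.5, True),
    (1.5, True), (3.5, True), (0.5, False), (2.2, False),
]

def fit_cutoff(rows):
    """Return the cutoff that classifies rows best (predict a win when index >= cutoff)."""
    best_cut, best_hits = None, -1
    for cut in sorted(x for x, _ in rows):
        hits = sum((x >= cut) == won for x, won in rows)
        if hits > best_hits:
            best_cut, best_hits = cut, hits
    return best_cut

def accuracy(cut, rows):
    """Share of rows the cutoff rule classifies correctly."""
    return sum((x >= cut) == won for x, won in rows) / len(rows)

# Build the rule on the earliest observations only, then grade it on the rest.
train, holdout = observations[:3], observations[3:]
cutoff = fit_cutoff(train)
in_sample = accuracy(cutoff, train)        # graded on data the rule has already seen
out_of_sample = accuracy(cutoff, holdout)  # graded on data the rule has never seen

print(f"in-sample accuracy:     {in_sample:.2f}")
print(f"out-of-sample accuracy: {out_of_sample:.2f}")
```

On this toy data the rule is perfect on the rows it was fitted to and noticeably worse on the rows it was not. Only the second number says anything about how well the rule predicts.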
2. The CU model depends on flawed assumptions
Unemployment affects the chances of Democratic incumbents being reelected but not Republicans? Republicans’ results are linked to per capita income while Democrats’ are not? Really? In statistics you need to start with assumptions that make sense and then test them, not start with data and then allow for as many crazy assumptions as you need to create a model that “correctly predicted” all of your data.
3. The CU model includes variables it should not…
Back to my showing up to work example. If I gave you additional data on what color shirt I wore each day and what I ate for breakfast, would you be better able to predict what time I showed up for work? Probably not, but your model doesn’t know that. What is likely to happen is that your model will start telling you crazy things (I show up five minutes earlier on days I eat eggs for breakfast, but twenty minutes later if I wear a blue shirt) in order to fit all your extra variables to the data.
The University of Colorado model does this, but to an even more extreme degree. It uses at least SIX explanatory variables even though it has only eight election data points. That is far too many from a statistical perspective and is likely to produce “spurious correlations” (e.g. blue shirts make you late for work, or unemployment only affects Democrats).
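You can watch this happen with pure noise. The sketch below is an illustration under stated assumptions, not a reconstruction of the CU model: the outcome and all six "explanatory" variables are random numbers with no relationship to each other, the seed is arbitrary, and `solve` and `r_squared` are small helper functions written for this example. It fits an ordinary least squares regression to eight observations, adding one noise variable at a time, and reports the in-sample R^2.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def r_squared(features, y):
    """In-sample R^2 of a least squares fit with an intercept (via the normal equations)."""
    rows = [[1.0] + list(f) for f in features]  # prepend the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

rng = random.Random(2012)  # arbitrary seed so the run is repeatable
n_obs = 8                  # same count as the eight elections discussed above
y = [rng.gauss(0, 1) for _ in range(n_obs)]                         # random "outcome"
noise = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(n_obs)]  # six noise variables

# Fit with 0, 1, ..., 6 noise variables and record the in-sample R^2 each time.
r2_by_k = [r_squared([row[:k] for row in noise], y) for k in range(7)]
for k, r2 in enumerate(r2_by_k):
    print(f"{k} noise variables: in-sample R^2 = {r2:.3f}")
```

In-sample R^2 can never go down when you add a variable to a least squares fit, even a meaningless one, so the fit looks better and better as the model learns less and less. With six variables plus an intercept there are seven fitted coefficients and only eight observations to pin them down.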
4. …and ignores important variables (like the candidate!)
A model based only on economic indicators can be comprehensive only if economic indicators are the only factors that drive election results. They’re not. There is evidence that candidate likability, campaign spending, and even the weather all play roles in election results. Take an extreme example: What if Republicans decided to run a horse instead of a person as their candidate? The University of Colorado model would say that the horse would be elected due to economic conditions, since the model does not consider the candidate at all. Any good election prediction model must take into account human voters’ willingness to actually vote for a particular candidate.
5. The CU model results are virtually impossible
Even if we ignore all the methodology flaws of the CU model, its results are hard to accept given that they are statistically nearly impossible. For example, the CU model claims Romney will win Pennsylvania. Yet when likely voters in Pennsylvania were polled, they sided with Obama 51-42, 49-40, 47-42, 53-42, 49-43, and so on. In fact, not one single poll of likely voters in Pennsylvania all year predicted a Romney win. Obama is leading by nearly 10 points, a virtually impossible lead to overcome given that more than 95% of historical polls have been accurate within 7 points. Even if we grant the CU model Pennsylvania, there’s still Ohio…Virginia…Wisconsin…
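For the curious, here is the arithmetic on the five Pennsylvania polls quoted above, as a short Python sketch. It uses only those five results (the polls behind “and so on” are not included, so its average need not match the overall lead cited above), and it assumes the 7-point accuracy figure can be compared directly with the Obama-minus-Romney margin.

```python
# The five Pennsylvania likely-voter polls quoted above, as (Obama %, Romney %).
polls = [(51, 42), (49, 40), (47, 42), (53, 42), (49, 43)]

# Obama's lead in each poll, in percentage points.
margins = [obama - romney for obama, romney in polls]
average_margin = sum(margins) / len(margins)

# Assumption: the "accurate within 7 points" figure is applied to the margin.
error_band = 7
obama_leads_every_poll = all(m > 0 for m in margins)
average_exceeds_band = average_margin > error_band

print("margins:", margins)
print("average margin:", average_margin)
print("Obama leads in every quoted poll:", obama_leads_every_poll)
print("average lead exceeds the error band:", average_exceeds_band)
```

Even on this small sample, Obama leads in every poll and the average lead is larger than the 7-point band.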
In short, the CU model is complete nonsense. This really isn’t surprising given how poor political economists are at making forecasts, according to the New York Times. But don’t take our word for it. Below is a summary of what other models and markets think of Romney’s chances:
| Models & Markets | Based on | Romney | Obama | Winner? |
| New York Times | Polls | 28.7% | 71.3% | Obama |
| Washington Post | Economic Indicators | 41.6% | 58.4% | Obama |
| American University | Economic Indicators | not stated | heavily favored | Obama |
Estimates as of August 28, 2012
Joanna Pratt is VP of Financial Markets at NerdWallet Investing, a financial literacy website that helps investors select better mutual funds for their 401(k) plans, find a better online brokerage offering options accounts, and make smarter investment decisions overall.