Posted on November 6, 2012
As I’ve been doing since 2000 – before the RCP average was a glimmer in some political geek’s left brain – I present the Federal Review Composite Poll. In 2000, 2004 and 2008 I kept a running update during the year, but this year I just didn’t have the time to keep creating an electoral map, and my attempt at creating one in Flash or Java never got off the ground.
Before presenting this year’s final numbers, let me briefly explain the method. This is a weighted average. The weighting isn’t ad hoc or subjective – there’s no secret sauce. I merely weight each poll based on its sample size and the mid-date of its sample time frame. Early in the campaign, the most recent week’s polls are averaged and weighted 1.5 times the average of the prior week’s polls, which in turn is weighted 1.5x the week before that. If a pollster polls the same state 2 weeks in a row, that pollster is represented twice, but at different weights. I don’t throw out the older poll because it still conveys relevant information. Either the difference between the two polls is a mere swing within the margin of error, in which case keeping the older poll helps to balance out any error in the newer one, or there is real movement in the race, in which case the composite doesn’t capture the movement fully until it is confirmed by other polling.
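A rough Python sketch of the weekly weighting described above, for illustration only. The function name, the tuple layout, and the choice of a sample-size-weighted mean inside each week are assumptions; the text says only that polls are weighted by sample size and mid-date, not exactly how the two are combined. The poll numbers in the usage lines are made up.

```python
from collections import defaultdict
from datetime import date


def composite_margin(polls, as_of, period_days=7, factor=1.5):
    """Time- and sample-weighted average of poll margins (illustrative only).

    polls: iterable of (mid_date, sample_size, margin) tuples, where margin is
    one candidate's lead in points (negative if trailing).
    """
    # Group polls into periods counted back from `as_of` (0 = most recent).
    buckets = defaultdict(list)
    for mid_date, sample_size, margin in polls:
        age_days = (as_of - mid_date).days
        if age_days < 0:
            continue  # ignore polls dated after the reference date
        buckets[age_days // period_days].append((sample_size, margin))

    weighted_sum = 0.0
    weight_total = 0.0
    for periods_back, entries in buckets.items():
        # Assumption: within a period, polls are averaged by sample size.
        total_n = sum(n for n, _ in entries)
        period_avg = sum(n * m for n, m in entries) / total_n
        # Each period counts `factor` times as much as the one before it.
        weight = factor ** (-periods_back)
        weighted_sum += weight * period_avg
        weight_total += weight
    return weighted_sum / weight_total


# Made-up polls: two from the latest week, one from the week before.
example_polls = [
    (date(2012, 11, 4), 1000, 2.0),
    (date(2012, 11, 3), 500, 5.0),
    (date(2012, 10, 28), 800, -1.0),
]
example_result = composite_margin(example_polls, as_of=date(2012, 11, 5))
```

With these made-up inputs the latest week averages 3.0 and the prior week -1.0, and the latest week counts 1.5 times as much. The `period_days` and `factor` parameters are there so the same sketch can be run with a different window length and multiplier.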
Now, at the end of the campaign, the time-based weighting is more compressed. The polls with a sample time frame midpoint within a 4-day period (instead of 7 days) are averaged and weighted 2.5x the weighted average of the prior 4-day period. Why 2.5x and 4 days? Because that combination most accurately approximates the final result in both 2004 and 2008.
And my track record is pretty good. In 2004, the composite predicted a Bush win (while others were predicting Kerry), missing only Wisconsin, which I called for Bush with a composite result of less than +2. Kerry won by 0.4. In 2008, I missed Indiana (didn’t everyone?) and North Carolina in my published results. I would have called NC for Obama, but I didn’t believe my numbers, thinking that a recent poll was overweighted – so I adjusted my weighting. I’ve learned my lesson. In 2004, my average in swing states underestimated Bush’s margin by 0.7, with the largest misses being underestimates of Bush by almost 5 points in Florida and 2 points in Michigan, Minnesota and Ohio. So the composite poll, in reliance on pollsters and their models, has a slight pro-Democrat bias. In 2008, the pro-Dem bias was smaller in swing states, only 0.4, because, I think, lower turnout in a blow-out election meant that the actual results were closer to the polling data.
I don’t use any special sauce, unlike Nate Silver at the New York Times and the Huffington Post. They both do something – that they don’t fully explain – with historical data, such as a partisan voter index (noting that a state like NC is about 8 points more Republican than the national popular vote, for instance), or they weight some pollsters lower because they’ve determined some “house” effects – that Rasmussen, for instance, tends to have a GOP bias (I’ve done my own analysis and find that Rasmussen is actually more in line with final results than the biggest offenders like PPP or Marist). Nevertheless, I don’t adjust weighting just because I don’t like the pollster.
I also conduct a simulation of the electoral vote – some 16,000 trials based on a probability for each state generated from the composite number (as the mean) and the standard deviation of the polling. Because my biggest error was a 5-point miss in Florida in 2004, any state where the composite margin is 6 or greater becomes 100% for the leading candidate. I don’t miss those. And if the polls show a lead of 6, there’s no point in generating a probability when we all know who’s gonna win. As a result, Romney has a 0% chance of winning Minnesota, and Obama has a 0% chance of winning Indiana.
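To make the simulation concrete, here is a minimal Python sketch under stated assumptions. It turns each state's composite margin and standard deviation into a win probability with the normal distribution, locks any state with a margin of 6 or more at 100%, and then runs repeated trials. Treating the states as independent draws is an assumption of the sketch, not something stated above, and the function names and input layout are hypothetical. The standard deviation must be greater than zero.

```python
import math
import random


def win_probability(mean_margin, std_dev, safe_margin=6.0):
    """Chance that the true margin is above zero, assuming a normal distribution."""
    # Margins of `safe_margin` or more are treated as certain for the leader.
    if abs(mean_margin) >= safe_margin:
        return 1.0 if mean_margin > 0 else 0.0
    # P(X > 0) for X ~ Normal(mean_margin, std_dev), via the error function.
    return 0.5 * (1.0 + math.erf(mean_margin / (std_dev * math.sqrt(2.0))))


def simulate(states, trials=16000, seed=0):
    """Return (win share, tie share) for the candidate with positive margins.

    states: list of (electoral_votes, mean_margin, std_dev) tuples.
    Assumption: each state is drawn independently in every trial.
    """
    rng = random.Random(seed)
    total_ev = sum(ev for ev, _, _ in states)
    probs = [(ev, win_probability(margin, sd)) for ev, margin, sd in states]
    wins = 0
    ties = 0
    for _ in range(trials):
        ev_won = sum(ev for ev, p in probs if rng.random() < p)
        if 2 * ev_won > total_ev:
            wins += 1
        elif 2 * ev_won == total_ev:
            ties += 1
    return wins / trials, ties / trials
```

Calling `simulate` with a list of per-state tuples returns the share of trials won and the share tied. Because of the 6-point rule, a state entered with a margin of 8 is won in every trial no matter what standard deviation is supplied.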
So, who’s going to win? My numbers don’t tell you who’s going to win. And the dirty little secret that Nate Silver and the Huffington Post won’t tell you – because they are intent on driving their own narrative – is that their numbers don’t tell you either. Why? Because probabilities based upon normal distributions can’t be said to be statistically correct until you have confidence in the predicted result (i.e., n > 2PM or n < 2PM, where 2PM is the two-party margin and n is the mean of the margins). It’s even more difficult when the data is a necessarily inaccurate measure of a non-static thing – in this case, changing opinion. If opinion changes over time, then the mean necessarily moves more slowly than the actual changing opinions (hence my time-based weighting). As a result, you have a distribution unevenly divided around that mean.
Despite these limitations, I still use the normal distribution because otherwise I’d be guessing where the race is headed by making “adjustments” (cf. Nate Silver, whose polling average shows a Romney lead in Florida but Obama leading after “adjustments”). My numbers are below – and I consider the race a toss-up for the reasons described, and because of my doubts about turnout models that I’ll explain in a later post.
The Composite Poll – the weighted average – shows an Obama win with 303-235 Electoral Votes and a popular vote that’s tied at 48. I’ve also provided numbers assuming that the undecided voters “break” for the challenger 2 to 1. Despite this theory of a pro-challenger break, I’ve yet to see it happen. With that “break”, Obama still wins 290-248.
Karl Rove and Dick Morris argue that an incumbent will get on election day what he got in the last poll. Well, let’s assume instead that Obama loses every state where his composite number is below 49. In that case, Romney wins 275-263. But with the probabilities projected by the model, the probability for an Obama win is about 80%, and for a tie is less than 1%. Statistically, who wins is a toss-up, but you can confidently say there will be no tie.