By J. F. Kolacinski and B. Payne
Elmira College
With 108 out of 110 presidential primaries and caucuses complete (the District of Columbia holds its Republican and Democratic primaries on June 14th), attention is turning to the November general election. Pollsters from across the country have already begun polling voters in more than half of the states, asking, “If the election were held today, for whom would you vote?”
Currently, Donald Trump is the presumptive nominee for the Republican Party, while it appears that Hillary Clinton has secured enough delegates to win the Democratic nomination. The outcome is less certain on the Democratic side: Secretary Clinton’s nomination depends on superdelegates, who are free to change their support up until the Democratic convention in July.
Senator Sanders’ only remaining path to the nomination is to convince enough Clinton superdelegates to switch their support. Although he may be backing away from this now, he has been making an electability argument: that he has a better chance of beating Mr. Trump in the general election. Most head-to-head match-ups do seem to support this; Senator Sanders consistently polls better against Mr. Trump than does Secretary Clinton. Two reasonable questions to ask, then, are the following. What are the probabilities that Secretary Clinton and Senator Sanders would each defeat Mr. Trump in the general election? And is Senator Sanders’ probability of defeating Mr. Trump in the fall significantly better than Secretary Clinton’s?
To investigate these questions, we simulated a number of national elections state-by-state and used the relative frequency of each candidate’s victories to estimate that candidate’s probability of victory. To accomplish this, poll and election data were entered into an application originally written by Kolacinski and Culpepper in 2011, the “U.S. Presidential Election Calculator” < www.maplesoft.com/...>. This application runs on Maple, a professional-grade mathematics program.
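The general idea of estimating a win probability from the relative frequency of simulated victories can be sketched in a few lines of Python. This is not the Maple application itself, whose internals are not described here. It is a minimal sketch that assumes each state's result is drawn from a normal distribution centered on the estimated two-party share, with an assumed spread `sigma`, and that every state awards its electoral votes winner-take-all. The function name, the parameters, and the four-state map are all hypothetical.

```python
import random

def simulate_elections(states, trials=1000, sigma=0.03, seed=0):
    """Simulate `trials` two-candidate elections state by state.

    states: list of (electoral_votes, share_a) pairs, where share_a is
            candidate A's estimated two-party vote share (0 to 1).
    sigma:  ASSUMED standard deviation of the per-state error; this value
            is illustrative and does not come from the article.
    """
    rng = random.Random(seed)
    total_ev = sum(ev for ev, _ in states)
    wins_a = wins_b = ties = 0
    ev_sum_a = 0
    for _ in range(trials):
        ev_a = 0
        for ev, share_a in states:
            # Draw one simulated result around the state's estimate;
            # the winner takes all of that state's electoral votes.
            if rng.gauss(share_a, sigma) > 0.5:
                ev_a += ev
        ev_b = total_ev - ev_a
        ev_sum_a += ev_a
        if ev_a > ev_b:
            wins_a += 1
        elif ev_b > ev_a:
            wins_b += 1
        else:
            ties += 1
    # Relative frequencies of each outcome serve as probability estimates.
    return {
        "p_a": wins_a / trials,
        "p_b": wins_b / trials,
        "p_tie": ties / trials,
        "mean_ev_a": ev_sum_a / trials,
        "mean_ev_b": total_ev - ev_sum_a / trials,
    }

# Hypothetical four-state map, for illustration only.
toy_states = [(20, 0.56), (16, 0.47), (10, 0.52), (6, 0.50)]
result = simulate_elections(toy_states, trials=1000)
print(result)
```

Fixing the seed makes a run repeatable; raising `trials` narrows the sampling noise in the estimated probabilities but does nothing about errors in the underlying polls.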
When the application was originally published, Kolacinski and Culpepper tested it against the predictions for the 2008 election at Nate Silver’s fivethirtyeight.com. The day before the election, Silver ran 10,000 simulations and predicted that Barack Obama had a 98.9% probability of winning, with an average of 348.6 electoral votes for Obama and 189.4 electoral votes for McCain. Using the same data, the Maple application estimated a 100% probability of Obama defeating John McCain in the general election, with a mean electoral vote count of 349.0 to 189.0. These results are reassuringly close considering that Silver’s model is more robust and used ten times as many trials.
In each of the following simulations, poll data from www.realclearpolitics.com (retrieved 7 June 2016) was used to estimate the percentage of each state’s electorate that would support each candidate (either in an election between Hillary Clinton and Donald Trump or between Bernie Sanders and Donald Trump). When more than one poll was conducted in a state, the average result across all of the polls was used. Additionally, all poll data was adjusted, using the following formula, to eliminate the possibility that a third-party candidate could win the race:
% for Candidate A/(% for Candidate A + % for Candidate B).
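As an illustration of the adjustment above, the following Python sketch rescales two candidates' percentages so that they sum to 100% of the two-candidate vote. The function name and the sample poll numbers are hypothetical and are not taken from the polls used in this article.

```python
def two_party_share(pct_a, pct_b):
    """Return candidate A's share of the vote cast for A or B only."""
    total = pct_a + pct_b
    if total <= 0:
        raise ValueError("the two percentages must sum to a positive number")
    return pct_a / total

# Hypothetical poll: 44% for A, 40% for B, 16% other or undecided.
share_a = two_party_share(44, 40)
share_b = two_party_share(40, 44)
print(f"A: {share_a:.1%}, B: {share_b:.1%}")
```

After the adjustment, the two shares always sum to one, so whichever candidate has more than 50% wins the state in a given simulated election.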
Polling data was available for only about half of the states (30 for Clinton vs. Trump and 24 for Sanders vs. Trump). For every state where no polls had been conducted, popular vote data from the last four presidential elections were collected, as published by www.270towin.com. The percentage of the popular vote for each party’s candidate was then averaged and adjusted using the formula above. The last four elections were chosen because they represent an equal number of general election victories for each party (two for Republicans, two for Democrats). Additionally, averaging the results within each state mitigates the effect of any single, potentially idiosyncratic election.
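The choice between poll data and historical data described above might be sketched as follows. This is an illustrative Python sketch, not the authors' code: the function name and all sample figures are hypothetical, and it assumes that percentages are averaged first and only then converted to a two-candidate share.

```python
from statistics import mean

def state_estimate(polls, history):
    """Estimate the Democratic two-party share for one state.

    polls:   list of (dem_pct, rep_pct) pairs from current polls (may be empty)
    history: list of (dem_pct, rep_pct) pairs from past elections
    Polls are used when available; otherwise past results are used.
    """
    source = polls if polls else history
    avg_dem = mean(d for d, _ in source)
    avg_rep = mean(r for _, r in source)
    # Two-candidate adjustment: drop third-party and undecided voters.
    return avg_dem / (avg_dem + avg_rep)

# Hypothetical state with two polls: the polls take precedence.
polled = state_estimate([(45, 41), (47, 43)], [(50, 48), (52, 46)])

# Hypothetical state with no polls: fall back on four past elections.
unpolled = state_estimate([], [(50, 48), (52, 46), (48, 50), (47, 51)])
print(round(polled, 4), round(unpolled, 4))
```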
Using the polling data from realclearpolitics.com and the voting data from 270towin.com, the U.S. Presidential Election Calculator estimated that both Hillary Clinton and Bernie Sanders had a 100% chance of winning an election against the Republican nominee, Donald Trump. The number of trials used in each simulation, the probability of each candidate winning, and the mean number of electoral votes won by each candidate are listed in Table 1.
Table 1: Election Estimates with 10, 25, 100 and 1000 trials
| Candidate | 10 Trials: % Win | 10 Trials: Mean EV | 25 Trials: % Win | 25 Trials: Mean EV | 100 Trials: % Win | 100 Trials: Mean EV | 1000 Trials: % Win | 1000 Trials: Mean EV |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Clinton | 100% | 333.2 | 100% | 341.2 | 100% | 336.9 | 100% | 337.1 |
| Trump (vs. Clinton) | 0% | 204.8 | 0% | 196.8 | 0% | 201.1 | 0% | 200.9 |
| Sanders | 100% | 370.1 | 100% | 372.4 | 100% | 369.6 | 100% | 368.3 |
| Trump (vs. Sanders) | 0% | 167.9 | 0% | 165.6 | 0% | 168.4 | 0% | 169.7 |
These simulations reveal several critical pieces of information about the 2016 presidential election. First, as the polls stand now, both Hillary Clinton and Bernie Sanders have a 100% chance of winning the election against Donald Trump. We therefore cannot conclude that Senator Sanders has a higher probability of defeating Mr. Trump than does Secretary Clinton.
However, although the data show that Clinton and Sanders currently have an equal probability of defeating Trump, Sanders is likely to win more decisively. In the 1000-trial case above, Sanders beats Trump by an average of 198.6 electoral votes, while Clinton’s margin of victory averages “only” 136.2. In addition, Sanders amasses an average of 31.2 more electoral votes than Clinton. Thus an argument can still be made that Sanders is the stronger general election candidate.
Given that both Clinton and Sanders have a 100% chance of winning the general election, it makes sense to drill down into the data and ask how these probabilities would change if the polls were to shift in Trump’s favor.
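The margins quoted above follow directly from the 1000-trial means in Table 1. The short Python check below reproduces the arithmetic; the variable names are illustrative only.

```python
# Mean electoral votes from the 1000-trial column of Table 1.
sanders, trump_vs_sanders = 368.3, 169.7
clinton, trump_vs_clinton = 337.1, 200.9

# Average margin of victory in each match-up.
sanders_margin = round(sanders - trump_vs_sanders, 1)
clinton_margin = round(clinton - trump_vs_clinton, 1)

# How many more electoral votes Sanders averages than Clinton.
ev_gap = round(sanders - clinton, 1)

print(sanders_margin, clinton_margin, ev_gap)
```

Each pair of means also sums to 538, the total number of electoral votes, which is a quick consistency check on the table.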
In the table below, we shift the polls in Trump’s favor by the noted number of percentage points. With a 1% shift, for example, the Democratic candidate loses one point and Trump gains one, so the margin of victory closes by two percentage points. The same shift is applied to every state in the simulation.
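To make the effect of a uniform shift concrete, here is a small Python sketch. It assumes the shift is applied to the adjusted two-candidate percentages, which is our reading of the procedure rather than a documented detail of the application. The function name and the 52/48 starting point are hypothetical.

```python
def shift_toward_trump(dem_pct, shift_points):
    """Move `shift_points` percentage points from the Democrat to Trump.

    dem_pct is the Democrat's two-candidate percentage (0 to 100).
    Returns the new (dem_pct, trump_pct) pair, clamped to the 0-100 range.
    """
    new_dem = min(100, max(0, dem_pct - shift_points))
    return new_dem, 100 - new_dem

# Hypothetical state: the Democrat leads 52 to 48, a 4-point margin.
before = (52, 48)
after = shift_toward_trump(52, 1)

margin_before = before[0] - before[1]
margin_after = after[0] - after[1]
print(after, margin_before, margin_after)
```

Because one candidate's loss is the other's gain, a shift of s points changes the margin by 2s points, which is why modest shifts have such large effects in close states.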
Table 2: Election Estimates, 1000 trials with polls shifted
| Candidate | +1 Trump: % Win | +1 Trump: Mean EV | +2 Trump: % Win | +2 Trump: Mean EV | +3 Trump: % Win | +3 Trump: Mean EV | +4 Trump: % Win | +4 Trump: Mean EV |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Clinton | 98.3% | 333.2 | 70.8% | 282.9 | 19.1% | 252.3 | 1.1% | 222.0 |
| Trump (vs. Clinton) | 1.6% | 204.8 | 27.4% | 255.1 | 80.0% | 285.7 | 98.8% | 316.0 |
| Tie (Clinton vs. Trump) | 0.1% | | 1.8% | | 0.9% | | 0.1% | |
| Sanders | 100% | 357.6 | 100% | 330.0 | 99.1% | 311.3 | 72.2% | 280.6 |
| Trump (vs. Sanders) | 0% | 180.4 | 0% | 208.0 | 0.5% | 226.7 | 26.2% | 257.4 |
| Tie (Sanders vs. Trump) | 0% | | 0% | | 0.4% | | 1.6% | |
Thus it appears that Sanders’ likelihood of defeating Trump is more robust than Clinton’s, as it could withstand a larger polling shift. Against Clinton, even a small shift in Trump’s direction gives him a non-zero probability of victory, while a shift of about 2.5% makes him the probable victor. In a Sanders vs. Trump election, the polls would need to shift between 4.5% and 5% in Trump’s favor before his probability of victory would reach 50%; a similar shift against Clinton would virtually guarantee Trump the election.
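One rough way to locate the shift at which Trump becomes the probable victor over Clinton is to interpolate between the two columns of Table 2 that bracket 50%. The Python sketch below assumes the win probability changes linearly between a 2-point and a 3-point shift. That is a simplification of ours, not a result of additional simulations, and the function name is hypothetical.

```python
def crossing_shift(x0, p0, x1, p1, target=50.0):
    """Linearly interpolate the shift at which a win probability hits `target`.

    (x0, p0) and (x1, p1) are (shift, win %) points that bracket the target.
    """
    if not min(p0, p1) <= target <= max(p0, p1):
        raise ValueError("target is not bracketed by the two points")
    return x0 + (target - p0) * (x1 - x0) / (p1 - p0)

# Trump's win probability against Clinton, from Table 2:
# 27.4% at a 2-point shift and 80.0% at a 3-point shift.
clinton_crossing = crossing_shift(2, 27.4, 3, 80.0)
print(round(clinton_crossing, 2))
```

The interpolated value falls a little below 2.5 points, in line with the figure quoted above. The same calculation cannot be applied to the Sanders match-up using Table 2 alone, because Trump's win probability is still below 50% at the largest shift shown.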
It is important that we don’t read too much into these results. It is certain that the polls will shift as the election progresses, but it’s impossible to predict how large the shift will be or in which direction. The assumption that the shift will be uniform across every state makes a certain amount of sense on average, but for that to actually occur seems a near impossibility. The side-by-side comparison of the two Democratic candidates itself contains an unstated assumption: that the changes in the polls will be the same regardless of which Democratic candidate makes it to the general election. In reality, the polls could shift much more or much less if Sanders is the nominee instead of Clinton.
Though these results are rather persuasive, it must be noted that not all states had polling data available, so previous election data were used instead. As the election approaches, more polls will be conducted, particularly in battleground states, adding new data to the pool. The same holds for states that have already been polled: the closer the election, the more likely the most recent poll data are to represent the electorate’s opinions of each candidate accurately.