This repository contains the code for running a dynamic and hierarchical Bayesian model that forecasts election outcomes in states, the nation, and the electoral college. The model is written in Stan and the supporting pipeline is written in R.
The model improves upon the Economist’s 2020 model (which, in turn, improved upon Pierre Kemp’s implementation of Drew Linzer’s model) by estimating the parameters used to generate state covariance matrices, rather than being passed the matrices as data. I intend to write a formal explanation of the model, likely after the election has concluded. In the interim, you can view a brief overview of the model definition in the README in the `stan/` folder.
A more general overview of the model methodology can be found here, and the full output can be explored here.
- Added a parameter, $\beta_i$, to model the effect of internal polls.
  - $\beta_i$ is separate from $\beta_c$, which models the effect of party-sponsored polls.
- Use `pollster_rating_name` rather than `pollster` to uniquely identify pollsters.
  - This coalesces pollsters who appear under multiple names into one (e.g., HarrisX and HarrisX/Harris Poll now both fall under Harris Insights & Analytics, rather than being considered separate pollsters).
  - This affects the following pollsters:
    - SurveyUSA: previously SurveyUSA or SurveyUSA/High Point University
    - Quantus Insights: previously Quantus Insights or Quantus Polls and News
    - YouGov: previously YouGov or YouGov Blue
    - Harris Insights & Analytics: previously HarrisX or HarrisX/Harris Poll
    - Change Research: previously Change Research or Embold Research
    - SoCal Research: previously SoCal Research or SoCal Strategies
    - Fabrizio, Lee & Associates: previously Fabrizio or Fabrizio Ward
    - The Tyson Group: previously The Tyson Group or P2 Insights
- Parameter outputs for the poll model are now saved to the `out/polls/` directory:
  - `beta_b`: state-level polling bias
  - `beta_c`: party-sponsor polling bias
  - `beta_g`: group (population) polling bias
  - `beta_m`: mode polling bias
  - `beta_p`: pollster bias
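
The pollster coalescing described in the list above can be illustrated with a small alias table. This is a minimal Python sketch for illustration only, not the repository's R code: the `ALIASES` dictionary and the `canonical_pollster()` helper are hypothetical names, and in the actual pipeline the `pollster_rating_name` column plays this role. "Some Other Pollster" is a placeholder for any pollster without an alias.

```python
# Hypothetical alias table built from the pollster names listed above.
# In the repository, the pollster_rating_name column serves this purpose;
# this dictionary and the helper below are illustrative assumptions.
ALIASES = {
    "SurveyUSA/High Point University": "SurveyUSA",
    "Quantus Polls and News": "Quantus Insights",
    "YouGov Blue": "YouGov",
    "HarrisX": "Harris Insights & Analytics",
    "HarrisX/Harris Poll": "Harris Insights & Analytics",
    "Embold Research": "Change Research",
    "SoCal Strategies": "SoCal Research",
    "Fabrizio": "Fabrizio, Lee & Associates",
    "Fabrizio Ward": "Fabrizio, Lee & Associates",
    "P2 Insights": "The Tyson Group",
}


def canonical_pollster(name: str) -> str:
    """Return the coalesced name, or the input unchanged if it has no alias."""
    return ALIASES.get(name, name)


# Two HarrisX variants collapse into a single pollster identity.
polls = ["HarrisX", "HarrisX/Harris Poll", "YouGov Blue", "Some Other Pollster"]
unique_pollsters = sorted({canonical_pollster(p) for p in polls})
print(unique_pollsters)
# ['Harris Insights & Analytics', 'Some Other Pollster', 'YouGov']
```

Four poll records yield three distinct pollsters, so a per-pollster bias term is estimated once for Harris Insights & Analytics rather than twice.
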
- Included Shiva Ayyadurai and Lars Mapstead as allowed candidates. Polls including them will now be included in the model.
- Included Joseph Kishore as an allowed candidate. Polls including him will now be included in the model.
- Included Claudia De la Cruz as an allowed candidate. Polls including her will now be included in the model.
- Updated state and national headline text to refer to raw probability, rather than ratings.
- Removed the rating text bar above the state map.
- Set state bubble color and summary bar fill based on a continuous gradient scaled by candidate probability of winning.
- Added linked interactivity between map and bar.
- Updated CSS to set `cursor:pointer` when hovering over the map/bar, to more intuitively suggest that users can click and be taken to the state pages.
- Added ActiVote to the banned pollsters list.
- Added a filter to drop poll questions with an `NA` sample size.
  - This was previously an implicit drop.
  - If a poll included multiple populations, but had incomplete sample size data for the best population, the entire poll could have been dropped.
  - This ensures that the poll is included provided at least one of the populations has sample size data.
- Added a model review/diagnostic display under `out/REVIEW.md`
- Added conditional probability plot functions for the National page.
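
The sample-size filter described in the list above can be sketched as follows. This is a minimal Python illustration, not the repository's R code: the `best_question()` helper, the `POPULATION_RANK` ordering, and the population codes are all assumptions, since this README does not state how the best population is chosen. The point is the order of operations: questions with a missing sample size are dropped first, and the best population is then chosen from what remains.

```python
from typing import Optional

# Hypothetical preference order for populations (lower is better).
# The repository's actual ranking is not stated in this README.
POPULATION_RANK = {"lv": 0, "rv": 1, "a": 2}


def best_question(questions: list[dict]) -> Optional[dict]:
    """Pick the best-ranked question among those with a known sample size."""
    # Explicit filter: drop questions whose sample size is missing.
    usable = [q for q in questions if q["sample_size"] is not None]
    if not usable:
        return None  # no population has sample size data, so the poll is dropped
    return min(usable, key=lambda q: POPULATION_RANK[q["population"]])


# The preferred population ("lv") lacks a sample size, but the poll
# survives because the "rv" question has one.
poll = [
    {"population": "lv", "sample_size": None},
    {"population": "rv", "sample_size": 1200},
]
print(best_question(poll))
# {'population': 'rv', 'sample_size': 1200}
```

Selecting the best population before filtering would have picked the `lv` question and then discarded the whole poll, which is the behavior the explicit filter avoids.
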
- Modified the prior/polling Stan models and supporting R pipelines to support Kamala Harris as the Democratic candidate. More details can be found in the `stan/` directory README.
- Updated site functions’ internal variable naming and public-facing candidate names to refer to Kamala Harris.
- Updated time-series plots on site to only display projections from 8/1 onwards.
- Updated function documentation and READMEs based on new output.
- Fully re-ran the entire model from 5/1 onwards. An archived version of the final run with Biden as the Democratic candidate can be found in the `archive-biden` branch in this repository.
- Corrected the need to run the polling model with the `prior_check` flag for run dates before 5/7. (#10)
- `render_interactive_map()` now correctly renders the ggobj passed as an argument, rather than looking for a specific object in the global environment.
- Fixed a small bug causing the prior Stan model to underestimate uncertainty in the combination of the estimated national voteshare and state partisan lean.
- Added conditional probability as a poll model output.
  - The output is not yet available in the UI, but will be in a later release.
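
As a rough illustration of the conditional probability output mentioned above, the Python sketch below estimates the probability of winning overall given a win in a particular state, using simulated outcomes. This README does not define the conditioning event or the output format, so the interpretation, the `conditional_win_probability()` helper, and the structure of `draws` are all assumptions rather than the repository's implementation.

```python
def conditional_win_probability(draws, state):
    """Estimate P(candidate wins overall | candidate wins `state`).

    `draws` is a hypothetical list of simulated outcomes, each a dict with
    an "overall" boolean and a "states" dict of per-state booleans.
    """
    # Keep only the simulations in which the conditioning event occurred.
    given = [d for d in draws if d["states"][state]]
    if not given:
        return None  # the candidate never won the state in these draws
    # Share of those simulations in which the candidate also won overall.
    return sum(d["overall"] for d in given) / len(given)


# Made-up draws for illustration; "PA" is an arbitrary example state.
draws = [
    {"overall": True, "states": {"PA": True}},
    {"overall": True, "states": {"PA": True}},
    {"overall": False, "states": {"PA": True}},
    {"overall": False, "states": {"PA": False}},
]
print(conditional_win_probability(draws, "PA"))
```

In this toy input the candidate wins the state in three draws and wins overall in two of those, so the estimate is two thirds.
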
- Modified header text to include the article “the” on D.C.’s state page.
- Initial release
- FiveThirtyEight
- The Economist
- JHKForecasts
- DDHQ
- Silver Bulletin (requires subscription to view output)