This repository contains a data science case study that explores the relationship between political alignment (conservative, moderate, or liberal) and other attitudes and beliefs. It is meant primarily for teaching and learning about data science, but we might do some real research along the way.
If you are working through the Elements of Data Science curriculum, you should be ready to start this case study when you have completed Notebook 6, which covers basic Pandas.
This material is a work in progress, so your feedback is welcome. The best way to provide that feedback is to click here and create an issue in this GitHub repository.
For each of the notebooks below, you have two options. If you view the notebook on NBViewer, you can read it, but you can't run the code. If you run the notebook on Colab, you'll be able to run the code, do the exercises, and save your modified version of the notebook in Google Drive (if you have an account).
Cleaning and validation: The first notebook loads data from the General Social Survey (GSS) and walks through the process of cleaning and validating the data. At the end, you can help me by choosing a variable at random, checking its values against the codebook, and reporting your results.
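As a rough idea of what checking a variable against the codebook can look like, here is a minimal sketch in Pandas. The responses and the codebook counts are made up for illustration, and the names `responses` and `codebook_counts` are hypothetical; the notebook itself works with the real GSS data and may do this differently.

```python
import pandas as pd

# Made-up responses standing in for one survey column; not real GSS data.
responses = pd.Series([1, 2, 2, 4, 4, 4, 7, 7], name="polviews")

# Hypothetical counts, as they might be listed in a codebook.
codebook_counts = pd.Series({1: 1, 2: 2, 4: 3, 7: 2})

# Count how often each value appears in the data.
observed_counts = responses.value_counts().sort_index()

# Compare value by value; a value missing on either side counts as 0.
matches = observed_counts.eq(codebook_counts, fill_value=0)
all_match = bool(matches.all())
print(all_match)
```

If `all_match` is `False`, the entries of `matches` that are `False` show which values disagree with the codebook.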
Exploratory data analysis: This notebook uses the tools of exploratory data analysis to look at survey responses about political alignment. It uses PMFs to display distributions, time series to represent changes over time, and cross tabulation to look at changes in distribution over time. It also introduces local regression as a way to plot a smooth line through noisy data.
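To give a flavor of two of these tools, here is a small sketch that computes a PMF and a cross tabulation with Pandas. The data are made up, and the column names `year` and `alignment` are assumptions chosen for illustration, not necessarily the names used in the notebook.

```python
import pandas as pd

# Made-up data; `year` and `alignment` are hypothetical column names.
df = pd.DataFrame({
    "year": [1974, 1974, 1974, 1974, 2018, 2018, 2018, 2018],
    "alignment": ["Liberal", "Moderate", "Moderate", "Conservative",
                  "Liberal", "Liberal", "Moderate", "Conservative"],
})

# PMF: the fraction of respondents who gave each response.
pmf = df["alignment"].value_counts(normalize=True).sort_index()

# Cross tabulation: one row per year, one column per response,
# normalized so each row is the distribution of responses in that year.
xtab = pd.crosstab(df["year"], df["alignment"], normalize="index")

print(pmf)
print(xtab)
```

Reading down a column of `xtab` gives a time series for one response, which is one way to see how a distribution shifts over time.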
Political alignment and outlook: This notebook explores the relationship between political alignment and three survey questions related to "outlook". It uses a pivot table to compute the mean of the response variable, grouped by political alignment and time.
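The pivot table step might look something like the following sketch. The data are made up, and the column names `year`, `alignment`, and `response` are hypothetical; here `response` is coded as 1 or 0, so its mean is the fraction of respondents who answered 1.

```python
import pandas as pd

# Made-up data; all column names here are assumptions for illustration.
df = pd.DataFrame({
    "year": [1974, 1974, 1974, 1974, 2018, 2018, 2018, 2018],
    "alignment": ["Liberal", "Liberal", "Conservative", "Conservative",
                  "Liberal", "Liberal", "Conservative", "Conservative"],
    "response": [1, 0, 1, 1, 0, 0, 1, 0],
})

# Mean response for each combination of year (rows) and alignment (columns).
table = df.pivot_table(
    values="response", index="year", columns="alignment", aggfunc="mean"
)

print(table)
```

Each column of `table` is then a time series for one political alignment, which makes it straightforward to plot the groups side by side.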
Political alignment and other beliefs: This notebook explores the relationship between political alignment and other attitudes and beliefs. It is a template for a do-it-yourself, choose-your-own-adventure mini-project, where you have the chance to explore a variable in the GSS dataset and report the results.