Part II of the second data science project completed for Statistical Methods I. Data from the National Health and Nutrition Examination Survey 2017-2022 (NHANES 2017-2022) were used to build and compare two linear regression models predicting HDL cholesterol in females aged 30 to 55.
Skills Demonstrated:
- Clean the data set, investigating missing data and making appropriate decisions about how to handle it
- Create well-labeled, attractive visualizations of the outcome and investigate potential transformations of that outcome
- Define a research question about how effectively the key predictor predicts the quantitative outcome, with and without adjustment for the other predictors
- Develop two competing linear regression models: one with the key predictor alone and another with additional predictors
- Use a statistical model to make predictions and assess the quality of those predictions
- Assess model performance, including predictive accuracy and adherence to regression assumptions
- Describe study limitations and next steps
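
The two-model comparison listed above can be illustrated with a minimal sketch. It is not the project's actual code: the data are synthetic (not NHANES), the names `key`, `extra`, and `y` are hypothetical stand-ins for a key predictor, one additional predictor, and the outcome, and the 300/100 train/holdout split is an assumption made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in data (NOT NHANES); all variable names are hypothetical.
key = rng.normal(28.0, 5.0, n)      # hypothetical key predictor
extra = rng.normal(45.0, 7.0, n)    # hypothetical additional predictor
y = 80.0 - 0.9 * key + 0.2 * extra + rng.normal(0.0, 8.0, n)  # outcome


def design(*cols):
    """Stack predictor columns behind an intercept column of ones."""
    return np.column_stack([np.ones(len(cols[0])), *cols])


def fit_ols(X, y):
    """Return ordinary least squares coefficients for y ~ X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def rmse(y_true, y_pred):
    """Root mean squared prediction error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r_squared(y_true, y_pred):
    """Proportion of outcome variance explained by the predictions."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Assumed split: 300 rows to fit the models, 100 held out to judge predictions.
idx = rng.permutation(n)
train, test = idx[:300], idx[300:]

X_small = design(key)          # model 1: key predictor only
X_big = design(key, extra)     # model 2: key predictor plus an additional one

coef_small = fit_ols(X_small[train], y[train])
coef_big = fit_ols(X_big[train], y[train])

pred_small = X_small[test] @ coef_small
pred_big = X_big[test] @ coef_big

results = {
    "key only": {
        "rmse": rmse(y[test], pred_small),
        "r2": r_squared(y[test], pred_small),
    },
    "key + extra": {
        "rmse": rmse(y[test], pred_big),
        "r2": r_squared(y[test], pred_big),
    },
}

for name, metrics in results.items():
    print(f"{name}: holdout RMSE = {metrics['rmse']:.2f}, R^2 = {metrics['r2']:.3f}")
```

Judging both models on rows they were not fitted to guards against a known artifact: in-sample R² can never decrease when a predictor is added, so it always favors the larger model. Checking assumptions (for example, residual plots) would be a separate step that this sketch does not cover.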