Project name: peopleanalytics-python
This package is a port of an R package associated with the free online book Handbook of Regression Modeling in People Analytics by Keith McNulty. Some background on the inspiration follows.
At peopleanalytics-regression-book.org, McNulty makes the data referenced in Handbook of Regression Modeling in People Analytics available via an R package. McNulty explains:
For R and Python users, each of the data sets used in this book can be downloaded individually by following the code in each chapter. Alternatively for R users who intend to work through all of the chapters, all data sets can be loaded into an R session in advance by installing and loading the peopleanalyticsdata R package.
This package brings the functionality of McNulty's R package to Python. As an initial idea, running a pip install ...
makes these data sets accessible to Python users. The data sets can also serve as an alternative to the very common public data sets (iris, wine quality, etc.) available in the UCI repository for learning the concepts of exploratory data analysis and predictive modeling.
# import peopleanalyticsdata package
import peopleanalyticsdata as pad
import pandas as pd
# see a list of data sets
pad.list_sets()
# pad.help(managers)
# load data into a dataframe
df = pad.managers()
# find out more about a specific data set ('managers' example):
# print column names, dtypes, and non-null counts for the dataframe loaded above
df.info()
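For readers who plan to work through several chapters, it may be convenient to load a few data sets in one step. The sketch below is only an illustration: `load_sets` is a hypothetical helper, not part of peopleanalyticsdata, and it assumes only what the usage above shows, namely that each data set is exposed as a callable attribute of the package (as in `pad.managers()`). To keep the example runnable without the package installed, a small stand-in object with made-up values takes the place of `pad`.

```python
import types


def load_sets(pkg, names):
    """Call pkg.<name>() for each requested name and collect the results.

    Hypothetical helper: it relies only on the convention that every
    data set is available as a callable attribute of the package.
    """
    loaded = {}
    for name in names:
        loader = getattr(pkg, name, None)
        if not callable(loader):
            raise ValueError(f"no data set loader named {name!r}")
        loaded[name] = loader()
    return loaded


# Stand-in for the real package, with made-up values, so the sketch runs
# on its own. In real use, pass the imported module instead:
#   import peopleanalyticsdata as pad
#   frames = load_sets(pad, ["managers"])
fake_pad = types.SimpleNamespace(
    managers=lambda: {"id": [1, 2], "score": [3.5, 4.0]},
)

frames = load_sets(fake_pad, ["managers"])
print(sorted(frames))
```

With the real package, each value in `frames` would be whatever the loader returns (a dataframe in the usage shown above), keyed by the data set name.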
The data dictionary for all the data sets can be found here.
- MIT