Predicting Employee Turnover: Scoping and Benchmarking the State-of-the-Art
Simon De Vos, Chris Rickermann, Jente Van Belle, Wouter Verbeke [2024]

This paper addresses the need for predictive analytics in workforce management by scoping and benchmarking the state-of-the-art research on employee turnover prediction. Through an extensive benchmarking experiment involving 14 classification methods and 9 datasets, we highlight the challenges posed by inconsistent methodologies and experimental setups in existing studies. Our findings provide a unified perspective to advance both academic research and practical applications in human resource management. The code and public datasets are made available on GitHub to encourage further research and collaboration.

Repository Structure

This repository is organized as follows:

|- data/
    |- ds.csv              # Dataset for experiments
    |- ibm.csv             # IBM HR dataset
    |- kaggle1.csv         # Kaggle dataset 1
    |- kaggle3.csv         # Kaggle dataset 3
    |- kaggle4.csv         # Kaggle dataset 4
    |- kaggle5.csv         # Kaggle dataset 5
|- experiments/
    |- experiment.py       # Script for conducting experiments
    |- main.py             # Main entry point for running experiments
|- performance_metrics/
    |- performance_metrics.py  # Module for evaluating model performance

Installing

We have provided a requirements.txt file:

pip install -r requirements.txt

Please use the above in a newly created virtual environment to avoid clashing dependencies.

Instructions:

In 'main.py':
- Set the project directory to your own folder, e.g., DIR = r'C:\Users\...\...\...'
- Specify the experiment configuration in settings = {'folds': 2, 'repeats': 5, ...}
- Select the datasets to use in datasets = {'real1': False, 'ibm': True, ...}. The public datasets can be found in the data folder. The datasets Real1, Real2, and Real3 are not publicly available.
- Select the classification methods in methodologies = {'ab': True,'ann': True,'bnb': True, ... }
- Adapt the hyperparameter grids in hyperparameters = {'ab': {'n_estimators': [50, 100, 200], ...}, 'ann': {...} ...}. We recommend commenting out some of the hyperparameter values, as running the full grid as currently specified takes a long time.

Run 'main.py' to reproduce our results. Results will be written to a text file in DIR = r'C:\Users\...\...\...'
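To get a feel for why the full grid takes a long time, the sketch below estimates a lower bound on the number of model fits implied by a configuration. It is an illustration only and not part of the repository: the dictionaries contain just the keys shown in the instructions above (the elided entries are left out), the helper names enabled, grid_size, and count_fits are hypothetical, and it assumes that 'folds' and 'repeats' describe a repeated k-fold scheme with one fit per grid point per split.

```python
# Illustrative configuration using only the keys shown in the instructions;
# the real dictionaries in main.py contain more entries.
settings = {'folds': 2, 'repeats': 5}
datasets = {'real1': False, 'ibm': True}
methodologies = {'ab': True, 'ann': True, 'bnb': True}
hyperparameters = {'ab': {'n_estimators': [50, 100, 200]}}


def enabled(flags):
    """Return the names whose flag is set to True (hypothetical helper)."""
    return [name for name, on in flags.items() if on]


def grid_size(grid):
    """Number of hyperparameter combinations in a grid (1 if the grid is empty)."""
    size = 1
    for values in grid.values():
        size *= len(values)
    return size


def count_fits(settings, datasets, methodologies, hyperparameters):
    """Lower bound on model fits: one fit per grid point per train/test split.

    Assumes repeated k-fold evaluation; any inner tuning loop would add more.
    """
    splits = settings['folds'] * settings['repeats']
    per_dataset = sum(
        grid_size(hyperparameters.get(method, {})) * splits
        for method in enabled(methodologies)
    )
    return per_dataset * len(enabled(datasets))


if __name__ == '__main__':
    print('Datasets:', enabled(datasets))
    print('Methods:', enabled(methodologies))
    print('Estimated fits:',
          count_fits(settings, datasets, methodologies, hyperparameters))
```

Because the count grows multiplicatively with every additional hyperparameter value, trimming a few values per method before a first run reduces the runtime considerably.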

Citing

Please cite our paper and/or code as follows:

@article{de2024predicting,
  title={Predicting Employee Turnover: Scoping and Benchmarking the State-of-the-Art},
  author={De Vos, Simon and Bockel-Rickermann, Christopher and Van Belle, Jente and Verbeke, Wouter},
  journal={Business \& Information Systems Engineering},
  pages={1--20},
  year={2024},
  publisher={Springer}
}
