
Tilburg University Data Science And Society MSc Thesis 2022 spring


balazsgonczy/tiu_dss_msc_thesis


tiu_dss_msc_thesis_2022_spring

Thesis Title: TABULAR EMPLOYEE RETENTION PREDICTION

Author: Balázs Gönczy LinkedIn: linkedin.com/in/balázs-g-5b1445161

Supervisor: dr. Grzegorz Chrupala LinkedIn: linkedin.com/in/gchrupala

Abstract:

Tabular data remains an unconquered area for deep learning: research such as Shwartz-Ziv and Armon's shows that advanced decision tree models dominate prediction tasks (Shwartz-Ziv & Armon, 2022). Deep learning models such as TabNet try to improve on this standard. Other studies report widely varying results, with some finding XGBoost and others finding deep learning algorithms to be the best-performing models. In tabular employee retention prediction, Human Resources (HR) professionals also usually have only a limited amount of data, and the application of deep learning models here leaves room for performance improvement. The current study therefore applies Random Forest, LightGBM, XGBoost and TabNet to three publicly available, imbalanced employee retention datasets of limited size and compares their weighted F1-scores (RQ1) (Davin, Wijaya, 2020; Möbius, 2021; Pavan, Subhash, 2017). Alongside these models, the study applies Explainable Artificial Intelligence (XAI) methods: Permutation Feature Importance (PFI, RQ2) and Partial Dependence Plots (PDP, RQ3). These XAI tools further enhance the global and local interpretability of the models. The results suggest that no model clearly dominates.
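For readers unfamiliar with the RQ1 metric, the following is a minimal sketch of how a weighted F1-score is computed on imbalanced labels. It is not taken from the thesis notebooks: the function name `weighted_f1` and the toy labels are hypothetical, and the thesis code may well rely on a library routine instead (for example scikit-learn's `f1_score` with `average="weighted"`).

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its support."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n_true - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n_true / total) * f1
    return score


# Hypothetical toy data: 8 employees stay (0), 2 leave (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
print(round(weighted_f1(y_true, y_pred), 3))
```

In this toy case the majority class reaches an F1 of 0.875 and the minority class only 0.5, yet the weighted score is about 0.8 because the majority class carries 80% of the weight. This is worth keeping in mind when comparing models on imbalanced retention data.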

Table of Contents:

  1. Google Colab files:
  • EDA
  • Main analysis workflow
  2. Excel table:
  • RQ2 tables

How to Use the Project:

The text of the thesis will be published openly on the Tilburg University Library website later at this link: "Coming soon..."
Feel free to run the Google Colab files. If you use the work later, please reference it as follows:

@online{,
author = {Balázs Gönczy},
title = {{TABULAR EMPLOYEE RETENTION PREDICTION}},
year = {2022},
url = {https://github.com/balazsgonczy/tiu_dss_msc_thesis},
}

Credits:

Where applicable, links to the code sources used in the Colab files are given above each code cell.

Final remarks:

If you are interested in future cooperation, I am open to economics-related data science projects!
