The task is to build a machine learning regression model that predicts the number of hours an employee will be absent. Employee absenteeism is a major problem for employers: it leads to backlogs, piled-up work, and delays in project deployment, and it can have a major effect on company finances. The aim of this project is to identify the factors that lead to employee absence and to propose measures to reduce absenteeism.
Dataset Characteristics: Time-series, Multivariate
Number of Attributes: 21
===============================
- Individual identification (ID)
- Reason for absence (ICD)
- Month of absence
- Day of the week (2. MONDAY 3. TUESDAY 4. WEDNESDAY 5. THURSDAY 6. FRIDAY)
- Seasons (1. SUMMER 2. AUTUMN 3. WINTER 4. SPRING)
- Transportation expense
- Distance from Residence to Work (kilometres)
- Service time
- Age
- Workload Average/day
- Hit target
- Disciplinary failure (yes=1; no=0)
- Education (HIGH SCHOOL (1), GRADUATE (2), POSTGRADUATE (3), MASTER AND DOCTOR (4))
- Son (number of children)
- Social drinker (yes=1; no=0)
- Social smoker (yes=1; no=0)
- Pet (number of pets)
- Weight
- Height
- Body mass index
- Absenteeism time in hours
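
Several attributes above are stored as integer codes (day of the week, season, education, and the yes/no flags). The codes for day and season are labels rather than quantities, so a regression model may do better if they are one-hot encoded instead of being fed in as numbers. The sketch below shows one possible way to decode and encode them using only the mappings listed above. The column names (`day_of_week`, `season`, and so on), the helper functions, and the example record are hypothetical and are not taken from the dataset or the project report.

```python
# Code-to-label mappings copied from the attribute list above.
DAY_OF_WEEK = {2: "Monday", 3: "Tuesday", 4: "Wednesday", 5: "Thursday", 6: "Friday"}
SEASONS = {1: "Summer", 2: "Autumn", 3: "Winter", 4: "Spring"}
EDUCATION = {1: "High school", 2: "Graduate", 3: "Postgraduate", 4: "Master and doctor"}
YES_NO = {0: "no", 1: "yes"}

# Hypothetical column names; the headers in the real file may differ.
CODED_COLUMNS = {
    "day_of_week": DAY_OF_WEEK,
    "season": SEASONS,
    "education": EDUCATION,
    "disciplinary_failure": YES_NO,
    "social_drinker": YES_NO,
    "social_smoker": YES_NO,
}


def decode_row(row):
    """Return a copy of `row` with coded fields replaced by their labels."""
    decoded = dict(row)
    for column, mapping in CODED_COLUMNS.items():
        if column in decoded:
            code = decoded[column]
            # Unknown codes are kept as-is rather than guessed.
            decoded[column] = mapping.get(code, code)
    return decoded


def one_hot(code, mapping):
    """One-hot encode `code` against the keys of `mapping`, in sorted order."""
    return [1 if code == key else 0 for key in sorted(mapping)]


# Made-up record for illustration only; it is not a row from the dataset.
example = {"day_of_week": 3, "season": 1, "education": 1,
           "social_drinker": 1, "social_smoker": 0}

print(decode_row(example))
print(one_hot(example["season"], SEASONS))
```

Education is ordered (high school through master and doctor), so keeping it as a single integer column is also a reasonable choice; which encoding works better here is something to check against the model's validation error.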
Please refer to the project report for details.