NikitaRubocki/thesis

Ensemble Feature Importance Ranking (EFIR) Honors College Thesis

Data science is a rapidly growing field that touches nearly every aspect of society. Nearly everything collects information these days, and this data can reveal meaningful patterns, with benefits ranging from more intuitive marketing to better cancer detection. However, increased data collection brings increased complexity, which data science works to manage through various techniques and machine learning/artificial intelligence models. Two major problems data scientists face are datasets with too many features and long model training times. Ensemble Feature Importance Ranking (EFIR) is a tool created to help address these issues. This repository holds the code used to test EFIR on various datasets and to analyze the results via facet grids.

About

Feature Analysis thesis code and scripts
