Ensemble Feature Importance Ranking (EFIR) Honors College Thesis

Data science is a rapidly growing field that now touches nearly every aspect of society. Data is collected almost everywhere, and it can be mined for meaningful patterns, with benefits ranging from more intuitive marketing to better cancer detection. However, more data collection brings more complexity, which data science works to manage through various techniques and machine learning and artificial intelligence models. Two major problems data scientists face are datasets with too many features and long model training times. Ensemble Feature Importance Ranking (EFIR) is a tool created to help combat both issues. This repository holds the code used to test EFIR on various datasets and to analyze the results using facet grids.
