Ensemble Feature Importance Ranking (EFIR) Honors College Thesis

Data science is a rapidly growing field that now touches nearly every aspect of society. Data is collected almost everywhere, and it can be used to find meaningful patterns, with benefits ranging from more intuitive marketing to better cancer detection. However, more data collection brings more complexity, which data science manages through a variety of techniques and machine learning/artificial intelligence models. Two major problems data scientists face are datasets with too many features and long model training times. To help address these issues, a tool called Ensemble Feature Importance Ranking (EFIR) was created. This repository holds the code used to test EFIR on various datasets and to analyze the results using facet grids.
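
To give a feel for the general idea of ranking features with an ensemble, here is a minimal sketch. This README does not describe how EFIR combines its models, so the sketch assumes one common approach (mean-rank aggregation) and may differ from the thesis's actual method. The function names, model names, feature names, and scores are all hypothetical and exist only for illustration.

```python
from statistics import mean


def rank_features(scores):
    """Map each feature to its rank (1 = most important) for one model."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {feature: position for position, feature in enumerate(ordered, start=1)}


def ensemble_ranking(per_model_scores):
    """Order features by their mean rank across all models (assumed scheme)."""
    per_model_ranks = [rank_features(s) for s in per_model_scores.values()]
    features = per_model_ranks[0].keys()
    mean_rank = {f: mean(r[f] for r in per_model_ranks) for f in features}
    # Lower mean rank means more important; the feature name breaks ties.
    return sorted(mean_rank, key=lambda f: (mean_rank[f], f))


# Hypothetical importance scores from three hypothetical models.
example_scores = {
    "model_a": {"age": 0.50, "income": 0.30, "zip_code": 0.20},
    "model_b": {"age": 0.40, "income": 0.45, "zip_code": 0.15},
    "model_c": {"age": 0.60, "income": 0.25, "zip_code": 0.15},
}

print(ensemble_ranking(example_scores))
```

Averaging ranks rather than raw scores keeps models with differently scaled importance values comparable. Features at the bottom of the combined ranking are candidates for removal, which reduces both the feature count and the training time.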