This project investigates the use of node pruning within a stratified k-fold cross-validation framework to improve the recall of CNNs in medical imaging tasks.
- Node Pruning Implementation: Reduces model complexity and enhances recall.
- ResNet50V2 Base Model: Utilizes a state-of-the-art architecture for binary classification of skin lesions.
- Comprehensive Evaluation: Assesses performance metrics across multiple pruning strategies.
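
To make the evaluation setup concrete, here is a minimal, standard-library sketch of stratified k-fold splitting and the recall metric. It is not the notebook's code: the function names `stratified_folds` and `recall`, the toy labels, and the choice of five folds are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign each sample index to one of k folds, keeping class ratios similar."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def recall(y_true, y_pred, positive=1):
    """Recall = true positives / (true positives + false negatives)."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    total = true_pos + false_neg
    return true_pos / total if total else 0.0


# Hypothetical toy labels: 20 negatives and 10 positives.
labels = [0] * 20 + [1] * 10
folds = stratified_folds(labels, k=5)

for fold_number, test_indices in enumerate(folds):
    positives = sum(labels[i] for i in test_indices)
    print(f"fold {fold_number}: {len(test_indices)} samples, {positives} positive")
```

Each fold serves once as the held-out set while the remaining folds are used for training, and a per-fold metric such as recall is then averaged across folds. In practice a library splitter such as scikit-learn's `StratifiedKFold` would likely be used instead of a hand-written one.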
- `notebook/`: Contains the Colab notebook for experiments.
- `data/`: Includes relevant datasets used for training and evaluation.
- `figures/`: Visualization assets like bar plots and scatter plots.
- `README.md`: Project overview and usage instructions.
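Every pruning strategy improved mean recall over the baseline while keeping mean AUC within about 0.01 of it, at the cost of lower precision and, for most strategies, lower accuracy. Key metrics are summarized below: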
Pruning Strategy | Accuracy Mean | AUC Mean | Precision Mean | Recall Mean |
---|---|---|---|---|
Baseline | 0.7323 | 0.8131 | 0.7798 | 0.6540 |
Pruned_All | 0.6237 | 0.8028 | 0.5950 | 0.9167 |
Pruned_Top1 | 0.7359 | 0.8119 | 0.7591 | 0.6978 |
Pruned_Top5 | 0.7191 | 0.8113 | 0.6958 | 0.7987 |
Pruned_Top10 | 0.6825 | 0.8107 | 0.6397 | 0.8821 |
Pruned_Top15 | 0.6377 | 0.8061 | 0.6036 | 0.9101 |
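
The trade-offs in the table are easier to read as differences from the baseline. The snippet below is a small illustrative sketch that uses only the mean values listed above; the `results` dictionary and the `deltas` helper are names introduced here and are not part of the repository.

```python
# Mean metrics copied from the table above: (accuracy, AUC, precision, recall).
results = {
    "Baseline": (0.7323, 0.8131, 0.7798, 0.6540),
    "Pruned_All": (0.6237, 0.8028, 0.5950, 0.9167),
    "Pruned_Top1": (0.7359, 0.8119, 0.7591, 0.6978),
    "Pruned_Top5": (0.7191, 0.8113, 0.6958, 0.7987),
    "Pruned_Top10": (0.6825, 0.8107, 0.6397, 0.8821),
    "Pruned_Top15": (0.6377, 0.8061, 0.6036, 0.9101),
}
metric_names = ("accuracy", "auc", "precision", "recall")


def deltas(strategy):
    """Return each metric's difference from the baseline, rounded to 4 places."""
    baseline = results["Baseline"]
    return tuple(
        round(value - base, 4) for value, base in zip(results[strategy], baseline)
    )


for name in results:
    if name == "Baseline":
        continue
    change = dict(zip(metric_names, deltas(name)))
    print(f"{name:13s} recall {change['recall']:+.4f}  precision {change['precision']:+.4f}")

# Strategy with the highest mean recall in the table.
best_recall = max(results, key=lambda name: results[name][3])
print("highest mean recall:", best_recall)
```

Because these are differences of fold means, they say nothing about fold-to-fold variance; the notebook's per-fold results would be needed for that.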
To reproduce the results, you’ll need to download the dataset from the Kaggle ISIC 2024 Challenge. Follow these steps:
- Install the Kaggle API:

      pip install kaggle

- Obtain your Kaggle API key:
  - Go to Kaggle.
  - Navigate to your account settings.
  - Select "Create New API Token" to download the `kaggle.json` file.

- Place the `kaggle.json` file in the appropriate directory:

      mkdir -p ~/.kaggle
      mv /path/to/kaggle.json ~/.kaggle/
      chmod 600 ~/.kaggle/kaggle.json

- Download the dataset using the Kaggle API:

      kaggle competitions download -c isic-2024-challenge

- Extract the dataset:

      unzip <downloaded-dataset>.zip -d data/
- Clone this repository:

      git clone https://github.com/DevDizzle/node-pruning-research.git

- Navigate to the project directory:

      cd node-pruning-research

- Install the required dependencies:

      pip install -r requirements.txt

- Open the notebook:
  - Navigate to `notebook/node-pruning-cnn-analysis.ipynb` and open it in Jupyter Notebook or Google Colab.

- Reproduce the experiments:
  - Follow the instructions within the notebook to run the experiments.
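
If the Kaggle download step fails, a common cause is a missing or overly permissive `kaggle.json`. The following is a minimal sketch of a pre-flight check for the credentials step above, assuming a POSIX system; `check_kaggle_credentials` is a hypothetical helper written for this example and is not part of the Kaggle API or this repository.

```python
import stat
from pathlib import Path


def check_kaggle_credentials(path=None):
    """Return a list of problems found with the Kaggle credentials file."""
    path = Path(path) if path else Path.home() / ".kaggle" / "kaggle.json"
    if not path.is_file():
        return [f"{path} does not exist"]

    problems = []
    mode = stat.S_IMODE(path.stat().st_mode)
    # chmod 600 leaves owner read/write only; any group/other bit is too open.
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append(f"{path} has mode {oct(mode)}; expected 0o600")
    return problems


if __name__ == "__main__":
    for message in check_kaggle_credentials() or ["credentials file looks fine"]:
        print(message)
```

An empty list means the file exists at the default location and is readable only by its owner, which matches the `chmod 600` step.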
Visualizations of the results (bar plots and scatter plots) are included in the repository under `figures/`.
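- Improved Recall: Node pruning, especially the `Pruned_All` and `Pruned_Top10` strategies, substantially improved mean recall over the baseline (0.9167 and 0.8821 versus 0.6540).
- Precision-Recall Trade-off: The trade-off between precision and recall varied with the pruning strategy. Mean precision ranged from 0.7591 for `Pruned_Top1` down to 0.5950 for `Pruned_All` (baseline 0.7798), so milder pruning gave smaller recall gains at a smaller precision cost.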
If you use this repository, please cite:
@misc{nodepruningresearch,
author = {Evan R. Parra},
title = {Enhancing Recall in Skin Lesion Classification with Node Pruning},
year = {2024},
howpublished = {\url{https://github.com/DevDizzle/node-pruning-research}}
}