Project exploring the impact of node pruning on CNN performance metrics within a stratified k-fold cross-validation framework, focusing on optimizing recall for medical imaging tasks while analyzing trade-offs in precision and accuracy.

Enhancing Recall in Skin Lesion Classification with Node Pruning

This project investigates the use of node pruning within a stratified k-fold cross-validation framework to improve the recall of CNNs in medical imaging tasks.
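The notebook presumably relies on a library implementation of stratified k-fold splitting (e.g., scikit-learn's `StratifiedKFold`); the toy function below (`stratified_k_fold` is a hypothetical name, not from this repository) is a minimal sketch of what "stratified" means: every fold preserves the class ratio of the full dataset, which matters for imbalanced medical-imaging labels.

```python
from collections import defaultdict

def stratified_k_fold(labels, k):
    """Split sample indices into k folds while preserving the class
    ratio of `labels` in every fold (a toy stand-in for
    sklearn.model_selection.StratifiedKFold)."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's samples round-robin across the folds.
        for i, idx in enumerate(indices):
            folds[i % k].append(idx)
    return folds

# Toy binary labels: 8 benign (0), 4 malignant (1).
labels = [0] * 8 + [1] * 4
folds = stratified_k_fold(labels, k=4)
for fold in folds:
    counts = [sum(labels[i] == c for i in fold) for c in (0, 1)]
    print(counts)  # every fold keeps the 2:1 class ratio -> [2, 1]
```

Without stratification, a random split of a rare-positive dataset can leave a fold with almost no positives, making per-fold recall estimates meaningless.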

Key Features

  • Node Pruning Implementation: Reduces model complexity and enhances recall.
  • ResNet50V2 Base Model: Uses a state-of-the-art architecture for binary classification of skin lesions.
  • Comprehensive Evaluation: Assesses accuracy, AUC, precision, and recall across multiple pruning strategies.
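The pruning code itself lives in the notebook; as an illustrative sketch only, the function below shows one common notion of node pruning: zeroing the outgoing weights of selected hidden units in a dense layer. The selection criterion here (smallest weight magnitude) is an assumption for illustration; the repository's `Pruned_Top-N` strategies may rank nodes by a different statistic, and `prune_nodes` is a hypothetical name.

```python
def prune_nodes(weights, k):
    """Zero out the k columns (hidden units) of `weights` with the
    smallest L2 norm -- one simple form of node pruning.
    `weights` is a rows-by-units matrix given as a list of lists."""
    n_units = len(weights[0])
    norms = [
        sum(row[j] ** 2 for row in weights) ** 0.5
        for j in range(n_units)
    ]
    # Indices of the k weakest units, by column norm.
    weakest = sorted(range(n_units), key=norms.__getitem__)[:k]
    pruned = [row[:] for row in weights]
    for row in pruned:
        for j in weakest:
            row[j] = 0.0
    return pruned

W = [[0.9, 0.01, -0.7],
     [0.8, 0.02,  0.6]]
print(prune_nodes(W, k=1))  # the middle (near-zero) unit is zeroed out
```

In a real Keras/ResNet50V2 setting the same idea is typically applied via weight masks or an API such as TensorFlow Model Optimization, rather than by editing matrices directly.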

Repository Structure

  • notebook/: Contains the Colab notebook for experiments.
  • data/: Includes relevant datasets used for training and evaluation.
  • figures/: Visualization assets like bar plots and scatter plots.
  • README.md: Project overview and usage instructions.

Results Summary

Pruning strategies improved recall while leaving AUC essentially unchanged (all strategies within about 0.01 of the baseline). Key metrics are summarized below:

| Pruning Strategy | Mean Accuracy | Mean AUC | Mean Precision | Mean Recall |
| ---------------- | ------------- | -------- | -------------- | ----------- |
| Baseline         | 0.7323        | 0.8131   | 0.7798         | 0.6540      |
| Pruned_All       | 0.6237        | 0.8028   | 0.5950         | 0.9167      |
| Pruned_Top1      | 0.7359        | 0.8119   | 0.7591         | 0.6978      |
| Pruned_Top5      | 0.7191        | 0.8113   | 0.6958         | 0.7987      |
| Pruned_Top10     | 0.6825        | 0.8107   | 0.6397         | 0.8821      |
| Pruned_Top15     | 0.6377        | 0.8061   | 0.6036         | 0.9101      |
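The pattern in the table follows directly from how the metrics are defined on confusion-matrix counts. The snippet below (illustrative counts only, not taken from the experiments) shows how converting false negatives into true positives at the cost of new false positives raises recall while lowering precision and accuracy:

```python
def metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),  # of predicted positives, how many are real
        "recall": tp / (tp + fn),     # of real positives, how many are caught
    }

# A conservative classifier vs. a more aggressive one: recall rises,
# precision and accuracy fall -- the pattern across the pruning strategies.
conservative = metrics(tp=65, fp=18, fn=35, tn=82)
aggressive = metrics(tp=90, fp=60, fn=10, tn=40)
print(conservative)
print(aggressive)
```

For a screening task, where a missed malignant lesion is far costlier than a false alarm, the recall gains in the table can justify the precision loss.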

Data Acquisition

To reproduce the results, you’ll need to download the dataset from the Kaggle ISIC 2024 Challenge. Follow these steps:

  1. Install the Kaggle API:

    pip install kaggle
  2. Obtain your Kaggle API key:

    • Go to Kaggle.
    • Navigate to your account settings.
    • Select "Create New API Token" to download the kaggle.json file.
  3. Place the kaggle.json file in the appropriate directory:

    mkdir -p ~/.kaggle
    mv /path/to/kaggle.json ~/.kaggle/
    chmod 600 ~/.kaggle/kaggle.json
  4. Download the dataset using the Kaggle API:

    kaggle competitions download -c isic-2024-challenge
  5. Extract the dataset:

    unzip <downloaded-dataset>.zip -d data/

How to Run

  1. Clone this repository:

    git clone https://github.com/DevDizzle/node-pruning-research.git
  2. Navigate to the project directory:

    cd node-pruning-research
  3. Install the required dependencies:

    pip install -r requirements.txt
  4. Open the notebook:

    • Open the Colab notebook in the notebook/ directory with Google Colab or Jupyter.
  5. Reproduce the experiments:

    • Follow the instructions within the notebook to run the experiments.

Visualizations

The following visualizations are included in the repository:

  • Bar Plot: Average recall across pruning strategies
  • Box Plot: Distribution of recall across strategies
  • Scatter Plot: Precision vs. Recall

Insights

  • Improved Recall: Node pruning, especially the Pruned_All and Pruned_Top10 strategies, substantially improved recall over the baseline (0.9167 and 0.8821 vs. 0.6540).
  • Precision-Recall Trade-off: Recall gains came at the cost of precision; the most aggressive strategy (Pruned_All) gave up the most precision (0.5950 vs. 0.7798), while Pruned_Top1 raised recall modestly with minimal precision loss.

Citation

If you use this repository, please cite:

@misc{nodepruningresearch,
  author       = {Evan R. Parra},
  title        = {Enhancing Recall in Skin Lesion Classification with Node Pruning},
  year         = {2024},
  howpublished = {\url{https://github.com/DevDizzle/node-pruning-research}}
}
