MIL_BASELINE

With the rapid advancement of computational power and artificial intelligence, computational pathology has come to be regarded as one of the most promising and transformative auxiliary diagnostic paradigms in pathology. However, because whole-slide pathology images often comprise billions of pixels, traditional natural-image analysis methods face significant computational and technical limitations. Multiple Instance Learning (MIL) is one of the most successful and widely adopted paradigms for computational pathology analysis. Current MIL methods, however, often adopt different frameworks and structures, which hinders follow-up research and reproducibility. We developed the MIL_BASELINE library to provide a simple, fundamental template for Multiple Instance Learning applications.
News of MIL-Baseline

2025-1-10 fix bugs in MIL_BASELINE, update visualization tools, add new MIL methods and new dataset-split methods

2024-11-24 update MIL fine-tuning (gate_ab_mil, ab_mil) for rrt_mil

2024-10-12 fix bug in the CTransPath feature encoder

2024-10-02 add FR_MIL implementation

2024-08-20 fix bug in early stopping

2024-07-27 fix bug in PLIP transforms

2024-07-21 fix bugs in DTFD-MIL and test_mil.py

2024-07-20 fix bugs in all MIL models except DTFD-MIL

📝 Overall Introduction

🔖 Library Introduction

  • A library that integrates different MIL methods into a unified framework
  • A library that integrates different datasets into a unified interface
  • A library that provides commonly used dataset-split methods
  • A library that is easily extended by following a uniform definition

💡 Dataset Uniform Interface

  • Users only need to provide the following CSV, whether the dataset is public or private:
    /datasets/example_Dataset.csv
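As a rough illustration, such a CSV can be built with pandas. Note that the column names below are assumptions for illustration only; the authoritative schema is the one in /datasets/example_Dataset.csv.

```python
import pandas as pd

# Illustrative only: column names are hypothetical, not the library's
# actual schema -- consult /datasets/example_Dataset.csv for that.
df = pd.DataFrame({
    "slide_path": ["/data/slides/slide_001.svs", "/data/slides/slide_002.svs"],
    "label": [0, 1],
})
df.to_csv("my_Dataset.csv", index=False)
```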

🌂 Supported Dataset-Split-Method

  • User-defined Train-Val-Test split
  • User-defined Train-Val split
  • User-defined Train-Test split
  • Train-Val split with K-fold
  • Train-Val-Test split with K-fold
  • Train-Val with K-fold, then test
  • The differences between the splits are described in /split_scripts/README.md
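For intuition, a K-fold Train-Val split of a slide-level table can be sketched with scikit-learn's stratified splitter (this is not the library's own split script, just an illustration of the idea; the DataFrame columns are hypothetical):

```python
import pandas as pd
from sklearn.model_selection import StratifiedKFold

# Hypothetical slide table: one row per slide with its class label.
df = pd.DataFrame({
    "slide_id": [f"slide_{i}" for i in range(10)],
    "label": [0, 1] * 5,
})

# 5-fold stratified split: each fold's validation set preserves
# the overall class ratio.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=2024)
for fold, (train_idx, val_idx) in enumerate(skf.split(df, df["label"])):
    train_df, val_df = df.iloc[train_idx], df.iloc[val_idx]
    # Each fold would typically be written out as its own split CSV.
```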

📐 Feature Encoder

💎 Implemented Networks

☑️ Implemented Metrics

  • AUC: macro, micro, weighted (identical in the 2-class case)
  • F1, PRE, RECALL: macro, micro, weighted
  • ACC, BACC: BACC is macro recall
  • KAPPA: linear, quadratic
  • Confusion matrix
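All of these metrics are available in scikit-learn, which is a convenient way to sanity-check reported numbers. A minimal sketch on toy predictions (the arrays are made-up data, not library output):

```python
import numpy as np
from sklearn.metrics import (roc_auc_score, f1_score, recall_score,
                             accuracy_score, balanced_accuracy_score,
                             cohen_kappa_score, confusion_matrix)

# Toy 3-class example: true labels and per-class probabilities.
y_true = np.array([0, 1, 2, 2, 1, 0])
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.2, 0.2, 0.6],
                   [0.1, 0.3, 0.6],
                   [0.3, 0.5, 0.2],
                   [0.6, 0.3, 0.1]])
y_pred = y_prob.argmax(axis=1)

auc_macro = roc_auc_score(y_true, y_prob, multi_class="ovr", average="macro")
f1_macro  = f1_score(y_true, y_pred, average="macro")
acc       = accuracy_score(y_true, y_pred)
bacc      = balanced_accuracy_score(y_true, y_pred)  # equals macro recall
kappa_lin = cohen_kappa_score(y_true, y_pred, weights="linear")
cm        = confusion_matrix(y_true, y_pred)
```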

📙 Let's Begin Now

🔨 Code Framework

MIL_BASELINE is constructed by the following parts:

  • /configs: MIL_BASELINE defines MIL models through YAML configuration files.
  • /modules: Defines the network architectures of the different MIL models.
  • /process: Defines the training frameworks for the different MIL models.
  • /feature_extractor: Supports different feature extractors.
  • /split_scripts: Supports different dataset-split methods.
  • /vis_scripts: Visualization scripts for t-SNE and attention maps.
  • /datasets: User dataset path information.
  • /utils: The framework's utility scripts.
  • train_mil.py: Training entry point of the framework.
  • test_mil.py: Testing entry point of the framework.

📁 Dataset Pre-Process

Feature Extractor

Supported slide formats include OpenSlide-readable formats and SDPC. The following backbones are supported: R50, VIT-S, CTRANSPATH, PLIP, CONCH, UNI, GIGAPATH, VIRCHOW, VIRCHOW-V2, and CONCH-V1.5. Detailed usage instructions can be found in /feature_extractor/README.md.

Dataset-Csv Construction

You should construct a CSV file following the format of /datasets/example_Dataset.csv.

Dataset-Split Construction

You can use the dataset-split scripts to perform the different dataset splits; detailed descriptions of the split methods are in /split_scripts/README.md.

🔥 Train/Test MIL

Yaml Config

You can configure the YAML files in /configs, for example /configs/AB_MIL.yaml; a detailed explanation of each option is provided in that file.

Train & Test

Then, train a model with /train_mil.py:

python train_mil.py --yaml_path /configs/AB_MIL.yaml 

We also support dynamic parameter passing: you can override any parameter that exists in /configs/AB_MIL.yaml from the command line, for example:

python train_mil.py --yaml_path /configs/AB_MIL.yaml --options General.seed=2024 General.num_epochs=20 Model.in_dim=768
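The dot-notation override pattern shown above (e.g. General.seed=2024) can be sketched as follows. This is a minimal illustration of the idea, assuming the config is a plain nested dict loaded from YAML; it is not the library's actual parser.

```python
# Hypothetical sketch: apply a "Section.key=value" override to a nested dict.
def apply_override(cfg: dict, option: str) -> None:
    key, _, raw = option.partition("=")
    *parents, leaf = key.split(".")
    node = cfg
    for p in parents:
        node = node.setdefault(p, {})
    try:
        node[leaf] = int(raw)  # coerce simple integers; keep other values as strings
    except ValueError:
        node[leaf] = raw

cfg = {"General": {"seed": 0, "num_epochs": 10}, "Model": {}}
apply_override(cfg, "General.seed=2024")
apply_override(cfg, "Model.in_dim=768")
```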

Use /test_mil.py to evaluate a pretrained MIL model:

python test_mil.py --yaml_path /configs/AB_MIL.yaml --test_dataset_csv /your/test_csv/path --model_weight_path /your/model_weights/path --test_log_dir /your/test/log/dir

You should ensure that the --test_dataset_csv contains a test_slide_path column whose entries point to the extracted feature files (/path/to/your_pt.pt). If the CSV also contains a test_slide_label column, metrics will be calculated and written to the logs.
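Such a test CSV can be assembled with pandas; a rough sketch, assuming your extracted .pt feature files live in a single directory (the directory path here is a placeholder):

```python
import pandas as pd
from pathlib import Path

# Assumed location of the extracted .pt feature files (placeholder path).
pt_files = sorted(Path("/your/features").glob("*.pt"))

df = pd.DataFrame({"test_slide_path": [str(p) for p in pt_files]})
# df["test_slide_label"] = [...]  # optional: add labels to get metrics in the logs
df.to_csv("test_dataset.csv", index=False)
```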

Visualization

You can easily visualize a dimensionality-reduction map of the features from a trained MIL model, as well as the distribution of attention (or importance) scores, with /vis_scripts/draw_feature_map.py and /vis_scripts/draw_attention_map.py. We have implemented standardized global-feature and attention-score output interfaces for most models, making these visualization scripts compatible with most MIL models in the library. Detailed usage instructions are in /vis_scripts/README.md.
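The feature-map idea boils down to running t-SNE on the exported feature matrix. A self-contained sketch with random stand-in features (not output from the library's scripts):

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for features exported by a trained MIL model: 200 instances, 512-dim.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 512))
labels = rng.integers(0, 2, size=200)  # hypothetical binary labels

# Reduce to 2-D; the result can then be scattered, colored by label,
# to inspect class separation in the learned feature space.
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(feats)
```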

🍻 Acknowledgement

Thanks to the following repositories for inspiring this one.

Git Pull

Personal experience is limited; code contributions are welcome.
