Code for "Improving Medical Multi-modal Contrastive Learning with Expert Annotations" accepted at ECCV 2024.
Read the full paper here: https://arxiv.org/abs/2403.10153
We use the `eclip` package for training and evaluation. To install it along with all dependencies, run `pip install -e .` from the root folder.
We use pytorch-lightning to train the model and hydra to manage the config files.
CLIP/eCLIP training can be run by passing the appropriate hydra config flags. Toggle the `use_expert` flag to switch between eCLIP and CLIP pretraining.
```shell
python eclip/train.py data=data_default hydra=hydra_default \
    use_expert=true \
    model="model_default" \
    batch_size=64 \
    scheduler_name="cosine" \
    max_length=256 \
    precision="32" \
    learning_rate=1e-4 \
    weight_decay=1e-3 \
    max_steps=200 \
    val_check_interval=20 \
    limit_val_batches=10 \
    wandb_project_name="eclip-debug" \
    num_gpus=8
```
The evaluation modules are in `eclip/eval`. For example, to report zero-shot performance on CheXpert, run the following after updating the appropriate paths:
```shell
python eclip/eval/eval_classification_chexpert.py
```
It should print the zero-shot (ZS) accuracy and F1 scores for the models.
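The script's internals aren't shown here, but CLIP-style zero-shot classification reduces to a cosine-similarity argmax between the image embedding and the embeddings of one text prompt per class. A minimal sketch of that step, with random vectors standing in for encoder outputs (function and variable names are hypothetical, not the repo's API):

```python
import numpy as np

def zero_shot_classify(image_emb: np.ndarray, text_embs: np.ndarray) -> int:
    """Return the index of the class prompt most similar to the image.

    image_emb: (d,) embedding from the trained image encoder.
    text_embs: (n_classes, d) embeddings of class prompts, e.g.
    "Chest X-ray showing cardiomegaly", from the text encoder.
    """
    # Normalize so the dot product equals cosine similarity, as in CLIP.
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = text_embs @ image_emb
    return int(np.argmax(sims))

# Toy example: the image embedding is a noisy copy of class 2's prompt.
rng = np.random.default_rng(0)
prompts = rng.normal(size=(5, 128))
image = prompts[2] + 0.01 * rng.normal(size=128)
print(zero_shot_classify(image, prompts))  # prints 2
```

The real script additionally batches images and computes accuracy and F1 over the CheXpert labels.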
We use Mistral 7B Instruct to generate radiology reports from the input image. This uses Retrieval Augmented Generation (RAG): a nearest-neighbor search picks a few relevant reports, which are then injected into the LLM prompt.
- Hardware -- AMD MI250X GPUs provided by the LUMI supercomputer
- Training -- PyTorch, Transformers and Lightning
- Tracking -- Weights & Biases
- Config Management -- Hydra
- Data loading -- WebDataset