DiffCut: Catalyzing Zero-Shot Semantic Segmentation with Diffusion Features and Recursive Normalized Cut
conda create -n diffcut python=3.10
conda activate diffcut
pip install -r requirements.txt
For evaluation, install detectron2:
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
Try DiffCut by running the notebook diffcut.ipynb.
Visualize the semantic coherence of vision encoders (SD, CLIP, DINO, ...) with the notebook semantic_coherence.ipynb.
In the paper, we evaluate DiffCut on six benchmarks: PASCAL VOC (20 classes + background), PASCAL Context (59 classes + background), COCO-Object (80 classes + background), COCO-Stuff (27 classes), Cityscapes (27 classes) and ADE20K (150 classes). To set up the datasets, see Preparing Datasets for DiffCut.
python eval_diffcut.py --dataset_name Cityscapes --tau 0.5 --alpha 10 --refinement
python eval_diffcut_openvoc.py --dataset_name VOC20 --tau 0.5 --alpha 10 --refinement
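To give an intuition for the `--tau` and `--alpha` flags used above, here is a minimal NumPy sketch of a recursive normalized cut on toy features. It is not the code in this repository. It assumes that `--alpha` acts as an exponent on a cosine-similarity affinity matrix and that `--tau` is the largest NCut value for which a split is still accepted; the function names (`affinity`, `bipartition`, `ncut_value`, `recursive_ncut`) and the toy features are hypothetical.

```python
import numpy as np

def affinity(features, alpha):
    # Assumption: cosine similarity, clipped to [0, 1], raised to the power alpha.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    return np.clip(f @ f.T, 0.0, 1.0) ** alpha

def bipartition(W):
    # Spectral relaxation of NCut: second-smallest eigenvector of the
    # symmetric normalized Laplacian, mapped back and thresholded at its mean.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    lap = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    y = d_inv_sqrt * vecs[:, 1]
    return y > y.mean()

def ncut_value(W, mask):
    # Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)
    cut = W[mask][:, ~mask].sum()
    return cut / W[mask].sum() + cut / W[~mask].sum()

def recursive_ncut(W, tau, indices=None, labels=None):
    # Split a segment in two, keep the split only if its NCut value is at
    # most tau (assumed stopping rule), then recurse on both halves.
    if indices is None:
        indices = np.arange(W.shape[0])
        labels = np.zeros(W.shape[0], dtype=int)
    if len(indices) < 2:
        return labels
    sub = W[np.ix_(indices, indices)]
    mask = bipartition(sub)
    if mask.all() or not mask.any() or ncut_value(sub, mask) > tau:
        return labels
    labels[indices[mask]] = labels.max() + 1
    recursive_ncut(W, tau, indices[~mask], labels)
    recursive_ncut(W, tau, indices[mask], labels)
    return labels

# Toy example: six 2-D "patch features" forming two tight groups.
feats = np.array([[1.0, 0.10], [1.0, 0.12], [1.0, 0.08],
                  [0.10, 1.0], [0.12, 1.0], [0.08, 1.0]])
labels = recursive_ncut(affinity(feats, alpha=10), tau=0.5)
print(labels)
```

In this sketch, a larger `alpha` sharpens the affinity matrix (weak similarities shrink toward zero), while a larger `tau` accepts more splits and therefore yields finer segments.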
@inproceedings{
couairon2024diffcut,
title={DiffCut: Catalyzing Zero-Shot Semantic Segmentation with Diffusion Features and Recursive Normalized Cut},
author={Paul Couairon and Mustafa Shukor and Jean-Emmanuel HAUGEARD and Matthieu Cord and Nicolas THOME},
booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
year={2024},
url={https://openreview.net/forum?id=N0xNf9Qqmc}
}
This repo relies on the following projects:
Diffuse, Attend, and Segment: Unsupervised Zero-Shot Segmentation using Stable Diffusion
Emergent Correspondence from Image Diffusion
Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning
Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP
Cut and Learn for Unsupervised Image & Video Object Detection and Instance Segmentation