Official implementation for “MambaDFuse: A Mamba-based Dual-phase Model for Multi-modality Image Fusion.”
- [2024-4] Code and config files are publicly available.
@article{li2024mambadfuse,
  title={MambaDFuse: A Mamba-based Dual-phase Model for Multi-modality Image Fusion},
  author={Li, Zhe and Pan, Haiwei and Zhang, Kejia and Wang, Yuhua and Yu, Fengming},
  journal={arXiv preprint arXiv:2404.08406},
  year={2024}
}
Multi-modality image fusion (MMIF) aims to integrate complementary information from different modalities into a single fused image that comprehensively represents the imaging scene and facilitates downstream visual tasks. In recent years, significant progress has been made in MMIF thanks to advances in deep neural networks. However, existing methods cannot effectively and efficiently extract modality-specific and modality-fused features, constrained by the inherent local reductive bias of CNNs or the quadratic computational complexity of Transformers. To overcome this issue, we propose a Mamba-based Dual-phase Fusion (MambaDFuse) model. First, a dual-level feature extractor is designed to capture long-range features from single-modality images by extracting low- and high-level features with CNN and Mamba blocks. Then, a dual-phase feature fusion module is proposed to obtain fused features that combine complementary information from the different modalities: it uses a channel-exchange method for shallow fusion and enhanced Multi-modal Mamba (M3) blocks for deep fusion. Finally, the fused-image reconstruction module applies the inverse transformation of the feature extraction to generate the fused result. Extensive experiments show that our approach achieves promising results in infrared-visible image fusion and medical image fusion. In a unified benchmark, MambaDFuse also demonstrates improved performance in downstream tasks such as object detection.
Our MambaDFuse is implemented in models/network.py.
conda create -n MambaDFuse python=3.8.18
conda activate MambaDFuse
pip install torch==1.13.0 torchvision==0.14.0 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu117
pip install causal_conv1d==1.0.0 # causal_conv1d-1.0.0+cu118torch1.13cxx11abiFALSE-cp38-cp38-linux_x86_64.whl
pip install mamba_ssm==1.0.1 # mamba_ssm-1.0.1+cu118torch1.13cxx11abiFALSE-cp38-cp38-linux_x86_64.whl
The .whl files for causal_conv1d and mamba_ssm can be found here (Baidu). After installing the Mamba library, replace the mamba_simple.py file in the installation directory with the ./mamba_simple.py provided in this repository, as sketched below. The implementation of the Multi-modal Mamba Block (M3 Block) is located in this file.
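A minimal sketch of that swap, assuming mamba_ssm 1.0.1 keeps mamba_simple.py under mamba_ssm/modules/ (verify the path in your own installation):

import os, shutil
import mamba_ssm

# Locate the installed package and back up the stock file before overwriting it.
pkg_dir = os.path.dirname(mamba_ssm.__file__)
target = os.path.join(pkg_dir, "modules", "mamba_simple.py")  # assumed location

shutil.copy(target, target + ".bak")      # keep the original for easy rollback
shutil.copy("./mamba_simple.py", target)  # the repository's M3-enabled version
print("Replaced:", target)

A quick forward pass afterwards confirms that the package still imports and the CUDA kernels load (Mamba is the public entry point of mamba_ssm 1.0.1); if it fails, restore the .bak file:

import torch
from mamba_ssm import Mamba

m = Mamba(d_model=64).cuda()
x = torch.randn(1, 128, 64, device="cuda")
print(m(x).shape)  # expected: torch.Size([1, 128, 64])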
Checkpoints are located in the folder ./Model/Infrared_Visible_Fusion/Infrared_Visible_Fusion/models
The datasets used are MSRS, RoadScene, M3FD, and the Harvard medical dataset. Download the Infrared-Visible Fusion (IVF) and Medical Image Fusion (MIF) datasets and place the paired images in the folder ./Datasets, organized as follows (see the note after the tree for the _Y folders):
MambaDFuse
├── Datasets
│ ├── trainsets
│ │ ├── VIR
│ │ │ ├── VI_Y
│ │ │ ├── IR
│ │ ├── CT-MRI
│ │ │ ├── CT
│ │ │ ├── MRI
│ │ ├── PET-MRI
│ │ │ ├── PET_Y
│ │ │ ├── MRI
│ │ ├── SPECT-MRI
│ │ │ ├── SPECT_Y
│ │ │ ├── MRI
│ ├── valsets
│ │ ├── VIR
│ │ │ ├── VI_Y
│ │ │ ├── IR
│ │ ├── CT-MRI
│ │ │ ├── CT
│ │ │ ├── MRI
│ │ ├── PET-MRI
│ │ │ ├── PET_Y
│ │ │ ├── MRI
│ │ ├── SPECT-MRI
│ │ │ ├── SPECT_Y
│ │ │ ├── MRI
│ ├── testsets
│ │ ├── VIR
│ │ │ ├── VI_Y
│ │ │ ├── IR
│ │ ├── CT-MRI
│ │ │ ├── CT
│ │ │ ├── MRI
│ │ ├── PET-MRI
│ │ │ ├── PET_Y
│ │ │ ├── MRI
│ │ ├── SPECT-MRI
│ │ │ ├── SPECT_Y
│ │ │ ├── MRI
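The _Y suffix on VI_Y, PET_Y, and SPECT_Y suggests that color modalities are fused on the Y (luminance) channel of YCbCr, a common choice in image-fusion pipelines. A hypothetical preprocessing sketch using OpenCV (the source folder and file names are illustrative, not part of the repository):

import cv2

# Extract the Y (luminance) channel from a color visible image.
bgr = cv2.imread("Datasets/trainsets/VIR/VI_RGB/00001.png")  # hypothetical source path
ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
y = cv2.split(ycrcb)[0]
cv2.imwrite("Datasets/trainsets/VIR/VI_Y/00001.png", y)      # grayscale Y channel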
You may first modify the configuration file in the folder ./options/MambaDFuse, setting fields such as gpu_ids, path.root, dataroot_A, dataroot_B, dataloader_batch_size, and so on (a sketch follows).
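For illustration, a hypothetical way to set those fields programmatically; the exact JSON nesting (e.g. whether dataroot_A sits under a datasets.train section) is an assumption, so check the shipped config first:

import json

path = "options/MambaDFuse/train_mambadfuse_vif.json"
with open(path) as f:
    opt = json.load(f)

opt["gpu_ids"] = [0, 1]  # two GPUs, matching --nproc_per_node=2 below
# Assumed nesting; adjust to the real structure of the config file:
# opt["datasets"]["train"]["dataroot_A"] = "Datasets/trainsets/VIR/IR"
# opt["datasets"]["train"]["dataroot_B"] = "Datasets/trainsets/VIR/VI_Y"

with open(path, "w") as f:
    json.dump(opt, f, indent=2)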
Infrared-visible image fusion (IVF):
python -m torch.distributed.launch --nproc_per_node=2 --master_port=1234 train_MambaDFuse.py --opt options/MambaDFuse/train_mambadfuse_vif.json --dist True
Medical image fusion (MIF):
python -m torch.distributed.launch --nproc_per_node=2 --master_port=1234 train_MambaDFuse.py --opt options/MambaDFuse/train_mambadfuse_med.json --dist True
Infrared-visible image fusion (IVF):
python test_MambaDFuse.py --model_path=./Model/Infrared_Visible_Fusion/Infrared_Visible_Fusion/models/ --iter_number=10000 --dataset=VIR --A_dir=IR --B_dir=VI_Y
Medical image fusion (SPECT-MRI):
python test_MambaDFuse.py --model_path=./Model/Medical_Fusion-SPECT-MRI/Medical_Fusion/models/ --iter_number=10000 --dataset=SPECT-MRI --A_dir=MRI --B_dir=SPECT_Y
The code is heavily based on SwinFusion, and some ideas were inspired by Pan-Mamba. Thanks for their awesome work.