An easy way to use Mask R-CNN with a ConvNeXt backbone.
This project lets the ConvNeXt architecture be used as the backbone network of the Mask R-CNN model available in the torchvision library. It also includes a customized trainer class. The project was tested on one of the Cell Tracking Challenge datasets, and the results for several backbone network configurations are shared below.
Install PyTorch>=1.8.0 and torchvision>=0.9.0 by following the official installation instructions.
Follow these steps to get a local copy up and running.
- Clone the repo
git clone https://github.com/mberkay0/ConvNeXt-MaskRCNN.git
- Check whether virtualenv is installed
virtualenv --version
- If it is not installed, install it
pip install virtualenv
- Now create a virtual environment inside the ConvNeXt-MaskRCNN/ directory
virtualenv venv
- Then install the required Python modules
pip install -r requirements.txt
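After installing, it can be useful to confirm that the environment meets the minimum versions stated above. The following is a minimal sketch using only the standard library; the helper names `parse_version`, `meets_minimum`, and `check_requirements` are hypothetical and not part of this repository.

```python
from importlib import metadata


def parse_version(text):
    """Turn '1.13.1+cu117' into a comparable tuple such as (1, 13, 1)."""
    core = text.split("+")[0]  # drop local build tags like '+cu117'
    parts = []
    for piece in core.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at non-numeric pieces such as 'dev2021'
        parts.append(int(digits))
    # Pad so that '1.8' and '1.8.0' compare as equal.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def check_requirements(requirements):
    """Map each package name to True, False, or None (not installed)."""
    report = {}
    for name, minimum in requirements.items():
        try:
            report[name] = meets_minimum(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    # Minimum versions taken from the installation note above.
    print(check_requirements({"torch": "1.8.0", "torchvision": "0.9.0"}))
```

Comparing integer tuples instead of raw strings matters here: as strings, "1.10.0" would sort before "1.8.0".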
The repository provides helper functions that simplify writing torchvision pipelines with pre-trained models. Here is how you would use them.
import torch
from torch.utils.data import DataLoader
from torchvision.models.detection import MaskRCNN

from .inference import Config
from .dataset import BuildDataset
from .utils import get_file_dir, convnext_fpn_backbone, Trainer

# train_img_path and train_mask_path must be defined by you beforehand.
train_dataset = BuildDataset(get_file_dir(train_img_path),
                             get_file_dir(train_mask_path))

# The collate_fn keeps images and targets as tuples instead of stacking them.
train_loader = DataLoader(train_dataset, batch_size=Config.train_bs,
                          num_workers=4, shuffle=True, pin_memory=True,
                          drop_last=True, collate_fn=lambda x: tuple(zip(*x)))

# ConvNeXt feature extractor wrapped with an FPN, used as the backbone.
backbone = convnext_fpn_backbone(
    Config.backbone,
    Config.trainable_layers
)

model = MaskRCNN(
    backbone,
    num_classes=Config.num_classes,
    max_size=Config.max_size,
    min_size=Config.min_size,
)
model.to(Config.device)

# Optimize only the parameters that are not frozen.
params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(
    params,
    lr=Config.lr,
    weight_decay=Config.weight_decay
)

lr_scheduler = torch.optim.lr_scheduler.StepLR(
    optimizer,
    step_size=Config.step_size,
    gamma=Config.gamma
)

# Gradient scaler for mixed-precision training.
scaler = torch.cuda.amp.GradScaler()
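The `collate_fn` passed to the `DataLoader` above deserves a short illustration. torchvision detection models take a list of images and a list of target dictionaries, so the batch has to be regrouped rather than stacked into a single tensor. The sketch below shows the same `tuple(zip(*x))` regrouping on stand-in data; the function name `detection_collate` and the sample values are hypothetical, and plain strings and lists stand in for tensors.

```python
def detection_collate(batch):
    """Regroup [(image, target), ...] into ((image, ...), (target, ...))."""
    return tuple(zip(*batch))


# Stand-in samples: strings replace image tensors, dicts replace targets.
# The differing "boxes" lengths mimic images with different object counts.
batch = [
    ("image_0", {"boxes": [[0, 0, 10, 10]]}),
    ("image_1", {"boxes": [[5, 5, 20, 20], [1, 1, 4, 4]]}),
]

images, targets = detection_collate(batch)
print(images)                    # ('image_0', 'image_1')
print(len(targets[1]["boxes"]))  # 2
```

A named module-level function like this can also be pickled, unlike a lambda, which may matter when `num_workers > 0` on platforms that start worker processes with spawn.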
The Trainer class makes it easy to train your Mask R-CNN model with the ConvNeXt backbone.
trainer = Trainer(
    optimizer=optimizer,
    max_epochs=Config.epochs,
    device=Config.device,
    scaler=scaler,
    verbose_num=Config.verbose_num,
    split_size=Config.split_size,
    val_bs=Config.val_bs
)

history = trainer.fit(
    model,
    train_dataloader=train_loader,
    ckpt_path=Config.save_path + Config.model_name + ".pth"
)
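Note that the `ckpt_path` above is built by plain string concatenation, so it only yields the intended file path if `Config.save_path` ends with a path separator. As a minimal sketch of a more forgiving alternative, the hypothetical helper below (not part of this repository) joins the parts with `pathlib`; the example directory and model names are made up.

```python
from pathlib import Path


def build_ckpt_path(save_path, model_name, suffix=".pth"):
    """Join a directory and model name into a checkpoint file path string."""
    return str(Path(save_path) / (model_name + suffix))


# Both spellings of the directory give the same checkpoint path.
with_slash = build_ckpt_path("weights/", "convnext_maskrcnn")
without_slash = build_ckpt_path("weights", "convnext_maskrcnn")
print(with_slash == without_slash)  # True
```

The helper returns a string, so it can be passed wherever the concatenated string was used before.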
You can find the sample code in model.py.
Some training results for the ConvNeXt backbone networks, alongside a ResNet50 backbone for comparison, are shown below.
Backbone | Resolution | Dice score (%) | Epochs |
---|---|---|---|
ResNet50 | 512x512 | 87.9 | 20 |
ConvNeXt-B | 512x512 | 92.0 | 20 |
ConvNeXt-T | 512x512 | 91.6 | 20 |
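For readers unfamiliar with the metric in the table, the Dice score measures the overlap between a predicted mask and a ground-truth mask as twice the intersection divided by the sum of both mask sizes. The sketch below assumes this standard definition on binary masks; the exact evaluation code used for the table is not shown here, and the function name `dice_score` is hypothetical.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested 0/1 lists."""
    flat_pred = [v for row in pred for v in row]
    flat_target = [v for row in target for v in row]
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)
    if total == 0:
        # Both masks are empty: treat as perfect agreement (a convention).
        return 1.0
    return 2.0 * intersection / total


pred = [[1, 1],
        [0, 0]]
target = [[1, 0],
          [0, 0]]

score = dice_score(pred, target)  # 2 * 1 / (2 + 1)
print(f"{100 * score:.1f}")       # 66.7
```

Multiplying by 100 gives the percentage form used in the table.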