Building Extraction using YOLO based Instance Segmentation

By Yi Jie WONG & Yin-Loon Khor et al

This code is part of our solution for the 2024 IEEE BigData Cup: Building Extraction Generalization Challenge (IEEE BEGC2024). Specifically, this repository provides the code to extract additional building footprint data from the Microsoft Building Footprint (BF) dataset for Redmond, Washington, and Las Vegas, Nevada. We train our YOLOv8-based instance segmentation model on the extracted data together with the training set provided by IEEE BEGC2024. YOLOv8 trained on BEGC2024 plus the additional data achieves a higher F1-score than training on the BEGC2024 training set alone (for example, 0.686 versus 0.649 public F1 when the Las Vegas data is added). Our approach ranked 1st globally in the IEEE Big Data Cup 2024 - BEGC2024 challenge! 🏅🎉🥳

Instructions

Conda environment

conda create --name yolo python=3.10.12 -y
conda activate yolo

Clone this repo

# clone this repo
git clone https://github.com/yjwong1999/RSBuildingExtraction.git
cd RSBuildingExtraction

Install dependencies

# Adjust the torch version and index URL to match your OS and CUDA version (this example targets CUDA 12.1)
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121

# Install Jupyter Notebook
pip install jupyter notebook==7.1.0

# Remaining dependencies (for instance segmentation)
pip install ultralytics==8.1
pip install pycocotools
pip install requests==2.32.3
pip install click==8.1.7
pip install opendatasets==0.1.22
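
After installing, it can help to confirm that the environment actually matches the pins above. The following is a minimal sketch, not part of this repository: the helper name `check_pins` is hypothetical, and `ultralytics` and `pycocotools` are left out because their exact installed version strings are not pinned to three components above.

```python
from importlib import metadata

# Pins copied from the install commands above.
PINNED = {
    "torch": "2.2.1",
    "torchvision": "0.17.1",
    "torchaudio": "2.2.1",
    "notebook": "7.1.0",
    "requests": "2.32.3",
    "click": "8.1.7",
    "opendatasets": "0.1.22",
}


def check_pins(pinned):
    """Return {package: (expected, installed_or_None, matches)}."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Local build tags such as "+cu121" are ignored in the comparison.
        base = installed.split("+")[0] if installed else None
        report[name] = (expected, installed, base == expected)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in check_pins(PINNED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:15s} expected {expected:8s} installed {installed} [{status}]")
```

A package that is not installed is reported with `installed None` and `MISMATCH`, so the script can be run before or after the install step.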

Data structure

Since we use YOLO as our segmentation model, the dataset must follow the YOLO format. The setup_data.py script automatically takes the raw data from Kaggle and converts it into YOLO format. The mydata directory stores the training data for our YOLO model. We also put the additional data (i.e., the Microsoft Building Footprint dataset and the diffusion augmentation data) into mydata.
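
To illustrate what "YOLO format" means for segmentation, here is a minimal sketch of turning one polygon into one label line. It is not the code in setup_data.py: the function name `polygon_to_yolo_line` is hypothetical, and the input is assumed to be a flat COCO-style pixel polygon `[x1, y1, x2, y2, ...]`. The output follows the Ultralytics segmentation label convention of a class index followed by polygon coordinates normalized by image width and height.

```python
def polygon_to_yolo_line(class_id, polygon, img_w, img_h):
    """Convert a flat pixel polygon [x1, y1, x2, y2, ...] to one YOLO-seg label line."""
    if len(polygon) < 6 or len(polygon) % 2 != 0:
        raise ValueError("polygon needs at least 3 (x, y) points")
    coords = []
    for i in range(0, len(polygon), 2):
        # Normalize to [0, 1] and clamp points that fall slightly outside the image.
        x = min(max(polygon[i] / img_w, 0.0), 1.0)
        y = min(max(polygon[i + 1] / img_h, 0.0), 1.0)
        coords.append(f"{x:.6f} {y:.6f}")
    return f"{class_id} " + " ".join(coords)


# Hypothetical building outline in a 500 x 500 image, class 0 = building (assumed).
line = polygon_to_yolo_line(0, [50, 100, 250, 100, 250, 400, 50, 400], 500, 500)
print(line)
# 0 0.100000 0.200000 0.500000 0.200000 0.500000 0.800000 0.100000 0.800000
```

Each image gets one text file with one such line per building instance.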

RSBuildingExtraction/mydata
├── train
│   ├── images
│   └── labels
└── valid
    ├── images
    └── labels
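
Before training, a quick check of this layout can catch missing folders or images without labels. The sketch below is an assumption-laden helper, not part of the repository: the name `find_unlabeled_images` and the list of image suffixes are hypothetical, and label files are assumed to share the image's file stem with a `.txt` extension.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}  # assumed image types


def find_unlabeled_images(root):
    """Return images under root/{train,valid}/images with no matching labels/*.txt."""
    root = Path(root)
    missing = []
    for split in ("train", "valid"):
        image_dir = root / split / "images"
        label_dir = root / split / "labels"
        if not image_dir.is_dir() or not label_dir.is_dir():
            raise FileNotFoundError(f"expected {image_dir} and {label_dir}")
        for image in sorted(image_dir.iterdir()):
            if image.suffix.lower() not in IMAGE_SUFFIXES:
                continue  # skip non-image files
            if not (label_dir / f"{image.stem}.txt").is_file():
                missing.append(image)
    return missing
```

Calling `find_unlabeled_images("mydata")` from the repository root returns an empty list when every image has a label file. An image without a label is not necessarily an error, since Ultralytics treats such images as background, but an unexpectedly long list usually points to a conversion problem.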

Results

Training with Different Instance Segmentation Model

Model	Pretrained Weights	Batch Size	Params (M)	FLOPs (G)	Public F1-Score (Conf = 0.50)	Public F1-Score (Conf = 0.20)
YOLOv8n-seg	DOTAv1 Aerial Detection	16	3.4	12.6	0.510	0.645
YOLOv8s-seg	DOTAv1 Aerial Detection	16	11.8	42.6	0.535	0.654
YOLOv8m-seg	DOTAv1 Aerial Detection	16	27.3	110.2	0.592	0.649
YOLOv8x-seg	DOTAv1 Aerial Detection	8	71.8	344.1	0.579	0.627
YOLOv9c-seg	COCO Segmentation	4	27.9	159.4	0.476	0.577
Mask R-CNN (MPViT-Tiny)	COCO Segmentation	4	17	196.0	-	0.596
EfficientNet-b0-YOLO-seg	ImageNet	4	6.4	12.5	-	0.560
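
The two F1 columns differ only in the confidence threshold used to filter predictions. The toy sketch below shows why a lower threshold can raise F1: it keeps more low-confidence true positives, which improves recall. The prediction list is made up, the function name `f1_at_confidence` is hypothetical, and treating each prediction's match status as fixed is a simplification; this is not the challenge's evaluation code.

```python
def f1_at_confidence(predictions, num_ground_truth, conf_threshold):
    """F1 after dropping predictions below conf_threshold.

    predictions: list of (confidence, matched), where matched is True when the
    prediction corresponds to a ground-truth building (a true positive).
    """
    kept = [matched for conf, matched in predictions if conf >= conf_threshold]
    tp = sum(kept)
    fp = len(kept) - tp
    fn = num_ground_truth - tp
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up predictions for an image with 6 ground-truth buildings.
toy = [(0.9, True), (0.8, True), (0.6, True), (0.4, True), (0.3, False), (0.25, True)]
f1_high = f1_at_confidence(toy, 6, 0.50)  # 3 TP, 0 FP, 3 FN
f1_low = f1_at_confidence(toy, 6, 0.20)   # 5 TP, 1 FP, 1 FN
print(f"Conf = 0.50: {f1_high:.3f}, Conf = 0.20: {f1_low:.3f}")
```

In this toy case the lower threshold admits one false positive but recovers two more true positives, so F1 goes up. Real results depend on how well calibrated the model's confidences are.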

Training with Different Dataset

Solution	FLOPs (G)	Public F1-Score	Private F1-Score
YOLOv8m-seg + BEGC 2024	110.2	0.64926	0.66531
YOLOv8m-seg + BEGC 2024 + Redmond Dataset	110.2	0.65951	0.67133
YOLOv8m-seg + BEGC 2024 + Las Vegas Dataset	110.2	0.68627	0.70326
YOLOv8m-seg + BEGC 2024 + Diffusion Augmentation	110.2	0.67189	0.68096
2nd Place (RTMDet-x + Alabama Buildings Segmentation Dataset)	141.7	0.6813	0.68453
3rd Place (Custom Mask R-CNN + No Extra Dataset)	124.1	0.59314	0.60649

We extract our "Redmond dataset" and "Las Vegas dataset" from the Microsoft Building Footprint dataset (please refer to our paper for details). For the implementation of our diffusion augmentation pipeline, please refer to our segmentation-guided diffusion model.
Note that the 2nd-place solution uses a bigger model (141.7 vs. 110.2 GFLOPs) with an additional dataset to reach a private F1-score of 0.68453, whereas our diffusion augmentation pipeline allows our smaller model to reach a close private F1-score of 0.68096 without an additional dataset.

Inference with Different NMS IoU Threshold

Dataset (Private F1-Score by NMS IoU threshold)	0.70	0.75	0.80	0.85	0.90	0.95
BEGC2024 + Redmond Dataset	0.672	0.677	-	-	0.748	0.866
BEGC2024 + Las Vegas Dataset	0.703	0.693	0.686	0.721	0.766	0.897
BEGC2024 + Diffusion Augmentation	0.681	-	0.694	0.711	0.751	0.887
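
To make the role of the NMS IoU threshold concrete, here is a minimal pure-Python sketch of greedy non-maximum suppression on bounding boxes. It is an illustration under simplifying assumptions, not the implementation used by Ultralytics (where the threshold is exposed as the `iou` argument at prediction time); the boxes and scores are made up.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(box_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes (IoU about 0.82) and one separate box.
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, 0.70))  # [0, 2]: the second box is suppressed
print(nms(boxes, scores, 0.90))  # [0, 1, 2]: both overlapping boxes survive
```

A higher threshold suppresses fewer overlapping detections. In dense areas where neighbouring buildings have overlapping boxes this can preserve more true detections, which is consistent with the trend in the table, although the table alone does not establish the cause.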

Acknowledgement

We thank the following works for the inspiration of our repo!

2024 IEEE BigData Cup: Building Extraction Generalization Challenge link
Ultralytics YOLO code
MPViT-based Mask R-CNN code
COCO2YOLO format original code, modified code

Cite this repository

Our paper has been accepted by IEEE BigData 2024! Please cite our paper if this repo helps your research. The preprint is available here

@InProceedings{Wong2024,
  title     = {Cross-City Building Instance Segmentation: From More Data to Diffusion-Augmentation},
  author    = {Yi Jie Wong and Yin-Loon Khor and Mau-Luen Tham and Ban-Hoe Kwan and Anissa Mokraoui and Yoong Choon Chang},
  booktitle = {2024 IEEE International Conference on Big Data (Big Data)},
  year      = {2024}
}
