Building Extraction using YOLO based Instance Segmentation

This code is part of our solution for the 2024 IEEE BigData Cup: Building Extraction Generalization Challenge (IEEE BEGC2024). Specifically, this repository provides the code to extract additional building footprint data from the Microsoft Building Footprint (BF) dataset for Redmond, Washington, and Las Vegas, Nevada. We train our YOLOv8-based instance segmentation model on the extracted data together with the training set provided by IEEE BEGC2024. Results show that adding the extracted data improves the F1-score over training on the BEGC2024 training set alone: with the Las Vegas data, YOLOv8m-seg goes from 0.665 to 0.703 on the private leaderboard. Our approach ranked 1st globally in the IEEE Big Data Cup 2024 - BEGC2024 challenge! 🏅🎉🥳

Instructions

Conda environment

conda create --name yolo python=3.10.12 -y
conda activate yolo

Clone this repo

# clone this repo
git clone https://github.com/yjwong1999/RSBuildingExtraction.git
cd RSBuildingExtraction

Install dependencies

# Adjust the torch version and the CUDA wheel index (cu121 here) to match your OS and CUDA setup
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121

# Install Jupyter Notebook
pip install jupyter notebook==7.1.0

# Remaining dependencies (for instance segmentation)
pip install ultralytics==8.1
pip install pycocotools
pip install requests==2.32.3
pip install click==8.1.7
pip install opendatasets==0.1.22
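
After installing, it can help to confirm which versions actually landed in the `yolo` environment. The following is a minimal sketch using only the Python standard library. The package names and pins are copied from the commands above, while the helper names `installed_versions` and `report` are hypothetical and not part of this repo.

```python
from importlib import metadata

# Pins copied from the install commands above (pycocotools is unpinned there).
PINNED = {
    "torch": "2.2.1",
    "torchvision": "0.17.1",
    "torchaudio": "2.2.1",
    "notebook": "7.1.0",
    "ultralytics": "8.1",
    "pycocotools": None,
    "requests": "2.32.3",
    "click": "8.1.7",
    "opendatasets": "0.1.22",
}


def installed_versions(packages):
    """Return {name: installed version, or None if the package is missing}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def report(pinned):
    """Return one status line per package (hypothetical helper)."""
    lines = []
    for name, have in installed_versions(pinned).items():
        want = pinned[name]
        if have is None:
            status = "MISSING"
        elif want is None or have.startswith(want):
            # Loose prefix match, so local tags such as "2.2.1+cu121" pass.
            status = "ok"
        else:
            status = f"expected {want}"
        lines.append(f"{name:<14}{have or '-':<14}{status}")
    return lines


if __name__ == "__main__":
    print("\n".join(report(PINNED)))
```

The prefix match is deliberately loose. Tighten it if you need an exact version comparison.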

Data structure

Since we use YOLO as our segmentation model, the dataset must follow the YOLO format. The setup_data.py script automatically takes the raw data from Kaggle and converts it into YOLO format. The mydata directory stores the training data for our YOLO model. We also place the additional data (i.e., the Microsoft Building Footprint dataset and the diffusion-augmented data) in mydata.

RSBuildingExtraction/mydata
├── train
│   ├── images
│   └── labels
└── valid
    ├── images
    └── labels
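
For readers unfamiliar with the YOLO segmentation label format, here is a minimal sketch of the kind of conversion involved. It is not the code in setup_data.py. It assumes a COCO-style flat polygon in pixel coordinates and a single building class with index 0, and the function names are hypothetical. Each label line has the form `class x1 y1 x2 y2 ...`, with coordinates normalised by the image width and height.

```python
from pathlib import Path


def coco_polygon_to_yolo_line(polygon, img_w, img_h, class_id=0):
    """Convert a flat COCO polygon [x1, y1, x2, y2, ...] in pixels
    into one YOLO segmentation label line."""
    if len(polygon) < 6 or len(polygon) % 2:
        raise ValueError("polygon needs at least 3 (x, y) points")
    coords = []
    for x, y in zip(polygon[0::2], polygon[1::2]):
        # Normalise to [0, 1] and clamp points that fall slightly outside the image.
        coords.append(min(max(x / img_w, 0.0), 1.0))
        coords.append(min(max(y / img_h, 0.0), 1.0))
    return f"{class_id} " + " ".join(f"{c:.6f}" for c in coords)


def write_label(root, split, image_stem, lines):
    """Write label lines to <root>/<split>/labels/<image_stem>.txt,
    creating the images/ and labels/ folders if they do not exist."""
    split_dir = Path(root) / split
    (split_dir / "images").mkdir(parents=True, exist_ok=True)
    (split_dir / "labels").mkdir(parents=True, exist_ok=True)
    path = split_dir / "labels" / f"{image_stem}.txt"
    path.write_text("\n".join(lines) + "\n")
    return path


# A 200 x 300 px rectangle inside a 500 x 500 px image.
example_line = coco_polygon_to_yolo_line(
    [50, 100, 250, 100, 250, 400, 50, 400], img_w=500, img_h=500
)
```

Here `example_line` is `0 0.100000 0.200000 0.500000 0.200000 0.500000 0.800000 0.100000 0.800000`. Each image in `images` is paired with a `.txt` file of the same stem in `labels`, with one line per building instance.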

Results

Training with Different Instance Segmentation Model

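| Model | Pretrained Weights | Batch Size | Params (M) | FLOPs (G) | Public F1-Score (Conf = 0.50) | Public F1-Score (Conf = 0.20) |
|---|---|---|---|---|---|---|
| YOLOv8n-seg | DOTAv1 Aerial Detection | 16 | 3.4 | 12.6 | 0.510 | 0.645 |
| YOLOv8s-seg | DOTAv1 Aerial Detection | 16 | 11.8 | 42.6 | 0.535 | 0.654 |
| YOLOv8m-seg | DOTAv1 Aerial Detection | 16 | 27.3 | 110.2 | 0.592 | 0.649 |
| YOLOv8x-seg | DOTAv1 Aerial Detection | 8 | 71.8 | 344.1 | 0.579 | 0.627 |
| YOLOv9c-seg | COCO Segmentation | 4 | 27.9 | 159.4 | 0.476 | 0.577 |
| Mask R-CNN (MPViT-Tiny) | COCO Segmentation | 4 | 17 | 196.0 | - | 0.596 |
| EfficientNet-b0-YOLO-seg | ImageNet | 4 | 6.4 | 12.5 | - | 0.560 |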
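
In every row where both confidence thresholds are reported, the lower threshold (0.20) gives the higher public F1-score. The sketch below illustrates why: keeping lower-confidence predictions can raise recall faster than it lowers precision. It assumes each prediction has already been matched against the ground truth, since the challenge's exact matching rule is not described here. The toy predictions are made up for illustration and are not results from this repo.

```python
def f1_at_confidence(predictions, num_ground_truth, conf_threshold):
    """F1-score after dropping predictions below conf_threshold.

    predictions: list of (confidence, is_true_positive) pairs, where the
    true-positive flag is assumed to come from a prior matching step.
    """
    kept = [is_tp for conf, is_tp in predictions if conf >= conf_threshold]
    tp = sum(kept)
    fp = len(kept) - tp
    fn = num_ground_truth - tp
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (illustrative only): 6 ground-truth buildings, 6 predictions.
toy_predictions = [
    (0.90, True), (0.80, True), (0.60, True),
    (0.40, True), (0.30, False), (0.25, True),
]
f1_strict = f1_at_confidence(toy_predictions, num_ground_truth=6, conf_threshold=0.50)
f1_loose = f1_at_confidence(toy_predictions, num_ground_truth=6, conf_threshold=0.20)
```

With this toy data, `f1_strict` is 2/3 (precision 1.0, recall 0.5) and `f1_loose` is 5/6 (precision and recall both 5/6).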

Training with Different Dataset

| Solution | FLOPs (G) | Public F1-Score | Private F1-Score |
|---|---|---|---|
| YOLOv8m-seg + BEGC 2024 | 110.2 | 0.64926 | 0.66531 |
| YOLOv8m-seg + BEGC 2024 + Redmond Dataset | 110.2 | 0.65951 | 0.67133 |
| YOLOv8m-seg + BEGC 2024 + Las Vegas Dataset | 110.2 | 0.68627 | 0.70326 |
| YOLOv8m-seg + BEGC 2024 + Diffusion Augmentation | 110.2 | 0.67189 | 0.68096 |
| 2nd place (RTMDet-x + Alabama Buildings Segmentation Dataset) | 141.7 | 0.6813 | 0.68453 |
| 3rd place (Custom Mask-RCNN + No extra Dataset) | 124.1 | 0.59314 | 0.60649 |
  • We extract our "Redmond dataset" and "Las Vegas dataset" from the Microsoft Building Footprint dataset (see our paper for details). For the diffusion augmentation pipeline, see our segmentation-guided diffusion model.
  • Note that the 2nd-place solution uses a bigger model (higher FLOPs) together with an additional dataset to reach a high F1-score, whereas our diffusion augmentation pipeline lets our smaller model (lower FLOPs) achieve a close F1-score without any additional dataset.
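
To make the size of each improvement concrete, the short sketch below computes the F1 gain of every YOLOv8m-seg variant over the BEGC 2024-only baseline. The scores are copied from the table above. The variable and function names are only for illustration.

```python
# Scores copied from the "Training with Different Dataset" table.
BASELINE = {"public": 0.64926, "private": 0.66531}
VARIANTS = {
    "Redmond Dataset": {"public": 0.65951, "private": 0.67133},
    "Las Vegas Dataset": {"public": 0.68627, "private": 0.70326},
    "Diffusion Augmentation": {"public": 0.67189, "private": 0.68096},
}


def gains(baseline, variants):
    """Absolute F1 gain of each variant over the baseline, per leaderboard split."""
    return {
        name: {split: round(scores[split] - baseline[split], 5) for split in baseline}
        for name, scores in variants.items()
    }


f1_gains = gains(BASELINE, VARIANTS)
for name, delta in f1_gains.items():
    print(f"{name:<24} public +{delta['public']:.5f}  private +{delta['private']:.5f}")
```

The Las Vegas data gives the largest gain (+0.03701 public, +0.03795 private), followed by diffusion augmentation (+0.02263, +0.01565) and the Redmond data (+0.01025, +0.00602).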

Inference with Different NMS IoU Threshold

Private F1-score at each NMS IoU threshold:

| Dataset | 0.70 | 0.75 | 0.80 | 0.85 | 0.90 | 0.95 |
|---|---|---|---|---|---|---|
| BEGC2024 + Redmond Dataset | 0.672 | 0.677 | - | - | 0.748 | 0.866 |
| BEGC2024 + Las Vegas Dataset | 0.703 | 0.693 | 0.686 | 0.721 | 0.766 | 0.897 |
| BEGC2024 + Diffusion Augmentation | 0.681 | - | 0.694 | 0.711 | 0.751 | 0.887 |
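
The NMS IoU threshold controls how much two detections may overlap before the lower-scoring one is discarded. Below is a minimal pure-Python sketch of greedy NMS on axis-aligned boxes, showing how a higher threshold keeps more overlapping detections. It is not the implementation used by this repo. Ultralytics exposes the threshold as the `iou` prediction argument, and how this repo sets it is not shown here. The toy boxes are made up for illustration.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold):
    """Greedy NMS: return indices of kept boxes, highest score first.

    A box is dropped when its IoU with an already kept box exceeds
    iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(box_iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


# Toy example: two heavily overlapping boxes (IoU = 90 / 110) and one separate box.
toy_boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
toy_scores = [0.9, 0.8, 0.7]
kept_strict = nms(toy_boxes, toy_scores, iou_threshold=0.70)
kept_loose = nms(toy_boxes, toy_scores, iou_threshold=0.95)
```

With a threshold of 0.70 the second box is suppressed (`kept_strict` is `[0, 2]`). With 0.95 all three survive (`kept_loose` is `[0, 1, 2]`).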

Acknowledgement

We thank the following works for inspiring this repo!

  1. 2024 IEEE BigData Cup: Building Extraction Generalization Challenge link
  2. Ultralytics YOLO code
  3. MPViT-based Mask RCNN code
  4. COCO2YOLO format original code, modified code

Cite this repository

Our paper has been accepted by IEEE BigData 2024! If this repo helps your research, please cite our paper. The preprint is available here.

@InProceedings{Wong2024,
  title     = {Cross-City Building Instance Segmentation: From More Data to Diffusion-Augmentation},
  author    = {Yi Jie Wong and Yin-Loon Khor and Mau-Luen Tham and Ban-Hoe Kwan and Anissa Mokraoui and Yoong Choon Chang},
  booktitle = {2024 IEEE International Conference on Big Data (Big Data)},
  year      = {2024}
}