
Cityscape Segmentation using Segformer Architecture

Project Overview

This project implements semantic segmentation on the Cityscape dataset using the Segformer architecture. The model is trained to segment urban scenes into 19 classes, including roads, buildings, cars, and pedestrians.

Table of Contents

- Introduction
- Dataset
- Architecture

Introduction

Semantic segmentation is a pixel-level classification task aimed at assigning a class label to every pixel in an image. This project uses the Segformer architecture to perform segmentation on the Cityscape dataset, which contains images of urban environments.
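As a minimal sketch of what pixel-level classification means in practice (the tensor shapes and the ignore index below are illustrative assumptions, not this project's actual training configuration):

```python
import torch
import torch.nn as nn

# Made-up shapes for the example: batch of 2 images, 19 classes, 128x256 predictions.
num_classes = 19
logits = torch.randn(2, num_classes, 128, 256)           # per-pixel class scores (B, C, H, W)
targets = torch.randint(0, num_classes, (2, 128, 256))   # per-pixel ground-truth labels (B, H, W)

# Pixel-level classification reduces to cross-entropy evaluated at every pixel;
# ignore_index=255 is a common convention for "void" pixels in Cityscape-style labels.
criterion = nn.CrossEntropyLoss(ignore_index=255)
loss = criterion(logits, targets)
print(loss.item())
```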

Dataset

At the time of this project, a total of 19 classes were considered. The data comprised 2380 training images, 595 validation images, and 500 test images. For more information on the dataset, visit the official Cityscape Dataset.
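As a rough illustration of how such a split might be consumed in PyTorch, here is a minimal dataset sketch; the class name, directory layout, and transform hook are assumptions for illustration, not this repository's actual loader:

```python
import os
from PIL import Image
from torch.utils.data import Dataset

class CityscapeSegDataset(Dataset):
    """Pairs each RGB image with a single-channel mask of class IDs (0-18)."""

    def __init__(self, image_dir, mask_dir, transform=None):
        self.image_dir = image_dir
        self.mask_dir = mask_dir
        self.names = sorted(os.listdir(image_dir))  # assumes matching file names in both dirs
        self.transform = transform

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        image = Image.open(os.path.join(self.image_dir, name)).convert("RGB")
        mask = Image.open(os.path.join(self.mask_dir, name))  # label IDs, one per pixel
        if self.transform is not None:
            image, mask = self.transform(image, mask)
        return image, mask
```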

Architecture

Segformer is a transformer-based model designed for semantic segmentation tasks. It leverages both local and global feature representations, providing a robust and scalable solution for vision tasks. This project implements Segformer from scratch, with the model built using a modular PyTorch approach.

For more details, refer to the original author of this architecture here.
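For a feel of what the Segformer encoder does differently from a plain ViT, the sketch below shows its spatially reduced self-attention. The class, argument names, and the usage shapes at the bottom are illustrative assumptions rather than the modules used in this repository:

```python
import torch
import torch.nn as nn

class EfficientSelfAttention(nn.Module):
    """Segformer-style attention: keys/values are computed from a spatially
    reduced feature map (sr_ratio), keeping global attention affordable."""

    def __init__(self, dim, num_heads, sr_ratio):
        super().__init__()
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, dim * 2)
        self.proj = nn.Linear(dim, dim)
        self.sr_ratio = sr_ratio
        if sr_ratio > 1:
            # A strided conv shrinks the (H, W) grid the keys/values come from.
            self.sr = nn.Conv2d(dim, dim, kernel_size=sr_ratio, stride=sr_ratio)
            self.norm = nn.LayerNorm(dim)

    def forward(self, x, H, W):
        B, N, C = x.shape  # tokens are a flattened H*W feature grid
        q = self.q(x).reshape(B, N, self.num_heads, C // self.num_heads).permute(0, 2, 1, 3)

        if self.sr_ratio > 1:
            x_ = x.permute(0, 2, 1).reshape(B, C, H, W)          # back to a 2-D feature map
            x_ = self.sr(x_).reshape(B, C, -1).permute(0, 2, 1)  # reduced token sequence
            x_ = self.norm(x_)
        else:
            x_ = x
        kv = self.kv(x_).reshape(B, -1, 2, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4)
        k, v = kv[0], kv[1]

        attn = (q @ k.transpose(-2, -1)) * self.scale  # (B, heads, N, reduced N)
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)

# Illustrative usage: 2 images, a 64x128 token grid, 64-dim tokens.
attn = EfficientSelfAttention(dim=64, num_heads=2, sr_ratio=8)
tokens = torch.randn(2, 64 * 128, 64)
out = attn(tokens, H=64, W=128)   # -> (2, 8192, 64)
```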
