Single Image Dehazing with Convolutional Vision Transformer (CVT)

Description

This project implements a Single Image Dehazing model using a Convolutional Vision Transformer (CVT). The model is designed to remove haze from images, thereby enhancing the clarity and quality of the visual content. The CVT model leverages convolutional layers for feature extraction and self-attention mechanisms to capture long-range dependencies in the image data. The project includes training and testing scripts, as well as utilities for data loading, augmentation, and metric calculation (PSNR and SSIM) to evaluate model performance.

Introduction

Image dehazing is a common preprocessing step for computer vision applications, especially those operating outdoors, where haze can significantly degrade image quality. This project addresses the problem with a Convolutional Vision Transformer, which pairs convolutional feature extraction with transformer self-attention.

Model Architecture

The model is a Convolutional Vision Transformer (CVT), which combines the strengths of convolutional neural networks (CNNs) and transformer networks. It includes:

Convolutional Layers for initial feature extraction.
Self-Attention Mechanisms for capturing long-range dependencies.
Feedforward Networks for further processing of features.
Upsampling Layers to reconstruct the dehazed image.
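
The four components listed above can be illustrated with a small PyTorch sketch. This is not the network from CVTDehazer.ipynb: the class names (AttentionBlock, TinyCVTDehazer), the channel width, the number of heads, and the 4x downsampling factor are all assumptions chosen to keep the example short. It assumes input height and width are divisible by 4.

```python
import torch
import torch.nn as nn


class AttentionBlock(nn.Module):
    """Self-attention plus a feedforward network over a flattened feature map.

    Hypothetical block for illustration; sizes are assumptions.
    """

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, 2 * dim), nn.GELU(), nn.Linear(2 * dim, dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, C): one token per pixel
        normed = self.norm1(tokens)
        attn_out, _ = self.attn(normed, normed, normed)  # long-range dependencies
        tokens = tokens + attn_out                       # residual connection
        tokens = tokens + self.ffn(self.norm2(tokens))   # feedforward + residual
        return tokens.transpose(1, 2).reshape(b, c, h, w)


class TinyCVTDehazer(nn.Module):
    """Convolutional stem -> attention block -> upsampling head (illustrative)."""

    def __init__(self, dim: int = 32):
        super().__init__()
        # Two strided convolutions: feature extraction with 4x downsampling
        self.stem = nn.Sequential(
            nn.Conv2d(3, dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.block = AttentionBlock(dim)
        # Upsample back to the input resolution and map to RGB in [0, 1]
        self.up = nn.Sequential(
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(dim, 3, 3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, hazy: torch.Tensor) -> torch.Tensor:
        return self.up(self.block(self.stem(hazy)))


if __name__ == "__main__":
    model = TinyCVTDehazer()
    dehazed = model(torch.rand(1, 3, 64, 64))
    print(dehazed.shape)
```

Downsampling before attention keeps the token count manageable, since self-attention cost grows quadratically with the number of tokens.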

Dataset Preparation

Dataset: RESIDE-6K

To train the dehazing model, you need a dataset containing pairs of hazy and clear images. The dataset should be organized in the following structure:

data/
├── train/
│   ├── hazy/
│   └── GT/
├── valid/
│   ├── hazy/
│   └── GT/
└── test/
    ├── hazy/
    └── GT/

hazy/: Directory containing hazy images.
GT/: Directory containing ground truth clear images.
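
As a minimal sketch of how this layout could be read, the helper below collects (hazy, GT) file pairs for one split. The function name list_image_pairs and the extension set are hypothetical, and the sketch assumes each hazy image has a ground truth file with the identical file name, which the README does not state.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the actual dataset files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_image_pairs(root, split):
    """Return sorted (hazy_path, gt_path) pairs for one split (train/valid/test)."""
    hazy_dir = Path(root) / split / "hazy"
    gt_dir = Path(root) / split / "GT"
    pairs = []
    for hazy_path in sorted(hazy_dir.iterdir()):
        if hazy_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # skip non-image files
        gt_path = gt_dir / hazy_path.name  # assumption: identical file names
        if gt_path.is_file():
            pairs.append((hazy_path, gt_path))
    return pairs
```

For example, list_image_pairs("data", "train") would return the pairs found under data/train/. Hazy images without a matching ground truth file are skipped rather than raising an error.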

Results

The model's performance is evaluated using Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM). The metrics for each epoch are recorded and saved for further analysis.
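
For reference, PSNR is defined as 10 * log10(MAX^2 / MSE), where MAX is the largest possible pixel value and MSE is the mean squared error between the two images. The NumPy sketch below implements that definition. The function name psnr and the default data_range of 1.0 (images scaled to [0, 1]) are assumptions, and the notebook may compute the metric differently. scikit-image, listed under Dependencies, also provides skimage.metrics.peak_signal_noise_ratio and skimage.metrics.structural_similarity.

```python
import numpy as np


def psnr(reference, estimate, data_range=1.0):
    """Peak Signal-to-Noise Ratio in dB between two images of equal shape.

    data_range is the maximum possible pixel value (1.0 for float images
    in [0, 1], 255 for 8-bit images); the default here is an assumption.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))
```

A uniform error of 0.1 on a [0, 1] image gives an MSE of 0.01 and therefore a PSNR of 20 dB. Higher values indicate a closer match to the ground truth.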

Hazy Image	Dehazed Image (after 10 epochs)

Sample Metrics Table

Epoch	Loss	PSNR (dB)	SSIM
1	0.49	3.50	0.0018
...	...	...	...
10	0.0199	19.11	0.62
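
One way to record such per-epoch metrics for later analysis is a CSV file. The sketch below uses only the standard library; the function names save_metrics and load_metrics and the file name metrics.csv are hypothetical, and the example rows are the two epochs shown in the table above.

```python
import csv

FIELDS = ["epoch", "loss", "psnr", "ssim"]


def save_metrics(rows, path):
    """Write a list of per-epoch metric dicts to a CSV file with a header row."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def load_metrics(path):
    """Read the CSV back, converting each column to a numeric type."""
    with open(path, newline="") as f:
        return [
            {
                "epoch": int(row["epoch"]),
                "loss": float(row["loss"]),
                "psnr": float(row["psnr"]),
                "ssim": float(row["ssim"]),
            }
            for row in csv.DictReader(f)
        ]


if __name__ == "__main__":
    history = [
        {"epoch": 1, "loss": 0.49, "psnr": 3.50, "ssim": 0.0018},
        {"epoch": 10, "loss": 0.0199, "psnr": 19.11, "ssim": 0.62},
    ]
    save_metrics(history, "metrics.csv")  # hypothetical output path
    print(load_metrics("metrics.csv"))
```

The resulting file can also be loaded with pandas.read_csv, since pandas is among the listed dependencies.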

Dependencies

Python 3.8 or higher
PyTorch
torchvision
numpy
scikit-image
opencv-python
pandas

Install the required packages using:

conda install --file requirements.txt -c pytorch -c nvidia
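
After installation, a quick check like the following can confirm that the packages are importable. It is a minimal sketch: the helper name missing_modules is hypothetical, and the list uses the import names of the dependencies above (scikit-image imports as skimage, opencv-python as cv2).

```python
from importlib.util import find_spec

# Import names for the packages listed under Dependencies
REQUIRED_MODULES = ["torch", "torchvision", "numpy", "skimage", "cv2", "pandas"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names of top-level modules that cannot be found."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies found.")
```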

License

This project is licensed under the MIT License. See the LICENSE file for details.
