Machine learning model building on a previous semester's project. I will be rewriting the code to put it in a form that works on more operating systems and setups.

FogEye - Computational DeFogging via Image-to-Image Translation on a real-world Dataset

GitHub.io page link

The FogEye dataset is available here: OneDrive

Graphical Abstract

Graphical abstract

Overview of the FogEye project. a): A diagram summarizing the work done in this work. b): Example results obtained by applying the pix2pix framework to the FogEye dataset. Our approach works for a range of fog densities.

News

nothing to show here


FogEye logo

logo image attributions: U of U | DAAD

This repository documents a research project carried out at the Laboratory for Optical Nanotechnologies at the University of Utah under the supervision of Prof. Rajesh Menon in Spring (January-April) 2024. It was funded by Prof. Rajesh Menon.

real image foggy image reconstructed image

Adafruit Feather 32u4 Radio board

Looping through the epochs of a trained model


Goal

This project has four objectives:

  1. add HDR to the existing cameras
  2. collect a dataset of paired images and add it to the existing dataset
  3. apply the pix2pix model developed at the University of California, Berkeley to the fog → no fog translation problem
  4. compare goodness of fit across metrics

Potential applications

  • Autonomous driving
  • Search & rescue (wildfires, home fires, etc.)
  • Military

Project timeline

The project was initially carried out over the course of three months, from July to September 2023. Continuing contributions have been made from January to April 2024.

Image capturing device


Requirements

The device had to be able to:

  • accommodate two cameras
  • isolate the cameras from each other
  • provide a fog chamber for one of the cameras
  • trigger both cameras at the same time

The shift in perspective caused by the distance between the two cameras is ignored; the farther away the photographed scenes are, the smaller its effect on the resulting images.
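For intuition, a rough estimate of this parallax in pixels can be made with the standard stereo relation: pixel shift ≈ focal length (in pixels) × baseline / scene depth. The baseline and focal length in the sketch below are illustrative assumptions, not measured values for this rig.

# Rough parallax estimate: pixel_shift ≈ focal_length_px * baseline / depth.
# The baseline (5 cm) and focal length (600 px) are illustrative assumptions,
# not measured values for this device.
def parallax_px(baseline_m, depth_m, focal_length_px=600):
    return focal_length_px * baseline_m / depth_m

for depth in (2, 10, 50):  # scene depth in meters
    print(f"{depth} m -> {parallax_px(0.05, depth):.1f} px shift")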

Cameras

The two identical cameras used for this project had to be:

  • programmable
  • able to interface with other devices
  • small & lightweight
  • low power

Therefore, we chose the OpenMV H7 camera for the task. The OpenMV IDE makes it easy to program the camera using Python. The cameras can receive input on their I/O pins as well as give user feedback via their LEDs.

OpenMV H7 camera

OpenMV H7 camera

** Note: an OpenMV IDE license must be purchased for $15 **

HDR

To improve the performance of our machine learning model, we implemented HDR processing for our training image datasets. By merging multiple shots of the same scene taken at different exposures, HDR provides greater contrast and detail, reduced image artifacts, and a wider range of luminance, improving the information provided to the algorithm. The HDR pipeline is implemented in two stages. First, four images of the same scene are recorded by the camera at different exposures centered around the auto-exposure value. Those images are then merged in post-processing on a separate computer. Potentially, both steps of the process could be handled on the device by moving to a Raspberry Pi.
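As an illustration of the merging step, the sketch below fuses a 4-exposure burst with OpenCV's exposure-fusion (Mertens) merger, which does not need exposure metadata. The file names are placeholders; the full Debevec-based pipeline actually used in this project is listed in the HDR section of the appendix.

# Minimal exposure-fusion sketch (Mertens merge, no exposure times needed).
# File names are placeholders; see the appendix for the full HDR pipeline.
import cv2
import numpy as np

paths = ["1.bmp", "2.bmp", "3.bmp", "4.bmp"]        # one 4-exposure burst
images = [cv2.imread(p) for p in paths]

merged = cv2.createMergeMertens().process(images)   # float image in [0, 1]
result = np.clip(merged * 255, 0, 255).astype("uint8")
cv2.imwrite("merged_hdr.bmp", result)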

Image trigger

In order to get exactly paired images from both cameras captured at the same time, it is necessary to introduce a common trigger. Prior stewards of this project used a lightweight Arduino board for this task. Any Arduino board should be capable of sending this trigger, but an Adafruit Feather 32u4 Radio that was available from an earlier project was used. The board is connected to both cameras and sends a trigger signal to both at the same time. The cameras are programmed to capture an image when they receive the trigger signal (see read_external_trigger).

For more on the Arduino setup, go here

Adafruit Feather 32u4 Radio board

Adafruit Feather 32u4 Radio board

Currently, the camera is set up for HDR 4-exposure bursts (see push_button_trigger), and the wiring has been changed over to a push button. The push-button switch is wired between P0 and ground.
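For reference, a minimal MicroPython capture loop of the kind running on the OpenMV cameras is sketched below. The pin configuration, resolution, and file naming are assumptions for illustration; the scripts actually used are read_external_trigger and push_button_trigger in this repository.

# Minimal OpenMV (MicroPython) capture-on-trigger sketch.
# Pin, resolution, and file naming are assumptions; see read_external_trigger
# and push_button_trigger for the scripts actually used.
import sensor
import pyb

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.skip_frames(time=2000)

trigger = pyb.Pin("P0", pyb.Pin.IN, pyb.Pin.PULL_UP)  # push button to ground
led = pyb.LED(1)
count = 0

while True:
    if trigger.value() == 0:           # button pressed (pin pulled low)
        led.on()
        img = sensor.snapshot()
        count += 1
        img.save("/%d.bmp" % count)
        led.off()
        while trigger.value() == 0:    # wait for release (crude debounce)
            pass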

Camera Setup Back

Camera Setup Back

Gimbal

Gimbal model used in this project

Gimbal model used in this project

In order to stabilize the images while walking and ensure they are approximately level, a gimbal was used to hold the entire device. The gimbal used for this project was the Hohem iSteady Q. It is a lightweight single-axis gimbal that is able to hold a smartphone.

** For the HDR imaging the gimbal was deemed insufficient; a tripod was used instead **

Case

In order to be able to capture approximately the same image, the cameras had to be mounted as close together as possible. Simultaneously, the case must be able to hold the fog surrounding one camera while isolating the other camera from the influence of the fog, keeping all other conditions the same. Therefore, both cameras are arranged side by side, inside separate chambers.

The case was designed in Fusion 360. Some 3D-printed parts were printed on a Creality Ender 3 Pro, others on an Ultimaker S3. The front plate was laser-cut on a CO2 laser cutter.

CAD design of the gimbal mount

CAD design of the gimbal mount

CAD design of the gimbal bridge

CAD design of the gimbal bridge

Front view of entire CAD design

Front view of entire CAD design

Rear view of entire CAD design

Rear view of entire CAD design

Handheld fogger

** Various portable foggers are available, but ordering and shipping take time. **

Bill of Materials (BoM)

The following components are required for the device:

Purchased Parts

*Initial purchase

  • 2x OpenMV H7 camera
  • 1x Adafruit Feather board (or any other microcontroller capable of this task)
  • 1x Hohem iSteady Q gimbal
  • 2x Toggle switch (any latching switch that can be used to trigger the cameras)
  • 1x Breadboard 30x70mm
  • 2x Rubber stoppers
  • External USB Power Supply

*Secondary purchase

  • 1x Push Button Switch
  • Various connecting wires
  • USB power adapter

Manufactured parts

  • Back box
  • Front plate
  • Front camera screw terminal
  • Gimbal mount
  • Gimbal bridge
  • Hinge
  • Lock body
  • Lock catch
  • Lock receptor
  • Maintenance door with hinge
  • Maintenance door brace
  • Rear camera standoff
  • Top plate
  • Wire restraint

CAD file attributions

Several parts of the CAD model were adopted from different sources. They are attributed in the following:

Part | Source | License
OpenMV camera | GrabCAD | unknown
Adafruit Feather board | Adafruit | MIT
Prototyping board | GrabCAD | unknown
Toggle switch | GrabCAD | unknown
DIN912 M3 25mm screw | 3Dfindit | unknown

Model Training

The models were trained either on a personal laptop computer with 64 GB of RAM, on a lab computer with a dedicated GPU (NVIDIA GeForce GTX 970) and 64 GB of RAM, or on the University of Utah's Center for High Performance Computing (CHPC) cluster. All models were trained for the pix2pix default of 200 epochs. The training time increased with the size of the dataset; for the final model, it was around 20 hours.

Getting started

Descriptions of how to get up and running


1. Cloning the repository

Clone the repository using git:

git clone https://github.com/apoll2000/FogEye.git

Navigate into the repository:

cd FogEye

2. Installing a Python environment

Next, an appropriate Python environment needs to be created. All code was run on Python 3.9.7. For creating the environment, either conda or pyenv virtualenv can be used.


The environment can be created using conda with:

conda create --name FogEye python=3.9.7

Or using pyenv virtualenv with:

pyenv virtualenv 3.9.7 FogEye

Then activate the environment with:

conda activate FogEye

Or:

pyenv activate FogEye

The required packages can then be installed using pip (for conda environments, run

conda install pip

first to install pip). The packages are listed in requirements.txt and can be installed with:

pip install -r requirements.txt

In case you want to install them manually, the packages include:

  • numpy
  • torch
  • opencv-python
  • matplotlib
  • ...

It is important that you specify the right torch version if you would like to use your CUDA-enabled GPU to train the model, which will drastically reduce training time. See the PyTorch website for more information.
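Before training, it can be worth verifying that the installed torch build actually sees the GPU. This is generic PyTorch, not project-specific code:

# Quick check that the installed torch build can use a CUDA GPU.
import torch

print("torch", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))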

3. Downloading the dataset

The dataset is currently being hosted here: TUBCloud. Depending on the further development of the project, this might not be its final storage location.

Place the FogEye_images folder into the datasets folder of the repository:

-- datasets
    |-- FogEye_images
        |-- 2023-08-03-04
            |-- A
                |-- 01-04_08_23__1.bmp
                |-- 01-04_08_23__2.bmp
                |-- ...
            |-- B
                |-- 01-04_08_23__1.bmp
                |-- 01-04_08_23__2.bmp
                |-- ...
        |-- ...

4. Preparing the dataset

The dataset needs to be prepared for training. This includes transforming the folder structure into one compatible with the pix2pix framework and splitting the dataset into training, validation and testing sets. It can be performed using the following command:

python preprocess_FogEye_dataset.py --dataroot path/to/dataset
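For orientation, a split along these lines could look like the sketch below. The actual logic lives in preprocess_FogEye_dataset.py; the split ratios and folder layout here are assumptions for illustration only.

# Illustrative train/val/test split of paired images (the A/B pairing follows
# the dataset layout above). The real logic is in preprocess_FogEye_dataset.py;
# ratios and folder layout here are assumptions.
import os
import random
import shutil

def split_pairs(pair_dir, out_dir, ratios=(0.8, 0.1, 0.1), seed=0):
    names = sorted(os.listdir(os.path.join(pair_dir, "A")))  # same names exist in B
    random.Random(seed).shuffle(names)
    n_train = int(len(names) * ratios[0])
    n_val = int(len(names) * (ratios[0] + ratios[1]))
    splits = {"train": names[:n_train],
              "val": names[n_train:n_val],
              "test": names[n_val:]}
    for split, files in splits.items():
        for side in ("A", "B"):
            dst = os.path.join(out_dir, side, split)
            os.makedirs(dst, exist_ok=True)
            for f in files:
                shutil.copy(os.path.join(pair_dir, side, f), os.path.join(dst, f))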

5. Training a model

The model training can be started using the following command:

python train.py --dataroot path/to/dataset --name name_of_model --model pix2pix --direction BtoA --gpu_ids 0

6. Testing a model

python test.py --dataroot path/to/dataset --direction BtoA --model pix2pix --name name_of_model

Ample information on the training and testing process and their parameters can be found on the pix2pix GitHub page.

7. Helper scripts

This GitHub page includes several helper scripts to perform different actions like hyperparameter tuning or epoch visualization.

These are: Preprocessing:

  • preprocess_FogEye_dataset.py Hyperparameter tuning:
  • hyperparameter_dropoutRate.py
  • hyperparameter_GAN.py
  • hyperparameter_init_type.py
  • hyperparameter_lr_policy.py
  • hyperparameter_n_layers_D.py
  • hyperparameter_netD.py
  • hyperparameter_netG.py
  • hyperparameter_ngf_ndf.py
  • hyperparameter_normalization.py
  • hyperparameter_Res9AndMore.py
  • hyperparameter_supertraining.py Visualization:
  • plot_model_results.py
  • evaluate_model_group.py

Synthetic data


At the beginning of the project, we experimented with synthetic datasets in combination with the pix2pix model. The datasets used were based on the Cityscapes dataset as well as on images derived from the CARLA simulator. The fog simulations generally work either by directly using a depth map that is available for each particular image, or by using the left and right images to calculate the depths in the images, thus reconstructing this depth map. This depth map helps in estimating how strongly the fog affects different parts of the image.
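Such depth-based fog simulation is often expressed with the atmospheric scattering model, foggy = clear * t + airlight * (1 - t), with transmission t = exp(-beta * depth). The sketch below applies this generic model to an image and its depth map; it is an illustration of the idea, not the exact method of the papers cited below, and beta and the airlight value are assumed.

# Generic depth-based fog synthesis via the atmospheric scattering model:
#   foggy = clear * t + airlight * (1 - t),   t = exp(-beta * depth)
# beta and the airlight value are assumptions, not parameters from the cited papers.
import cv2
import numpy as np

def add_fog(clear_bgr, depth_m, beta=0.08, airlight=255.0):
    t = np.exp(-beta * depth_m).astype(np.float32)   # per-pixel transmission
    t = t[..., None]                                 # broadcast over color channels
    foggy = clear_bgr.astype(np.float32) * t + airlight * (1.0 - t)
    return np.clip(foggy, 0, 255).astype(np.uint8)

clear = cv2.imread("clear.png")     # placeholder file names
depth = np.load("depth.npy")        # depth map in meters, same height/width as the image
cv2.imwrite("foggy.png", add_fog(clear, depth))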

Semi-synthetic datasets

The datasets in the following are semi-synthetic, meaning that they work with real images, to which the fog has been added synthetically. A disadvantage of this method is that the depth map is never perfect, which can lead to artifacts in the fogged images.

Foggy Cityscapes from Uni Tübingen

In cooperation with the researchers Georg Volk and Jörg Gamerdinger from the University of Tübingen, Germany, we trained a model on synthetic data generated for their paper "Simulating Photo-realistic Snow and Fog on Existing Images for Enhanced CNN Training and Evaluation".

Foggy Cityscapes from ETH Zürich

Another dataset taken into consideration was the Foggy Cityscapes dataset from the paper "Semantic Foggy Scene Understanding with Synthetic Data" by Sakaridis et al. The dataset was created by the Computer Vision Lab of ETH Zürich, Switzerland.

Fully synthetic datasets

The following dataset was created entirely synthetically. The original images were rendered using a driving simulator, which generated the matching perfect depth maps as well. This way, the fogged images do not show any artifacts.

Foggy CARLA from Uni Tübingen

This dataset was created by the researchers Georg Volk and Jörg Gamerdinger from the University of Tübingen, Germany, using the same technique from the paper "Simulating Photo-realistic Snow and Fog on Existing Images for Enhanced CNN Training and Evaluation". It is based on the CARLA simulator.

Collected dataset

Description and details of the collected dataset

Originally, approximately 10,000 images of QVGA (240x320 pixels) quality were collected in the RGB565 (5 bits for red and blue, 6 bits for green) format. For the implementation of HDR, 244 images were collected in the original format and combined into 61 HDR images, and 2192 images were collected in VGA (480x640 pixels) quality in the raw sensor data format.
  • ~10.3k QVGA (240x320) RGB565 images have been collected
  • 2192 raw data images were collected
  • 2436 images were collected for compression into 609 HDR images

pix2pix on dataset

While HDR processing reduced the total number of images available for training, the collection of raw data offset this reduction through the use of differing levels of gamma correction. By varying the gamma, different details are revealed or accentuated.

*ML results on dataset*

Limitations

An initial limitation of the first iteration of the ML model was the homogeneity of the collected data. While the initial dataset was larger than the one collected by the second group of participants on this project, it was collected in a small geographic area and under fairly uniform Utah summer conditions. The second dataset expands the diversity of settings and weather significantly.

A second limitation was the cameras themselves. Only recently have the cameras been set up to capture data in the raw format, and only at VGA resolution; all other resolutions are locked into the RGB565 format. While training at lower resolution for proof of concept is advantageous due to the lower computational time, flexibility in selecting resolution and format is desirable. Additionally, the ability to do the HDR processing in conjunction with image capture would have sped up the process. A platform like the Raspberry Pi 5, with its capacity to operate two cameras simultaneously and then process images in the desired format, would have simplified the operation, at about the same price point as the OpenMV cameras.

Licensing

Code

The code is licensed under the BSD 3-Clause License, available under CODE_LICENSE. -> this is taken from pyramid pix2pix

The parts of the code that were adopted from the pix2pix project are licensed under ... MAKE SURE NOT TO VIOLATE PIX2PIX BSD LICENSE HERE

Dataset

The dataset is licensed under the Creative Commons Attribution 4.0 International License, available under DATASET_LICENSE.

-> or should this be CC-BY-NC (non-commercial?)

Hardware

The hardware is licensed under the CERN Open Hardware License v2 - Weakly Reciprocal (CERN-OHL-W v2), available under HARDWARE_LICENSE.

Citation

If you use the dataset or any of the code in this repository created by us, please cite the following paper:

@misc{Welch2024FogEye,
      title={FogEye -- Computational DeFogging via Image-to-Image Translation on a real-world Dataset}, 
      author={Chandler Welch},
      year={2024}
}

Add our paper here

References

Appendix


Fog Decay Study

We conducted a study of how quickly the fog decays in order to better understand how often it needs to be replenished. This was done by filling the fog chamber, letting the fog decay, and filming the entire decay using both cameras. The resulting video from the fogged camera was analyzed by calculating the Variance of the Laplacian of each frame as a metric for the intensity of the fog. You can see that after about 5 minutes, the fog intensity becomes quite low.

Fog decay

Fog decay measurement over time
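The per-frame metric can be computed with OpenCV as the variance of the Laplacian of each grayscale frame, as in the sketch below (the video file name is a placeholder).

# Variance of the Laplacian for each frame of a video, used here as a proxy
# for fog intensity. The video file name is a placeholder.
import cv2

cap = cv2.VideoCapture("fog_decay.mp4")
scores = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    scores.append(cv2.Laplacian(gray, cv2.CV_64F).var())
cap.release()
if scores:
    print(len(scores), "frames; first:", scores[0], "last:", scores[-1])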

Best Performing Metric

We also conducted a study of which metric correlated best with the Variance of the Laplacian, using the coefficient of determination, R^2, to determine goodness of fit. R^2 describes the proportion of variance in the dependent variable, the evaluation metric (y axis), that can be explained by the independent variable, the Variance of the Laplacian (x axis).
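For reference, R^2 for a simple least-squares linear fit between an evaluation metric and the Variance of the Laplacian can be computed as in the sketch below. The arrays are placeholders, and the study itself also used robust (bisquare) fits, which this sketch does not implement; see the figure captions below.

# R^2 of a least-squares linear fit between an image-quality metric (y) and
# the Variance of the Laplacian (x). The arrays are placeholders; robust
# (bisquare) fitting as used in the study is not implemented here.
import numpy as np

var_laplacian = np.array([120.0, 95.0, 70.0, 40.0, 20.0])   # independent variable (x)
metric = np.array([0.82, 0.74, 0.61, 0.43, 0.30])           # dependent variable (y)

slope, intercept = np.polyfit(var_laplacian, metric, 1)
pred = slope * var_laplacian + intercept
ss_res = np.sum((metric - pred) ** 2)
ss_tot = np.sum((metric - np.mean(metric)) ** 2)
print("R^2 =", 1 - ss_res / ss_tot)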

Pearson vs. Laplacian - Robust fit: None, Center and Scale.

MSE vs. Laplacian - Robust fit: None, Center and Scale.

SSIM vs. Laplacian - Robust fit: Bisquare, Center and Scale.

CW-SSIM vs. Laplacian - Robust fit: Bisquare, Center and Scale.

NCC vs. Laplacian - Robust fit: Bisquare, Center and Scale.

HDR

This code implements the High Dynamic Range (HDR) processing pipeline.

'''
Requirements for this code:
    pip install opencv-python
    pip install numpy
    pip install OpenEXR
    pip install Imath
    pip install rawpy
    pip install imageio
'''
import os
import argparse
import shutil
import datetime

import cv2
import numpy as np
import rawpy
import imageio

# dataroot = './pre_hdr/3_31_24_fog'
# output_dataroot = './post_hdr/hdr_images/fog'


def make_dir_hdr(dataroot):
    '''
    Converts .raw files to .bmp and creates an individual directory for each set
    of 4 images to be processed into HDR. This is an organizational helper: it
    also aids debugging and keeping track of the images, since you can easily
    see which images belong to which set and adjust offsets accordingly.
    '''
    # One subdirectory per group of 4 exposures
    n_sets = len(os.listdir(dataroot)) // 4
    for i in range(n_sets):
        os.makedirs(f'{dataroot}/{i}', exist_ok=True)

    for file in os.listdir(dataroot):
        stem, ext = os.path.splitext(file)  # file names are sequential numbers, e.g. "12.raw"

        # If files are in raw format they are converted to bmp
        if ext == '.raw':
            raw = rawpy.imread(f'{dataroot}/{file}')
            set_idx = (int(stem) - 1) // 4      # group every 4 consecutive exposures
            rgb = raw.postprocess()
            imageio.imsave(f'{dataroot}/{set_idx}/{stem}.bmp', rgb)

        if ext == '.bmp':  # files that are already in bmp format
            set_idx = (int(stem) - 1) // 4
            shutil.move(f'{dataroot}/{file}', f'{dataroot}/{set_idx}/{file}')


print(datetime.date.today())

# Path to the folder containing the images to be processed into HDR
# dataroot = "./pre_hdr/3_31_24_fog"
# out_dataroot = "./post_hdr/hdr_images"


def create_hdr_images(dataroot, out_dataroot='./post_hdr/hdr_images', HDR_type='Debevec'):
    '''
    This function creates HDR images from the image sets in the dataroot folder.
    '''
    # Relative exposure multipliers of the 4-image burst around the base exposure
    exposure = np.array([0.5, 0.75, 1.5, 2], dtype=np.float32)

    # The dataroot must contain exactly one .txt file listing the base exposure times
    txt_files = [f for f in os.listdir(dataroot) if f.endswith('.txt')]
    if len(txt_files) != 1:
        raise AssertionError(
            "\nThere should be only one .txt file in the dataroot folder."
            "\nThis .txt file should contain the exposure times for the images in the folder."
            "\nIf you don't have the exposure times, consider changing the HDR_type to 'Mertens' or 'Robertson'.\n")

    with open(f'{dataroot}/{txt_files[0]}', 'r') as f:
        exposure_ts = f.readlines()

    if not os.path.exists(out_dataroot):
        os.makedirs(out_dataroot)

    # Assuming each image set to be processed into HDR is in a separate subfolder
    for subdir in os.listdir(dataroot):
        subdir_path = os.path.join(dataroot, subdir)
        if os.path.isdir(subdir_path):
            temp_idx = int(subdir)  # subfolder names are the set indices created by make_dir_hdr
            images = []
            for filename in sorted(os.listdir(subdir_path), key=lambda x: int(x.split('.')[0])):
                file_path = os.path.join(subdir_path, filename)
                # Adjust the flags if your images are not standard 8-bit or 16-bit images
                im = cv2.imread(file_path, cv2.IMREAD_ANYDEPTH | cv2.IMREAD_COLOR)
                if im is not None:
                    images.append(im)

            if len(images) > 0:
                # Align input images
                alignMTB = cv2.createAlignMTB()
                alignMTB.process(images, images)
                # Absolute exposure times (in seconds) for this set, derived from the
                # base exposure (in microseconds) recorded for this burst
                exposure_times = exposure * float(exposure_ts[temp_idx]) * 0.000001

                # 1 - MergeDebevec
                if HDR_type == 'Debevec':
                    if not os.path.exists(f"{out_dataroot}/vga"):
                        os.makedirs(f"{out_dataroot}/vga")
                        os.makedirs(f"{out_dataroot}/vga/22")
                        os.makedirs(f"{out_dataroot}/vga/44")
                        os.makedirs(f"{out_dataroot}/vga/10")

                    # Note: HDR images have a dynamic range that cannot be properly displayed on
                    # standard monitors without tone mapping, so tonemapped versions are saved too.
                    calibrateDebevec = cv2.createCalibrateDebevec()
                    responseDebevec = calibrateDebevec.process(images, exposure_times)
                    mergeDebevec = cv2.createMergeDebevec()
                    hdrDebevec = mergeDebevec.process(images, exposure_times, responseDebevec)
                    hdr_filename = os.path.join(out_dataroot, f"vga/vga_hdr_{datetime.date.today()}_{subdir}.hdr")
                    cv2.imwrite(hdr_filename, hdrDebevec.copy())

                    # Save tonemapped images at 3 different gamma values
                    tonemapped = cv2.createTonemap(2.2).process(hdrDebevec.copy())
                    tonemapped = np.clip(tonemapped * 255, 0, 255).astype('uint8')
                    hdr_filename = os.path.join(out_dataroot, f"vga/22/vga_hdr_{datetime.date.today()}_{subdir}_22.bmp")
                    cv2.imwrite(hdr_filename, tonemapped)
                    if subdir == '1':
                        cv2.imshow('HDR Image', tonemapped)
                        cv2.waitKey()

                    tonemapped = cv2.createTonemap(4.4).process(hdrDebevec.copy())
                    tonemapped = np.clip(tonemapped * 255, 0, 255).astype('uint8')
                    hdr_filename = os.path.join(out_dataroot, f"vga/44/vga_hdr_{datetime.date.today()}_{subdir}_44.bmp")
                    cv2.imwrite(hdr_filename, tonemapped)

                    tonemapped = cv2.createTonemap(1.0).process(hdrDebevec.copy())
                    tonemapped = np.clip(tonemapped * 255, 0, 255).astype('uint8')
                    hdr_filename = os.path.join(out_dataroot, f"vga/10/vga_hdr_{datetime.date.today()}_{subdir}_10.bmp")
                    cv2.imwrite(hdr_filename, tonemapped)

                # 2 - MergeRobertson has not been tested
                if HDR_type == 'Robertson':
                    mergeRobertson = cv2.createMergeRobertson()
                    hdrRobertson = mergeRobertson.process(images, exposure_times)
                    hdrRobertson = cv2.createTonemap(2.2).process(hdrRobertson.copy())
                    hdrRobertson = np.clip(hdrRobertson * 255, 0, 255).astype('uint8')
                    hdr_filename = os.path.join(out_dataroot, f"hdr_{datetime.date.today()}_{subdir}.bmp")
                    cv2.imwrite(hdr_filename, hdrRobertson)

                # 3 - MergeMertens does not require exposure times
                if HDR_type == 'Mertens':
                    mergeMertens = cv2.createMergeMertens()
                    hdrMertens = mergeMertens.process(images)
                    res_8bit = np.clip(hdrMertens * 255, 0, 255).astype('uint8')
                    hdr_filename = os.path.join(out_dataroot, f"hdr_{datetime.date.today()}_{subdir}.bmp")
                    cv2.imwrite(hdr_filename, res_8bit)

                print(f"Saved HDR image to {hdr_filename}")
        cv2.destroyAllWindows()


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--dataroot', type=str, required=True, help='The root directory of the dataset')
    parser.add_argument('--path_no_fog', type=str, required=True, help='The path to the images without fog')
    args = parser.parse_args()
    make_dir_hdr(args.dataroot)
    create_hdr_images(args.dataroot, out_dataroot=args.path_no_fog, HDR_type='Debevec')
