The FogEye dataset is available here: OneDrive
Overview of the FogEye project. a): A diagram summarizing the work done in this project. b): Example results obtained by applying the pix2pix framework to the FogEye dataset. Our approach works for a range of fog densities.
logo image attributions: U of U | DAAD
This repository documents a research project carried out at the Laboratory for Optical Nanotechnologies at the University of Utah under the supervision of Prof. Rajesh Menon in Spring (January-April) 2024. It was funded by Prof. Rajesh Menon.
real image | foggy image | reconstructed image |
---|---|---|
This project has four objectives:
- add HDR imaging to the existing cameras
- collect a dataset of paired images, extending the existing dataset
- apply the pix2pix model developed at the University of California, Berkeley to the translation problem fog → no fog
- compare the goodness of fit of different image-similarity metrics
Potential applications include:
- Autonomous driving
- Search & rescue (wildfires, home fires, etc.)
- Military
The project was initially carried out over the course of three months, from July to September 2023. Continuing contributions have occurred from January to April 2024.
The device had to be able to:
- accommodate two cameras
- isolate the cameras from each other
- provide a fog chamber for one of the cameras
- trigger both cameras at the same time
The shift in perspective arising from the distance between the two cameras is ignored; the further away the photographed scenes are, the smaller its effect on the resulting images.
The two identical cameras used for this project had to be:
- programmable
- able to interface with other devices
- small & lightweight
- low power
Therefore, we chose the OpenMV H7 camera for the task. The OpenMV IDE makes it easy to program the camera using Python. The cameras are able to receive input from their I/O pins as well as give user feedback using their LEDs.
** Note: an OpenMV IDE license must be purchased for $15 **
To improve the performance of our machine learning model, we implemented HDR processing for our training image datasets. By merging multiple exposures of the same scene, HDR provides greater contrast and detail, reduced image artifacts, and a wider range of luminance, improving the information provided to the algorithm. The HDR is implemented in two stages: four images of the same scene are recorded by the camera at different exposures centered around the auto-exposure value, and those images are then merged in post-processing on a separate computer. Potentially, both steps of the process could be handled on-device by moving to a Raspberry Pi.
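The merging step can be illustrated with a small, self-contained sketch. This is not the pipeline used in this repository (which relies on OpenCV's Debevec calibration and merge); it is a simplified, single-scale, Mertens-style exposure fusion in plain NumPy, with all names and parameter values chosen purely for illustration:

```python
import numpy as np

def fuse_exposures(images, sigma=0.2):
    """Merge a bracketed exposure stack (float arrays in [0, 1]) into one image
    by weighting each pixel by how well-exposed it is (i.e., how close to 0.5).
    A simplified, single-scale take on Mertens-style exposure fusion."""
    stack = np.stack(images).astype(np.float64)      # shape (N, H, W)
    weights = np.exp(-((stack - 0.5) ** 2) / (2 * sigma ** 2))
    weights /= weights.sum(axis=0, keepdims=True)    # normalize over the stack
    return (weights * stack).sum(axis=0)             # convex combination per pixel

# Four synthetic "exposures" of the same 2x2 scene (gains mimic a bracket)
scene = np.array([[0.2, 0.4], [0.6, 0.8]])
exposures = [np.clip(scene * g, 0.0, 1.0) for g in (0.5, 0.75, 1.5, 2.0)]
fused = fuse_exposures(exposures)
```

Because the output is a per-pixel convex combination of the inputs, the fused image stays in the valid [0, 1] range while favoring the best-exposed sample at each pixel.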
In order to get exactly paired images from both cameras captured at the same time, it is necessary to introduce a common trigger. Prior stewards of this project used a lightweight Arduino board for this task. Any Arduino board should be capable of sending this trigger, but an Adafruit Feather 32u4 Radio that was available from an earlier project was used. The board is connected to both cameras and sends a trigger signal to both at the same time. The cameras are programmed to capture an image when they receive the trigger signal (see `read_external_trigger`).
For more on the Arduino setup, go here.
Adafruit Feather 32u4 Radio board

Currently, the camera is set up for HDR 4-exposure bursts (`push_button_trigger`) and the wiring has been changed over to a push button. The push button switch is run between P0 and ground.
Camera setup (back) | Gimbal model used in this project

In order to stabilize the images while walking and ensure they are approximately level, a gimbal was used to hold the entire device. The gimbal used for this project was the Hohem iSteady Q, a lightweight single-axis gimbal able to hold a smartphone.
** For the HDR imaging, the gimbal was deemed insufficient; a tripod was used instead **
In order to be able to capture approximately the same image, the cameras had to be mounted as close together as possible. Simultaneously, the case must be able to hold the fog surrounding one camera while isolating the other camera from the influence of the fog, keeping all other conditions the same. Therefore, both cameras are arranged side by side, inside separate chambers.
The case was designed in Fusion 360. Some 3D printed files were printed using a Creality Ender 3 Pro, some on an Ultimaker S3. The front plate was lasercut on a CO2 laser cutter.
CAD design of the gimbal mount | CAD design of the gimbal bridge | Front view of entire CAD design | Rear view of entire CAD design

** Various portable foggers are available, but ordering and shipping take time. **
The following components are required for the device:
*Initial purchase
- 2x OpenMV H7 camera
- 1x Adafruit Feather board (or any other microcontroller capable of this task)
- 1x Hohem iSteady Q gimbal
- 2x Toggle switch (any latching switch that can be used to trigger the cameras)
- 1x Breadboard 30x70mm
- 2x Rubber stoppers
- External USB Power Supply
*Secondary purchase
- 1x Push Button Switch
- Various connecting wires
- USB power adapter
The following case parts were 3D printed or laser cut:
- Back box
- Front plate
- Front camera screw terminal
- Gimbal mount
- Gimbal bridge
- Hinge
- Lock body
- Lock catch
- Lock receptor
- Maintenance door with hinge
- Maintenance door brace
- Rear camera standoff
- Top plate
- Wire restraint
Several parts of the CAD model were adopted from different sources. They are attributed in the following:
Part | Source | License |
---|---|---|
OpenMV camera | GrabCAD | unknown |
Adafruit Feather board | Adafruit | MIT |
Prototyping board | GrabCAD | unknown |
Toggle switch | GrabCAD | unknown |
DIN912 M3 25mm screw | 3Dfindit | unknown |
The models were trained either on a personal laptop equipped with 64 GB of RAM, on a lab computer equipped with a dedicated GPU (NVIDIA GeForce GTX 970) and 64 GB of RAM, or on the University of Utah's Center for High Performance Computing (CHPC) cluster. All models were trained for the pix2pix default of 200 epochs. The training time increased with the size of the dataset; for the final model, it was around 20 hours.
descriptions on how to get up and running
click to expand
Clone the repository using `git`:

git clone https://github.com/apoll2000/FogEye.git
Navigate into the repository:
cd FogEye
Next, an appropriate Python environment needs to be created. All code was run on Python 3.9.7. For creating the environment, either `conda` or `pyenv virtualenv` can be used.
The environment can be created using `conda` with:
conda create --name FogEye python=3.9.7
Or using `pyenv virtualenv` with:
pyenv virtualenv 3.9.7 FogEye
Then activate the environment with:
conda activate FogEye
Or:
pyenv activate FogEye
Using `pip`, the required packages can then be installed (for `conda` environments, execute `conda install pip` first to install pip). The packages are listed in `requirements.txt` and can be installed with:
pip install -r requirements.txt
In case you want to install them manually, the packages include:
- numpy
- torch
- opencv-python
- matplotlib
- ...
It is important that you specify the right `torch` version if you would like to use your CUDA-enabled GPU to train the model, which will drastically reduce training time. See the PyTorch website for more information.
The dataset is currently being hosted here: TUBCloud. Depending on the further development of the project, this might not be the final storing location.
Place the `FogEye_images` folder into the `datasets` folder of the repository:
-- datasets
|-- FogEye_images
|-- 2023-08-03-04
|-- A
|-- 01-04_08_23__1.bmp
|-- 01-04_08_23__2.bmp
|-- ...
|-- B
|-- 01-04_08_23__1.bmp
|-- 01-04_08_23__2.bmp
|-- ...
|-- ...
The dataset needs to be prepared for training. This includes transforming the folder structure into one compatible with the pix2pix framework and splitting the dataset into training, validation and testing sets. It can be performed using the following command:
python preprocess_FogEye_dataset.py --dataroot path/to/dataset
The model training can be started using the following command:
python train.py --dataroot path/to/dataset --name name_of_model --model pix2pix --direction BtoA --gpu_ids 0
The trained model can be tested using the following command:

python test.py --dataroot path/to/dataset --direction BtoA --model pix2pix --name name_of_model
Ample information on the training and testing process and their parameters can be found on the pix2pix GitHub page.
This GitHub page includes several helper scripts to perform different actions like hyperparameter tuning or epoch visualization.
These are:

Preprocessing:
- preprocess_FogEye_dataset.py

Hyperparameter tuning:
- hyperparameter_dropoutRate.py
- hyperparameter_GAN.py
- hyperparameter_init_type.py
- hyperparameter_lr_policy.py
- hyperparameter_n_layers_D.py
- hyperparameter_netD.py
- hyperparameter_netG.py
- hyperparameter_ngf_ndf.py
- hyperparameter_normalization.py
- hyperparameter_Res9AndMore.py
- hyperparameter_supertraining.py

Visualization:
- plot_model_results.py
- evaluate_model_group.py
At the beginning of the project, we experimented with synthetic datasets in combination with the pix2pix model. The datasets used were based on the Cityscapes dataset as well as on images derived from the CARLA simulator. The fog simulations generally work either by directly using a depth map that is available for each particular image, or by using the left and right images to calculate the depths in the images, thus reconstructing this depth map. This depth map helps in estimating how strongly the fog affects different parts of the image.
The datasets in the following are semi-synthetic, meaning that they work with real images, to which the fog has been added synthetically. A disadvantage of this method is that the depth map is never perfect, which can lead to artifacts in the fogged images.
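The underlying idea can be sketched with the standard atmospheric scattering model, I_fog = I·t + A·(1 − t), with transmission t = exp(−β·d): the more distant a pixel, the more it is replaced by the airlight A. The function name and the values of β and A below are illustrative choices, not parameters taken from the cited works:

```python
import numpy as np

def add_fog(image, depth, beta=0.05, airlight=0.9):
    """Synthetically fog an image using the atmospheric scattering model:
    I_fog = I * t + A * (1 - t), with transmission t = exp(-beta * depth).
    `image` is a float array in [0, 1]; `depth` holds per-pixel distances.
    `beta` (fog density) and `airlight` (A) are illustrative values."""
    t = np.exp(-beta * depth)
    if image.ndim == 3:                    # broadcast over color channels
        t = t[..., np.newaxis]
    return image * t + airlight * (1.0 - t)

img = np.full((4, 4), 0.3)                          # uniform gray scene
depth = np.linspace(1, 100, 16).reshape(4, 4)       # distant pixels bottom-right
foggy = add_fog(img, depth)                         # far pixels approach the airlight
```

This also makes the cited artifact problem concrete: any error in `depth` translates directly into a wrong transmission `t`, and thus visible seams in the fogged output.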
In cooperation with the researchers Georg Volk and Jörg Gamerdinger from the University of Tübingen, Germany, we trained a model on synthetic data generated for their paper "Simulating Photo-realistic Snow and Fog on Existing Images for Enhanced CNN Training and Evaluation".
Another dataset taken into consideration was the Foggy Cityscapes dataset from the paper "Semantic Foggy Scene Understanding with Synthetic Data" by Sakaridis et al. The dataset was created by the Computer Vision Lab of ETH Zürich, Switzerland.
The following dataset was created entirely synthetically. The original images were rendered using a driving simulator, which generated the matching perfect depth maps as well. This way, the fogged images do not show any artifacts.
This dataset was created by the researchers Georg Volk and Jörg Gamerdinger from the University of Tübingen, Germany, using the same technique from the paper "Simulating Photo-realistic Snow and Fog on Existing Images for Enhanced CNN Training and Evaluation". It is based on the CARLA simulator.
description & details of the collected dataset
Originally, approximately 10,000 images of QVGA (240x320 pixels) quality were collected in the RGB565 (5 bits for red and blue, 6 bits for green) format. For the implementation of HDR, 244 images were collected in the original format and combined into 61 HDR images, and 2192 images were collected in VGA (480x640 pixels) quality in the raw sensor data format.

- ~10.3k QVGA (240x320) RGB565 images have been collected
- 2192 raw data images were collected
- 2436 images were collected for compression into 609 HDR images
While HDR processing reduced the total number of images available for training, the collection of raw data offset this reduction through the use of differing levels of gamma correction: by varying the gamma, different details are revealed or accentuated.
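As a sketch of this idea (the function name and gamma values are illustrative, not the exact values used for the dataset):

```python
import numpy as np

def apply_gamma(image, gamma):
    """Gamma-correct a float image in [0, 1]: out = in ** (1 / gamma).
    gamma > 1 lifts shadows and reveals dark detail; gamma < 1 darkens
    the image and emphasizes highlight detail."""
    return np.power(np.clip(image, 0.0, 1.0), 1.0 / gamma)

img = np.array([0.1, 0.5, 0.9])          # dark, mid, bright pixels
shadow_detail = apply_gamma(img, 2.2)    # dark pixels are lifted
highlight_detail = apply_gamma(img, 0.45)  # dark pixels are crushed further
```

Each gamma-corrected copy of a raw capture thus emphasizes a different luminance band, effectively multiplying the training variety obtained from a single exposure.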
*ML results on dataset*

An initial limitation of the first iteration of the ML model was the homogeneity of the collected data. While the initial dataset was larger than the one collected by the second group of participants on this project, it was collected in a small geographic area and under fairly uniform Utah summer conditions. The second dataset expands the diversity of setting and weather significantly.
A second limitation was the cameras themselves. Only recently have the cameras been set to capture data in the raw format, and only at VGA resolution; all other resolution levels are locked into the RGB565 format. While training at lower resolution for proof of concept is advantageous due to the lower computational time, flexibility in selecting resolution and format is desirable. Additionally, the ability to do the HDR processing in conjunction with image capture would have sped up the process. A platform like the Raspberry Pi 5, with its capacity to simultaneously operate two cameras and then process images in the desired format, would have simplified the operation, at about the same price point as the OpenMV cameras.
The code is licensed under the BSD 3-Clause License, available under CODE_LICENSE.
The parts of the code that were adopted from the pix2pix project remain under the pix2pix BSD license.
The dataset is licensed under the Creative Commons Attribution 4.0 International License, available under DATASET_LICENSE.
The hardware is licensed under the CERN Open Hardware Licence Version 2 - Weakly Reciprocal (CERN-OHL-W v2), available under HARDWARE_LICENSE.
If you use the dataset or any of the code in this repository created by us, please cite the following paper:
@misc{Welch2024FogEye,
title={FogEye -- Computational DeFogging via Image-to-Image Translation on a real-world Dataset},
author={Chandler Welch},
year={2024}
}
- [1]: “How to Center and Scale Data Using Ployfit.” Stack Overflow, stackoverflow.com/questions/40569675/how-to-center-and-scale-data-using-ployfit. Accessed 28 Mar. 2024.
- [2]: “What Is the Difference between LAR, and the Bisquare Remain Robust ...” www.mathworks.com, www.mathworks.com/matlabcentral/answers/183690-what-is-the-difference-between-lar-and-the-bisquare-remain-robust-in-regression-curve-fitting-tool. Accessed 28 Mar. 2024.
- [3]: Wikipedia Contributors. “Coefficient of Determination.” Wikipedia, Wikimedia Foundation, 27 Feb. 2019, en.wikipedia.org/wiki/Coefficient_of_determination.
We conducted a study on how quickly the fog decays in order to better estimate how often it needs to be replenished. This was done by filling the fog chamber, letting the fog decay, and filming the entire decay using both cameras. The resulting video of the fogged camera was analyzed by calculating the Variance of the Laplacian of each frame as a metric for the intensity of the fog. After about 5 minutes, the fog intensity becomes quite low.
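The Variance of the Laplacian can be computed in a few lines. This is a hedged sketch using a plain NumPy stencil for the 4-neighbor Laplacian kernel rather than an OpenCV call; fog (like blur) suppresses high-frequency detail, so foggier frames score lower:

```python
import numpy as np

def variance_of_laplacian(gray):
    """Sharpness/fog metric: apply the 4-neighbor Laplacian kernel
    [[0,1,0],[1,-4,1],[0,1,0]] to the interior pixels and return the
    variance of the response. Low values indicate little fine detail."""
    g = gray.astype(np.float64)
    lap = (-4 * g[1:-1, 1:-1]
           + g[:-2, 1:-1] + g[2:, 1:-1]     # neighbors above and below
           + g[1:-1, :-2] + g[1:-1, 2:])    # neighbors left and right
    return lap.var()

sharp = np.zeros((8, 8))
sharp[:, 4:] = 1.0                                    # hard edge: strong detail
hazy = np.linspace(0, 1, 8)[np.newaxis, :].repeat(8, axis=0)  # smooth ramp
```

On these toy frames the hard edge yields a clearly higher score than the smooth ramp, mirroring how the metric separates clear frames from fogged ones.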
Fog decay measurement over time

We also conducted a study on which metric performed best against the Variance of the Laplacian, using the coefficient of determination, R^2, to determine goodness of fit. R^2 describes the proportion of variance in the dependent variable, the evaluation metric (y axis), that can be explained by the independent variable, the Variance of the Laplacian (x axis).
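For reference, R^2 for a fitted curve can be computed as follows. This is a minimal sketch with made-up data, not the project's actual fitting code (which uses robust fitting with centering and scaling):

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    ss_res = np.sum((y_true - y_pred) ** 2)              # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)       # total sum of squares
    return 1.0 - ss_res / ss_tot

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])       # roughly y = 2x, illustrative data
coeffs = np.polyfit(x, y, 1)             # ordinary least-squares line fit
r2 = r_squared(y, np.polyval(coeffs, x))
```

An R^2 close to 1 means the metric tracks the Variance of the Laplacian closely; values near 0 mean the fit explains little of the metric's variance.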
Robust-fit settings used for each metric: Pearson vs. Laplacian: None, Center and Scale. MSE vs. Laplacian: None, Center and Scale. SSIM vs. Laplacian: Bisquare, Center and Scale. CW-SSIM vs. Laplacian: Bisquare, Center and Scale. NCC vs. Laplacian: Bisquare, Center and Scale.

The following code implements the High Dynamic Range (HDR) processing.
'''
Requirements for this code:
pip install opencv-python
pip install numpy
pip install rawpy
pip install imageio
'''
import argparse
import datetime
import os
import shutil

import cv2
import imageio
import numpy as np
import rawpy

#dataroot = './pre_hdr/3_31_24_fog'
#output_dataroot = './post_hdr/hdr_images/fog'

def make_dir_hdr(dataroot):
    '''Converts .raw files to .bmp and makes an individual directory for each
    set of 4 images to be processed into HDR. This is a helper for organization;
    it also aids debugging and keeping track of the images, as you can easily
    see which images are in which set and adjust offsets accordingly.'''
    num_sets = len(os.listdir(dataroot)) // 4
    for i in range(num_sets):
        os.makedirs(f'{dataroot}/{i}', exist_ok=True)

    for file in os.listdir(dataroot):
        # If files are in raw format, they are converted to bmp
        if file.endswith('.raw'):
            raw = rawpy.imread(f'{dataroot}/{file}')
            stem = file[:-len('.raw')]
            set_idx = (int(stem) - 1) // 4
            rgb = raw.postprocess()
            imageio.imsave(f'{dataroot}/{set_idx}/{stem}.bmp', rgb)
        if file.endswith('.bmp'):  # for files that are already in bmp format
            stem = file[:-len('.bmp')]
            set_idx = (int(stem) - 1) // 4
            shutil.move(f'{dataroot}/{file}', f'{dataroot}/{set_idx}/{file}')
    print(datetime.date.today())

#dataroot = "./pre_hdr/3_31_24_fog"
#out_dataroot = "./post_hdr/hdr_images"

def create_hdr_images(dataroot, out_dataroot='./post_hdr/hdr_images', HDR_type='Debevec'):
    '''Creates HDR images from the image sets in the dataroot folder.'''
    exposure = np.array([0.5, 0.75, 1.5, 2], dtype=np.float32)
    txt_files = [f for f in os.listdir(dataroot) if f.endswith('.txt')]
    if len(txt_files) != 1:
        raise AssertionError(
            "\nThere should be exactly one .txt file in the dataroot folder."
            "\nThis .txt file should contain the exposure times for the images in the folder."
            "\nIf you don't have the exposure times, consider changing the HDR_type to 'Mertens' or 'Robertson'.\n")
    with open(f'{dataroot}/{txt_files[0]}', 'r') as f:
        exposure_ts = f.readlines()
    os.makedirs(out_dataroot, exist_ok=True)
    # Each image set to be processed into HDR is assumed to be in a separate subfolder
    for subdir in os.listdir(dataroot):
        subdir_path = os.path.join(dataroot, subdir)
        if not os.path.isdir(subdir_path):
            continue
        temp_idx = int(os.path.basename(subdir_path))
        images = []
        for filename in sorted(os.listdir(subdir_path), key=lambda x: int(x.split('.')[0])):
            file_path = os.path.join(subdir_path, filename)
            # Adjust the flags if your images are not standard 8-bit or 16-bit images
            im = cv2.imread(file_path, cv2.IMREAD_ANYDEPTH | cv2.IMREAD_COLOR)
            if im is not None:
                images.append(im)
        if len(images) == 0:
            continue
        # Align input images
        alignMTB = cv2.createAlignMTB()
        alignMTB.process(images, images)
        # Per-image exposure times in seconds (the .txt values are in microseconds)
        exposure_times = exposure * float(exposure_ts[temp_idx]) * 0.000001
        # 1 - MergeDebevec
        if HDR_type == 'Debevec':
            for gamma_dir in ('vga', 'vga/22', 'vga/44', 'vga/10'):
                os.makedirs(f'{out_dataroot}/{gamma_dir}', exist_ok=True)
            # Note: HDR images have a high dynamic range that cannot be properly displayed
            # on standard monitors without tone mapping, so tonemapped versions are saved too.
            calibrateDebevec = cv2.createCalibrateDebevec()
            responseDebevec = calibrateDebevec.process(images, exposure_times)
            mergeDebevec = cv2.createMergeDebevec()
            hdrDebevec = mergeDebevec.process(images, exposure_times, responseDebevec)
            today = datetime.date.today()
            hdr_filename = os.path.join(out_dataroot, f'vga/vga_hdr_{today}_{subdir}.hdr')
            cv2.imwrite(hdr_filename, hdrDebevec)
            # Save tonemapped images at 3 different gamma values
            for gamma, tag in ((2.2, '22'), (4.4, '44'), (1.0, '10')):
                tonemapped = cv2.createTonemap(gamma).process(hdrDebevec.copy())
                tonemapped = np.clip(tonemapped * 255, 0, 255).astype('uint8')
                hdr_filename = os.path.join(out_dataroot, f'vga/{tag}/vga_hdr_{today}_{subdir}_{tag}.bmp')
                cv2.imwrite(hdr_filename, tonemapped)
        # 2 - MergeRobertson has not been tested
        if HDR_type == 'Robertson':
            mergeRobertson = cv2.createMergeRobertson()
            hdrRobertson = mergeRobertson.process(images, exposure_times)
            hdrRobertson = cv2.createTonemap(2.2).process(hdrRobertson.copy())
            hdr_filename = os.path.join(out_dataroot, f'hdr_{datetime.date.today()}_{subdir}.bmp')
            cv2.imwrite(hdr_filename, hdrRobertson)
        # 3 - MergeMertens does not require exposure times
        if HDR_type == 'Mertens':
            mergeMertens = cv2.createMergeMertens()
            hdrMertens = mergeMertens.process(images)
            res_8bit = np.clip(hdrMertens * 255, 0, 255).astype('uint8')
            hdr_filename = os.path.join(out_dataroot, f'hdr_{datetime.date.today()}_{subdir}.bmp')
            cv2.imwrite(hdr_filename, res_8bit)
        print(f'Saved HDR image to {hdr_filename}')

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--dataroot', type=str, required=True, help='The root directory of the dataset')
    parser.add_argument('--path_no_fog', type=str, required=True, help='The path to the images without fog')
    args = parser.parse_args()
    make_dir_hdr(args.dataroot)
    create_hdr_images(args.dataroot, out_dataroot=args.path_no_fog, HDR_type='Debevec')