An unofficial PyTorch implementation of Kim et al., "Representative Color Transform for Image Enhancement", ICCV 2021.
For more information about our implementation, you can also read our blog.
In Kim et al. (2021), a novel image enhancement approach is introduced, namely Representative Color Transforms (RCT), which provides a large capacity for color transformations. The proposed network comprises four components: an encoder, feature fusion, a global RCT module, and a local RCT module, as depicted in Figure 1. First, the encoder extracts high-level context information, which is in turn used to determine representative and transformed (RGB) colors for the input image. Subsequently, an attention mechanism maps each pixel of the input image to the representative colors by computing their similarity. Finally, the representative color transforms are applied using both coarse- and fine-scale features from the feature fusion component, and the enhanced images produced by the global and local RCT modules are combined into the final image.
Figure 1: An overview of RCTNet's architecture.
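To make the attention step concrete, the snippet below is a minimal PyTorch sketch of a global RCT application under assumed tensor shapes; the function name, the shapes, and the scaled dot-product similarity are illustrative choices, not the official implementation. Per-pixel features are compared against the features of N representative colors, and the resulting softmax weights mix the corresponding transformed RGB colors into the output image.

```python
import torch

def global_rct(pixel_feats, rep_feats, rep_colors):
    """Sketch of a (global) representative color transform.

    pixel_feats: (B, C, H, W) per-pixel features from the encoder / feature fusion
    rep_feats:   (B, N, C)    features of the N representative colors
    rep_colors:  (B, N, 3)    transformed RGB values of the representative colors
    """
    B, C, H, W = pixel_feats.shape
    queries = pixel_feats.flatten(2).transpose(1, 2)  # (B, H*W, C)
    # similarity of every pixel to every representative color
    attn = torch.softmax(queries @ rep_feats.transpose(1, 2) / C ** 0.5, dim=-1)  # (B, H*W, N)
    out = attn @ rep_colors                           # (B, H*W, 3)
    return out.transpose(1, 2).reshape(B, 3, H, W)

# toy example: 16 representative colors, 64-dimensional features
enhanced = global_rct(torch.randn(1, 64, 32, 32), torch.randn(1, 16, 64), torch.rand(1, 16, 3))
print(enhanced.shape)  # torch.Size([1, 3, 32, 32])
```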
The LOw-Light (LOL) dataset [Wei et al. (2018)] for image enhancement in low-light scenarios was used for our experiments. It consists of a training partition of 485 low-/normal-light image pairs and a test partition of 15 such pairs. All images have a resolution of 400×600. For training, all images were randomly cropped and rotated by a multiple of 90 degrees.
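A minimal sketch of this paired augmentation is shown below; the 256×256 crop size is our assumption here, and identical crop and rotation parameters are applied to both images of a pair.

```python
import random
import torch
import torchvision.transforms.functional as TF

def augment_pair(low, high, crop=256):
    """Randomly crop and rotate a low-/normal-light pair with identical parameters.

    low, high: (3, H, W) tensors of the same spatial size (400x600 for LoL).
    """
    _, h, w = low.shape
    top, left = random.randint(0, h - crop), random.randint(0, w - crop)
    low = TF.crop(low, top, left, crop, crop)
    high = TF.crop(high, top, left, crop, crop)
    k = random.randint(0, 3)  # rotation by a random multiple of 90 degrees
    return torch.rot90(low, k, dims=(1, 2)), torch.rot90(high, k, dims=(1, 2))
```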
The results of our implementation of RCTNet, in terms of the PSNR and SSIM evaluation metrics, are reported in Table 1, along with the results of competing image enhancement methods and of the official implementation of RCTNet, as reported in Kim et al. (2021). Evidently, our results fall short of those reported for the official implementation on both metrics.
Method | PSNR | SSIM |
---|---|---|
NPE [Wang et al. (2013)] | 16.97 | 0.589 |
LIME [Guo et al. (2016)] | 15.24 | 0.470 |
SRIE [Fu et al. (2016)] | 17.34 | 0.686 |
RRM [Li et al. (2016)] | 17.34 | 0.686 |
SICE [Cai et al. (2018)] | 19.40 | 0.690 |
DRD [Wei et al. (2018)] | 16.77 | 0.559 |
KinD [Zhang et al. (2019)] | 20.87 | 0.802 |
DRBN [Yang et al. (2020)] | 20.13 | **0.830** |
ZeroDCE [Guo et al. (2020)] | 14.86 | 0.559 |
EnlightenGAN [Jiang et al. (2021)] | 15.34 | 0.528 |
RCTNet [Kim et al. (2021)] | 22.67 | 0.788 |
RCTNet (ours)* | 19.96 | 0.768 |
RCTNet + BF [Kim et al. (2021)] | **22.81** | 0.827 |

Table 1: Quantitative comparison on the LoL dataset. The best results are boldfaced.
Interestingly, the results of Table 1 deviate significantly when the augmentations proposed by the authors (random cropping and random rotation by a multiple of 90 degrees) are also applied during evaluation. This finding indicates that the model favours augmented images, since during training we applied the augmentation operations to all input images at every epoch. While the authors refer to the same augmentations, they do not specify how frequently they were applied. The phenomenon becomes more evident in the quantitative results obtained when augmentations were applied to the test images, shown in Table 2.
Evaluation Metric | Mean | Standard Deviation | Max | Min |
---|---|---|---|---|
PSNR | 20.522 | 0.594 | 22.003 | 18.973 |
SSIM | 0.816 | 0.009 | 0.839 | 0.787 |

Table 2: Mean, standard deviation, maximum, and minimum values of PSNR and SSIM over 100 executions with different random seeds, when augmentations are also applied to the test set, using our implementation of RCTNet.
Some image enhancement results of the implemented RCTNet are shown below, compared to the low-light input images and the ground-truth normal-light images. These examples show that RCTNet has successfully learned how to enhance low-light images, achieving results comparable to the ground truth in terms of exposure and color tones. Nevertheless, the produced images are slightly less saturated and noise is more prominent. We conjecture that training the network for more epochs could alleviate some of these limitations. It is also observed that RCTNet fails to extract representative colors that appear only in small regions of the input image (e.g. the green color in the 4th image).
Input | RCTNet | Ground-Truth |
---|---|---|
$ python train.py -h
usage: train.py [-h] --images IMAGES --targets TARGETS [--epochs EPOCHS] [--batch_size BATCH_SIZE] [--lr LR]
[--weight_decay WEIGHT_DECAY] [--config CONFIG] [--checkpoint CHECKPOINT]
[--checkpoint_interval CHECKPOINT_INTERVAL] [--device {cpu,cuda}]
options:
-h, --help show this help message and exit
--images IMAGES Path to the directory of images to be enhanced
--targets TARGETS Path to the directory of groundtruth enhanced images
--epochs EPOCHS Number of epochs
--batch_size BATCH_SIZE
Number of samples per minibatch
--lr LR Initial Learning rate of Adam optimizer
--weight_decay WEIGHT_DECAY
Weight decay of Adam optimizer
--config CONFIG Path to configurations file for the RCTNet model
--checkpoint CHECKPOINT
Path to previous checkpoint
--checkpoint_interval CHECKPOINT_INTERVAL
Interval for saving checkpoints
--device {cpu,cuda} Device to use for training
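For example, assuming the LoL training split has been extracted to ./lol/our485/low (inputs) and ./lol/our485/high (targets), a run could be launched as follows; the paths and hyperparameter values are only illustrative:

$ python train.py --images ./lol/our485/low --targets ./lol/our485/high --epochs 100 --batch_size 8 --device cuda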
- Download the official LoL dataset from here.
- Download the weights for our pre-trained model from here.
- Use the eval.py script.
$ python eval.py -h
usage: eval.py [-h] --images IMAGES --targets TARGETS --save SAVE --checkpoint CHECKPOINT [--config CONFIG]
[--batch_size BATCH_SIZE] [--nseeds NSEEDS] [--device {cpu,cuda}]
options:
-h, --help show this help message and exit
--images IMAGES Path to the directory of images to be enhanced
--targets TARGETS Path to the directory of groundtruth enhanced images
--save SAVE Path to the save plots and log file with metrics
--checkpoint CHECKPOINT
Path to the checkpoint
--config CONFIG Path to configurations file for the RCTNet model
--batch_size BATCH_SIZE
Number of samples per minibatch
--nseeds NSEEDS Number of seeds to run evaluation for, in range [0 .. 1000]
--device {cpu,cuda} Device to use for training
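An illustrative evaluation command, assuming the LoL test split is under ./lol/eval15 and a trained checkpoint was saved as ./checkpoints/rctnet.pth (both paths are placeholders):

$ python eval.py --images ./lol/eval15/low --targets ./lol/eval15/high --save ./results --checkpoint ./checkpoints/rctnet.pth --nseeds 100 --device cuda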
- Download the weights for our pre-trained model from here.
- Use the enhance.py script.
$ python enhance.py -h
usage: enhance.py [-h] --image IMAGE --checkpoint CHECKPOINT [--config CONFIG] [--batch_size BATCH_SIZE]
[--device {cpu,cuda}]
options:
-h, --help show this help message and exit
--image IMAGE Path to an image or a directory of images to be enhanced
--checkpoint CHECKPOINT
Path to previous checkpoint
--config CONFIG Path to configurations file for the RCTNet model
--batch_size BATCH_SIZE
Number of samples per minibatch
--device {cpu,cuda} Device to use for training
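An illustrative invocation with placeholder paths for the input image and the checkpoint:

$ python enhance.py --image ./samples/low_00001.png --checkpoint ./checkpoints/rctnet.pth --device cuda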
@inproceedings{9710400,
  author={Kim, Hanul and Choi, Su-Min and Kim, Chang-Su and Koh, Yeong Jun},
  booktitle={2021 IEEE/CVF International Conference on Computer Vision (ICCV)},
  title={Representative Color Transform for Image Enhancement},
  year={2021},
  pages={4439-4448},
  doi={10.1109/ICCV48922.2021.00442}
}
@inproceedings{Chen2018Retinex,
  title={Deep Retinex Decomposition for Low-Light Enhancement},
  author={Wei, Chen and Wang, Wenjing and Yang, Wenhan and Liu, Jiaying},
  booktitle={British Machine Vision Conference},
  year={2018}
}