PyTorch implementation of our WACV 2023 paper, including the pre-trained model used to generate the results presented in the publication.
If you use this work, please cite:
```
@InProceedings{Vojir_2023_WACV,
    author    = {Voj{\'\i}\v{r}, Tom\'a\v{s} and Matas, Ji\v{r}{\'\i}},
    title     = {Image-Consistent Detection of Road Anomalies As Unpredictable Patches},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2023},
    pages     = {5491-5500}
}
```
- 2024.05.29 💥 New model available! It utilizes a DINOv2 backbone for feature extraction and semantic segmentation. See the Models section.
The method consists of three main components:
- A semantic segmentation network. Currently, a DeepLab v3 architecture is used (code adapted from jfzhang95, distributed under the MIT licence). The backbone is ResNet-101 pre-trained on ImageNet. The network was trained on the CityScapes dataset and its weights are fixed.
- An inpainting network adapted from csqiangwen and merged into a single-file module.
- An anomaly estimation network. It is a standalone module that uses the features extracted from the ResNet-101 backbone and the output of the segmentation network before softmax normalization.
The configuration of the network architecture, as published in WACV 2023, is defined in the configuration file `parameters.yaml`. The specific model is loaded dynamically based on its string name (see the `MODEL.NET` variable). Set the model to `DeepLabEmbeddingGlobalOnlySegmFullRes` for a faster version without the inpainting module (see the paper for details). The network implementation is located in `./net/models.py`.
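The dynamic loading can be pictured roughly as follows (a minimal sketch, assuming a yacs-style `cfg` object and that the model constructor takes the configuration; the repository's actual factory code may differ):

```python
# Minimal illustration of the dynamic lookup (not the repository's actual
# factory code): the string in MODEL.NET is resolved to a class defined in
# ./net/models.py and instantiated with the configuration object.
import net.models as models

def build_model(cfg):
    model_cls = getattr(models, cfg.MODEL.NET)   # e.g. "DeepLabEmbeddingGlobalOnlySegmFullRes"
    return model_cls(cfg)
```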
DISCLAIMER: This is research code. There is a lot of unused and cluttered code. By running this code you agree not to blame the author(s) if it breaks your stuff. This code is provided AS IS without warranty of any kind.
All training configuration is done through the configuration file `./config/defauls.py` or through a saved configuration of a particular network. To re-create the training of the proposed architecture, use `parameters.yaml` as the configuration file. Change the training/validation/testing data sources if needed.
The training and validation datasets are set via the `DATASET.TRAIN` and `DATASET.VAL` variables. They are string variables and currently can be chosen from `['cityscapes_2class', 'citybdd100k_2class', 'bdd100k_2class']` for training and `['cityscapes_2class', 'citybdd100k_2class', 'bdd100k_2class', 'LaF']` for validation.
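As an illustration, the published configuration could also be loaded and overridden programmatically; the sketch below assumes a yacs-style configuration and a `get_cfg_defaults()` helper in `./config/defauls.py`, both of which are assumptions about the repository layout:

```python
# Hypothetical sketch; the get_cfg_defaults() helper name is an assumption.
from config.defauls import get_cfg_defaults

cfg = get_cfg_defaults()
cfg.merge_from_file("parameters.yaml")      # start from the published configuration
cfg.DATASET.TRAIN = "citybdd100k_2class"    # one of the training options listed above
cfg.DATASET.VAL = "LaF"                     # 'LaF' is additionally allowed for validation
```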
The training can be run on a specific GPU as follows (using the default configuration from `./config/defauls.py`):
```
CUDA_VISIBLE_DEVICES=<GPU_ID> python3 train.py
```
or using custom settings, e.g. saved from a custom experiment:
```
CUDA_VISIBLE_DEVICES=<GPU_ID> python3 train.py --exp_cfg="./path/to/config_file.yaml"
```
Currently only three labels are used: label 0 for anomaly, label 1 for road and 255 for void. The dataloaders need to provide the GT segmentation using these labels only; see e.g. `./dataloaders/datasets/cityscapes_2class.py`.
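For illustration, remapping a CityScapes-style trainId mask to this convention could look as follows (a minimal sketch; the actual mapping in `cityscapes_2class.py` may differ):

```python
import numpy as np

def to_two_class(train_id_mask):
    """Remap a CityScapes trainId mask to the 0/1/255 convention used here."""
    out = np.zeros_like(train_id_mask)   # everything else -> 0 (anomaly / not-road)
    out[train_id_mask == 0] = 1          # CityScapes trainId 0 is 'road' -> 1
    out[train_id_mask == 255] = 255      # ignore/void stays 255
    return out
```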
The dataset loaders are located in `./dataloaders/datasets/`. Each dataset has its own dataloader class.
The paths to the dataset data are stored in `./mypath.py`, where each dataset is identified by a string that is then used in the configuration file to set `DATASET.TRAIN`, and in `./dataloaders/__init__.py`, where the datasets are instantiated.
To add a new dataset (see the sketch below):
- the path to its data needs to be added to `./mypath.py`,
- a dataloader class needs to be implemented and stored in `./dataloaders/datasets/`,
- its instantiation needs to be defined in `./dataloaders/__init__.py`,
- `DATASET.TRAIN` or `DATASET.VAL` needs to be set to the new dataset name in the configuration file.
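A minimal sketch of such a dataloader class is given below; the directory layout, file naming and returned sample format are assumptions, so use the existing loaders in `./dataloaders/datasets/` as the authoritative template:

```python
import glob, os
import numpy as np
from PIL import Image
from torch.utils.data import Dataset

class MyDatasetSegmentation(Dataset):
    """Hypothetical loader; returns images in [0,1] and 0/1/255 labels."""
    NUM_CLASSES = 2

    def __init__(self, root, split="train"):
        self.images = sorted(glob.glob(os.path.join(root, split, "images", "*.png")))
        self.labels = sorted(glob.glob(os.path.join(root, split, "labels", "*.png")))

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img = np.asarray(Image.open(self.images[idx]).convert("RGB"), dtype=np.float32) / 255.0
        lbl = np.asarray(Image.open(self.labels[idx]), dtype=np.int64)  # already 0/1/255
        return {"image": img.transpose(2, 0, 1), "label": lbl}
```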
For testing, the `code/ReconAnom.py` script is used (see the file for a minimal example). The `exp_dir` parameter needs to be set to point to a root directory where the `code` directory and `parameters.yaml` are located. The `evaluate` function expects a tensor of size [1, C, H, W] (i.e. batch size of 1) where the image is normalized to the [0, 1] range.
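A rough usage sketch is shown below; the entry-point class and constructor arguments are assumptions based on the description above, so consult `code/ReconAnom.py` for the author's actual minimal example:

```python
import numpy as np
import torch
from PIL import Image

# Hypothetical entry point; the real class/function in code/ReconAnom.py may differ.
from ReconAnom import ReconAnom

model = ReconAnom(exp_dir="/path/to/repo_root")   # root containing code/ and parameters.yaml

img = np.asarray(Image.open("test_image.png").convert("RGB"), dtype=np.float32) / 255.0
tensor = torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0)   # [1, C, H, W], values in [0, 1]

anomaly_map = model.evaluate(tensor)   # per-pixel anomaly scores
```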
There are three pre-trained models:
- The semantic segmentation model (fixed, does not need to be modified). The path to the checkpoint `checkpoint-segmentation.pth` needs to be set in the configuration file `MODEL.RECONSTRUCTION.SEGM_MODEL` variable. Download from gdrive_segmentation_model.
- The inpainting model (fixed, does not need to be modified). The path to the checkpoint `deepfillv2_WGAN_G_epoch40_batchsize4.pth` needs to be set in the configuration file `MODEL.INPAINT_WEIGHTS_FILE` variable. Download from gdrive_inpaint_model.
- The anomaly detection network model (either trained by you or pre-trained). Put it into `<GITREPO>/code/checkpoints/checkpoint-best.pth` or set the `MODEL.RESUME_CHECKPOINT` variable in `parameters.yaml` to the absolute path of the checkpoint file. The model used in the publication was trained using the parameters provided in the `parameters.yaml` configuration file; it used the CityScapes+BDD100k datasets for training and the LaF training data for validation. Download from gdrive_dacup_model (or the model without the inpainting part from gdrive_dacup_w/o_inpaint_model).
A new anomaly detection model with a DINOv2 backbone is available here, with the semantic segmentation part here; the corresponding configuration is provided in `parameters_dinov2.yaml`. For setup, follow the instructions from the third point above. NOTE that the `ReconAnom.py` file used for evaluation expects the parameters in the file `parameters.yaml`, so if you want to use the DINOv2 model either modify the `ReconAnom.py` file (around line 31, see the sketch below) or rename `parameters_dinov2.yaml` to `parameters.yaml`.
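If you prefer editing the script, the change could look roughly like this (the variable names and the exact location are assumptions, so adapt to what you find around line 31 of `ReconAnom.py`):

```python
import os

# Hypothetical edit inside code/ReconAnom.py: load the DINOv2 configuration
# instead of the default file name (exact variable names may differ).
exp_dir = "/path/to/repo_root"                               # set as described above
cfg_file = os.path.join(exp_dir, "parameters_dinov2.yaml")   # was "parameters.yaml"
```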
The performance is evaluated on the road region using two pixel-wise metrics: Average Precision (AP), i.e. the area under the precision-recall curve, and False Positive Rate at 95% True Positive Rate (FPR@95), i.e. the false positive rate at the operating point where the true positive rate is 95%. In the table below the results are shown as AP / FPR@95 for each dataset. Note the significant improvement on the "harder" datasets (RO, OT).
|  | LaF | LaF-train | FS | RA | RO | OT |
|---|---|---|---|---|---|---|
| JSR-Net (ICCV 2021) | 79.4 / 4.3 | 87.8 / 1.7 | 79.3 / 4.7 | 93.4 / 8.9 | 79.8 / 0.9 | 28.1 / 28.7 |
| DaCUP w/o inpaint (WACV 2023) | 85.1 / 2.1 | --- | 88.8 / 1.7 | 94.3 / 6.8 | 90.3 / 0.17 | --- |
| DaCUP (WACV 2023) | 84.5 / 2.6 | --- | 89.7 / 1.4 | 96.2 / 5.5 | 94.3 / 0.08 | 81.5 / 1.1 |
| DINOv2 DaCUP | --- | --- | 93.3 / 0.6 | 98.6 / 2.4 | 90.7 / 0.4 | 83.6 / 1.4 |
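For reference, the two metrics can be computed from flattened per-pixel scores roughly as follows (an illustrative sketch using scikit-learn, not the evaluation code used to produce the table above):

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_curve

# Illustration only: y_true is 1 for anomaly pixels, scores are anomaly scores.
def ap_and_fpr95(y_true, scores):
    ap = average_precision_score(y_true, scores)
    fpr, tpr, _ = roc_curve(y_true, scores)
    idx = np.searchsorted(tpr, 0.95)          # first threshold index with TPR >= 0.95
    return ap, fpr[min(idx, len(fpr) - 1)]
```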
Datasets used for evaluation:
- [0] LaF - Lost and Found dataset Testing split
- [0] LaF-train - Lost and Found dataset Training split (this was used as a validation dataset during training)
- [1] RA - RoadAnomaly
- [2] RO - RoadObstacles
- [3] OT - Obstacle Track
- [4] FS - FishyScapes dataset (a subset of Lost and Found, kept for backward comparability of results)
[0] P. Pinggera, S. Ramos, S. Gehrig, U. Franke, C. Rother, and R. Mester. Lost and Found: detecting small road hazards for self-driving vehicles. In International Conference on Intelligent Robots and Systems (IROS), 2016.
[1] K. Lis, K. Nakka, P. Fua, and M. Salzmann. Detecting the Unexpected via Image Resynthesis. In Int. Conf. Comput. Vis., October 2019.
[2] Krzysztof Lis, Sina Honari, Pascal Fua, and Mathieu Salzmann. Detecting Road Obstacles by Erasing Them, 2020.
[3] SegmentMeIfYouCan benchmark
[4] H. Blum, P. Sarlin, J. Nieto, R. Siegwart, and C. Cadena. Fishyscapes: A Benchmark for Safe Semantic Segmentation in Autonomous Driving. In 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pages 2403–2412, 2019.
Copyright (c) 2021 Toyota Motor Europe
Patent Pending. All rights reserved.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License