PyTorch implementation of our ICCV 2021 paper, together with the pre-trained model used to generate the results presented in the publication.
If you use this work, please cite:
@InProceedings{Vojir_2021_ICCV,
author = {Vojir, Tomas and \v{S}ipka, Tom\'a\v{s} and Aljundi, Rahaf and Chumerin, Nikolay and Reino, Daniel Olmeda and Matas, Jiri},
title = {Road Anomaly Detection by Partial Image Reconstruction With Segmentation Coupling},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2021},
pages = {15651-15660}
}
- 2023.04.14 💥 A follow-up (new) version of this method is available HERE
The method consists of two main components:
- A semantic segmentation network. Currently, a DeepLab v3 architecture is used (code adopted from jfzhang95, distributed under the MIT licence). The backbone is ResNet-101 pre-trained on ImageNet. The segmentation network was trained on the CityScapes dataset and its weights are kept fixed.
- An anomaly estimation network. It is a standalone module that uses the features extracted from the ResNet-101 backbone and the output of the segmentation network before softmax normalization.
The configuration of the network architecture, as published at ICCV 2021, is defined in the configuration file parameters.yaml. The specific model is loaded dynamically based on its string name (see the MODEL.NET variable). The same applies to the loss. The network implementation is located in ./net/models.py.
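As a rough illustration of name-based loading, the sketch below resolves a class from a string stored under a MODEL.NET-style key. The class names ToyNetA and ToyNetB, the AVAILABLE_MODELS table, and build_by_name are all hypothetical and are not taken from ./net/models.py; the actual lookup mechanism in the repository may differ.

```python
# Hypothetical stand-ins for model classes; NOT the classes in ./net/models.py.
class ToyNetA:
    def __init__(self, cfg):
        self.cfg = cfg


class ToyNetB:
    def __init__(self, cfg):
        self.cfg = cfg


# Name -> class table (assumption: one possible way to do the string lookup).
AVAILABLE_MODELS = {cls.__name__: cls for cls in (ToyNetA, ToyNetB)}


def build_by_name(name, *args, **kwargs):
    """Instantiate the class registered under `name`, or fail with a clear error."""
    try:
        cls = AVAILABLE_MODELS[name]
    except KeyError:
        known = ", ".join(sorted(AVAILABLE_MODELS))
        raise ValueError(f"Unknown model {name!r}; known models: {known}") from None
    return cls(*args, **kwargs)


# Toy configuration mimicking a MODEL.NET entry (the value is illustrative).
cfg = {"MODEL": {"NET": "ToyNetA"}}
model = build_by_name(cfg["MODEL"]["NET"], cfg)
print(type(model).__name__)
```

The same pattern would apply to selecting a loss by its string name.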
DISCLAIMER: This is research code. There is a lot of unused and cluttered code. By running this code you agree not to blame the author(s) if it breaks your stuff. This code is provided AS IS without warranty of any kind.
All training configuration is done through the configuration file ./config/defauls.py or through saved configurations of particular networks. To re-create the training of the proposed architecture, use parameters.yaml as the configuration file. Change the training/validation/testing data sources if needed.
The training and validation datasets are set in DATASET.TRAIN and DATASET.VAL. Both are string variables; the currently supported values are ['cityscapes_2class'] for training and ['cityscapes_2class', 'LaF'] for validation.
The training can be run on a specific GPU as follows (using the default configuration from ./config/defauls.py):
CUDA_VISIBLE_DEVICES=<GPU_ID> python3 train.py
or using custom settings, e.g. saved from a custom experiment:
CUDA_VISIBLE_DEVICES=<GPU_ID> python3 train.py --exp_cfg="./path/to/config_file.yaml"
Currently only three labels are used: label 0 for anomaly, label 1 for road, and 255 for void. The dataloaders need to provide the ground-truth segmentation using these labels only. For an example, see ./dataloaders/datasets/cityscapes_2class.py.
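To make the label convention concrete, here is a minimal, dependency-free sketch that remaps a mask of source class IDs to the three labels above. The function remap_labels, the example source IDs, and the choice to treat every non-road, non-void class as anomaly are assumptions for illustration; the mapping actually used is the one in ./dataloaders/datasets/cityscapes_2class.py.

```python
ANOMALY, ROAD, VOID = 0, 1, 255  # label convention stated in this README


def remap_labels(mask, road_ids, void_ids):
    """Map a 2-D list of source class IDs to the {0, 1, 255} convention.

    Assumption: any class that is neither road nor void becomes ANOMALY.
    """
    def remap(value):
        if value in void_ids:
            return VOID
        if value in road_ids:
            return ROAD
        return ANOMALY

    return [[remap(value) for value in row] for row in mask]


# Hypothetical source IDs: 7 stands for road, 0 for an ignored/void region.
source_mask = [
    [7, 26],
    [0, 7],
]
target_mask = remap_labels(source_mask, road_ids={7}, void_ids={0})
print(target_mask)
```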
The dataset loaders are located in ./dataloaders/datasets/. Each dataset has its own dataloader class. The paths to the dataset data are stored in ./mypath.py, where each dataset is identified by a string. The same string is used in the configuration file to set DATASET.TRAIN and in ./dataloaders/__init__.py, where the datasets are instantiated.
To add a new dataset:
- The path to its data needs to be added to ./mypath.py.
- A dataloader class needs to be implemented and stored in ./dataloaders/datasets/.
- Its instantiation needs to be defined in ./dataloaders/__init__.py.
- DATASET.TRAIN or DATASET.VAL needs to be set to the new dataset name in the configuration file.
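The four steps for adding a dataset can be sketched in one self-contained file as follows. This is only an outline under stated assumptions: DATASET_ROOTS, db_root_dir, MyDataset, make_dataset, the name "my_dataset", and the returned dictionary keys are hypothetical and do not reproduce the actual contents of ./mypath.py or ./dataloaders/__init__.py.

```python
# Step 1 (hypothetical ./mypath.py): dataset name -> root directory.
DATASET_ROOTS = {
    "cityscapes_2class": "/path/to/cityscapes",
    "my_dataset": "/path/to/my_dataset",  # entry for the new dataset
}


def db_root_dir(name):
    """Return the data root registered for `name`."""
    try:
        return DATASET_ROOTS[name]
    except KeyError:
        raise ValueError(f"Dataset {name!r} has no registered path") from None


# Step 2 (hypothetical file in ./dataloaders/datasets/): the dataset class.
class MyDataset:
    """Minimal map-style dataset supporting len() and integer indexing."""

    def __init__(self, root, split="train", samples=None):
        self.root = root
        self.split = split
        # (image, label) pairs; a real loader would read files under `root`.
        self.samples = list(samples or [])

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        image, label = self.samples[index]
        # Assumed sample format; labels must use 0 / 1 / 255 only.
        return {"image": image, "label": label}


# Step 3 (hypothetical ./dataloaders/__init__.py): instantiate by name.
def make_dataset(name, split="train", **kwargs):
    if name == "my_dataset":
        return MyDataset(db_root_dir(name), split=split, **kwargs)
    raise NotImplementedError(f"No dataset registered under {name!r}")


# Step 4: the configuration refers to the dataset by its string name.
cfg = {"DATASET": {"TRAIN": "my_dataset"}}
train_set = make_dataset(cfg["DATASET"]["TRAIN"], samples=[("img0", "gt0")])
print(len(train_set), train_set[0]["label"])
```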
For testing, the ./ReconAnon.py script is used (see the file for a minimal example). The exp_dir parameter needs to be set to point to the root directory where the code directory and parameters.yaml are located. The path inserted on line 6 of ./ReconAnon.py needs to be set to point to the code/config directory. The evaluate function expects a tensor of size [1, C, H, W] (i.e. a batch size of 1) in which the image is normalized to the [0, 1] range.
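The expected input layout can be illustrated without any dependencies. The sketch below converts an 8-bit H x W x C image, stored as nested lists, into a [1, C, H, W] structure with values in [0, 1]. The helper names to_batched_chw and shape_of are hypothetical, and the 8-bit input range is an assumption; in practice the same steps would be carried out on a PyTorch tensor before calling evaluate.

```python
def to_batched_chw(image_hwc, max_value=255):
    """Convert an H x W x C nested list of ints into [1][C][H][W] floats in [0, 1]."""
    height = len(image_hwc)
    width = len(image_hwc[0])
    channels = len(image_hwc[0][0])
    chw = [
        [[image_hwc[y][x][c] / max_value for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]
    return [chw]  # leading batch dimension of size 1


def shape_of(nested):
    """Return the shape of a regular nested list, e.g. [1, 3, 2, 3]."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return shape


# Toy 2 x 3 RGB image with 8-bit values.
image = [
    [[0, 128, 255], [255, 0, 0], [10, 20, 30]],
    [[0, 0, 0], [255, 255, 255], [51, 102, 204]],
]
batch = to_batched_chw(image)
print(shape_of(batch))
```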
There are two pre-trained models:
- The semantic segmentation model (fixed, does not need to be modified). The path to the checkpoint checkpoint-segmentation.pth needs to be set in the MODEL.RECONSTRUCTION.SEGM_MODEL variable of the configuration file. Download from gdrive_segmentation_model.
- The model of the anomaly detection network (either train it or use the pre-trained one) <GITREPO/code/checkpoints/>checkpoint-best.pth. The model used in the publication was trained using the parameters provided in the parameters.yaml configuration file. It used the CityScapes dataset for training and the LaF training data for validation. The pre-trained model is available here.
The performance is evaluated on the road region using two pixel-wise metrics: Average Precision (AP), i.e. the area under the precision-recall curve, and False Positive Rate at 95% True Positive Rate (FPR@95), i.e. the false positive rate at the operating point where the true positive rate is 95%. In the table, the results are shown as AP / FPR@95 for each dataset. Note the significant improvement on the "harder" datasets (RO, RO21).
Method | LaF | LaF-train | FS | RA | RO | OT |
---|---|---|---|---|---|---|
JSR-Net (ICCV 2021) | 79.4 / 4.3 | 87.8 / 1.7 | 79.3 / 4.7 | 93.4 / 8.9 | 79.8 / 0.9 | 28.1 / 28.7 |
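The two metrics can be sketched in plain Python as follows. This is a minimal illustration, not the evaluation code used for the table: the function names are hypothetical, AP is computed as a step-wise sum over score thresholds, anomaly pixels are taken as the positive class, and the inputs are assumed to contain only pixels inside the evaluated road region (void pixels already removed).

```python
def _cumulative_counts(scores, is_anomaly):
    """Yield (tp, fp) after lowering the threshold past each distinct score."""
    pairs = sorted(zip(scores, is_anomaly), key=lambda p: -p[0])
    tp = fp = 0
    for idx, (score, positive) in enumerate(pairs):
        if positive:
            tp += 1
        else:
            fp += 1
        # Emit counts only once per group of tied scores.
        if idx + 1 == len(pairs) or pairs[idx + 1][0] != score:
            yield tp, fp


def average_precision(scores, is_anomaly):
    """Area under the precision-recall curve (step-wise approximation)."""
    n_pos = sum(1 for p in is_anomaly if p)
    if n_pos == 0:
        raise ValueError("AP is undefined without positive samples")
    ap, prev_recall = 0.0, 0.0
    for tp, fp in _cumulative_counts(scores, is_anomaly):
        recall = tp / n_pos
        ap += (recall - prev_recall) * tp / (tp + fp)
        prev_recall = recall
    return ap


def fpr_at_tpr(scores, is_anomaly, target_tpr=0.95):
    """False positive rate at the highest threshold whose TPR reaches target_tpr."""
    n_pos = sum(1 for p in is_anomaly if p)
    n_neg = len(is_anomaly) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("Both classes are required")
    for tp, fp in _cumulative_counts(scores, is_anomaly):
        if tp / n_pos >= target_tpr:
            return fp / n_neg
    return 1.0  # not reached when target_tpr <= 1


# Toy per-pixel anomaly scores and ground-truth flags (1 = anomaly).
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
flags = [1, 0, 1, 1, 0, 0]
print(average_precision(scores, flags), fpr_at_tpr(scores, flags))
```

The table reports both values as percentages, so the outputs of such functions would be multiplied by 100 for comparison.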
Datasets used for evaluation:
- [0] LaF - Lost and Found dataset Testing split
- [0] LaF-train - Lost and Found dataset Training split (this was used as a validation dataset during training)
- [1] RA - RoadAnomaly
- [2] RO - RoadObstacles
- [3] OT - Obstacle Track
- [4] FS - FishyScapes dataset (subset of Lost and Found, for backward results comparability)
[0] P. Pinggera, S. Ramos, S. Gehrig, U. Franke, C. Rother, and R. Mester. Lost and Found: detecting small road hazards for self-driving vehicles. In International Conference on Intelligent Robots and Systems (IROS), 2016.
[1] K. Lis, K. Nakka, P. Fua, and M. Salzmann. Detecting the Unexpected via Image Resynthesis. In Int. Conf. Comput. Vis., October 2019.
[2] K. Lis, S. Honari, P. Fua, and M. Salzmann. Detecting Road Obstacles by Erasing Them, 2020.
[3] SegmentMeIfYouCan benchmark
[4] H. Blum, P. Sarlin, J. Nieto, R. Siegwart, and C. Cadena. Fishyscapes: A Benchmark for Safe Semantic Segmentation in Autonomous Driving. In 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pages 2403–2412, 2019.
Copyright (c) 2021 Toyota Motor Europe
Patent Pending. All rights reserved.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License