This repository contains a collection of research papers, an evaluation toolbox, and benchmarking results for the task of concealed object segmentation (COS) in images. In addition, to evaluate the generalizability of COS approaches, we re-organize a concealed defect segmentation dataset named CDS2K.
- Paper link: arXiv
- This project is under construction. Contributions are welcome! If you would like to contribute to this repository, please submit a pull request.
Concealed scene understanding (CSU) is a fast-growing computer vision topic that aims to perceive objects with camouflaged properties. The recent surge of advanced techniques and novel applications makes it timely to provide an up-to-date survey that helps researchers understand the global picture of the CSU field, including both its current achievements and remaining challenges.
Figure 1: Sample gallery of concealed scenarios. (a-d) show natural animals. (e) depicts a concealed human in art. (f) features a synthesized "lion".
This paper makes four contributions:
- For the first time, we present a comprehensive survey of deep learning techniques oriented toward CSU, covering its background and taxonomy, task-unique challenges, and a review of developments in the deep learning era via a survey of existing datasets and deep techniques.
- For a quantitative comparison of the state-of-the-art, we contribute the largest and latest benchmark for Concealed Object Segmentation (COS).
- To evaluate the transferability of deep CSU in practical scenarios, we re-organize CDS2K, the largest concealed defect segmentation dataset, with hard cases from diverse industrial scenarios, and construct a comprehensive benchmark on it.
- We discuss open problems and potential research directions for this community.
We introduce a taxonomy of seven popular CSU tasks. Please refer to Section 2.1 of our paper for more details.
- Five of these are image-level tasks: (a) concealed object segmentation (COS), (b) concealed object localization (COL), (c) concealed instance ranking (CIR), (d) concealed instance segmentation (CIS), and (e) concealed object counting (COC).
- The remaining two are video-level tasks: (f) video concealed object segmentation (VCOS) and (g) video concealed object detection (VCOD).
We illustrate each task with its corresponding annotation visualization.
Figure 2: Illustration of representative CSU tasks.
We review the latest image-level research, covering 50 papers.
Table 1: Essential characteristics of reviewed image-level CSU methods.
We also review nine recent video-level works.
Table 2: Essential characteristics of reviewed video-level CSU methods.
The following are ten datasets collected for several CSU-related tasks.
Table 3: Essential characteristics of reviewed CSU datasets.
Our benchmarking is built on the COS task, since this topic is relatively well established and offers a variety of competing approaches. Here is what we provide:
- First, we provide a one-key evaluation toolbox for CSU. Please follow the instructions to reproduce the results (a minimal evaluation sketch is also given after Figure 3).
- Second, we run COS approaches on three popular benchmarks, CAMO, NC4K, and COD10K (Google Drive, 1.16GB), and organize their predictions into the standard format (*.png). The collection of these prediction masks is publicly available (Google Drive, 4.82GB) for convenient research.
- Third, the benchmark results on nine evaluation metrics are reported in the following three tables. The corresponding text file is available here.
Table 4: Quantitative comparison on the CAMO testing set.
Table 5: Quantitative comparison on the NC4K testing set.
Table 6: Quantitative comparison on the COD10K testing set.
- Lastly, we provide attribute-based analyses on the COD10K dataset.
Figure 3: Qualitative results of ten COS approaches. For descriptions of the visual attributes in each column, refer to Section 5.6 of the paper.
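For reference, below is a minimal, self-contained sketch of the kind of mask-level evaluation performed in this benchmark. It is not the official toolbox: the folder layout, file names, and the metric subset shown (MAE and an adaptive F-measure; see the paper for the full set of nine metrics) are assumptions made purely for illustration.

```python
# A minimal sketch of mask-level evaluation. NOT the official toolbox;
# paths, file names, and the metric subset are illustrative assumptions.
import os
import numpy as np
from PIL import Image

def load_mask(path):
    """Load a grayscale mask as a float array in [0, 1]."""
    return np.asarray(Image.open(path).convert("L"), dtype=np.float64) / 255.0

def mae(pred, gt):
    """Mean absolute error; assumes prediction and GT share the same resolution."""
    return np.abs(pred - gt).mean()

def adaptive_f_measure(pred, gt, beta2=0.3):
    """F-measure with the common adaptive threshold (2 x mean prediction value)."""
    thr = min(2 * pred.mean(), 1.0)
    binary = pred >= thr
    tp = np.logical_and(binary, gt > 0.5).sum()
    precision = tp / (binary.sum() + 1e-8)
    recall = tp / ((gt > 0.5).sum() + 1e-8)
    return (1 + beta2) * precision * recall / (beta2 * precision + recall + 1e-8)

def evaluate(pred_dir, gt_dir):
    """Average MAE / F-measure over all *.png masks shared by the two folders."""
    names = sorted(f for f in os.listdir(gt_dir) if f.endswith(".png"))
    maes, fms = [], []
    for name in names:
        pred = load_mask(os.path.join(pred_dir, name))
        gt = load_mask(os.path.join(gt_dir, name))
        maes.append(mae(pred, gt))
        fms.append(adaptive_f_measure(pred, gt))
    return float(np.mean(maes)), float(np.mean(fms))

if __name__ == "__main__":
    # Hypothetical folder layout; adjust to wherever the masks are unpacked.
    m, f = evaluate("predictions/COD10K", "ground_truth/COD10K")
    print(f"MAE: {m:.4f}  adaptive F-measure: {f:.4f}")
```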
We organize a concealed defect segmentation dataset (Google Drive, 159MB) from five well-known defect segmentation databases. As shown in Figure 4, it comprises five sub-databases: (a-l) MVTecAD, (m-o) NEU, (p) CrackForest, (q) KolektorSDD, and (r) MagneticTile. The defective regions are highlighted with red rectangles. (Top-right) Word-cloud visualization of CDS2K. (Bottom) The number of positive/negative samples in each category of CDS2K.
Figure 4: Sample gallery of our CDS2K.
The average ratio of defective regions for each category is presented in Table 7, which indicates that most defective regions are relatively small (a sketch of how such ratios can be computed follows the table).
Table 7: Average ratio of defective regions for each category in CDS2K.
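As a companion to Table 7, the following sketch shows one way such per-category defect ratios could be computed from binary ground-truth masks. The directory layout (one sub-folder of *.png masks per category) is a hypothetical assumption, not the actual CDS2K structure.

```python
# A minimal sketch for reproducing per-category defect-ratio statistics.
# The directory layout below is an assumption made for illustration only.
import os
import numpy as np
from PIL import Image

def defect_ratio(mask_path):
    """Fraction of pixels marked as defective in a binary mask."""
    mask = np.asarray(Image.open(mask_path).convert("L")) > 127
    return mask.mean()

def category_stats(root):
    """Average defective-region ratio for every category folder under root."""
    stats = {}
    for category in sorted(os.listdir(root)):
        mask_dir = os.path.join(root, category)
        ratios = [defect_ratio(os.path.join(mask_dir, f))
                  for f in os.listdir(mask_dir) if f.endswith(".png")]
        if ratios:
            stats[category] = float(np.mean(ratios))
    return stats

if __name__ == "__main__":
    # Hypothetical path to the unpacked CDS2K ground-truth masks.
    for cat, ratio in category_stats("CDS2K/GT").items():
        print(f"{cat}: {100 * ratio:.2f}% defective pixels on average")
```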
Next, we report the quantitative comparison on the positive samples of CDS2K. The result maps can be downloaded from Google Drive (116.6MB).
Table 8: Quantitative comparison on the positive samples of CDS2K.
Please cite our paper if you find the work useful:
@article{fan2023csu,
  title={Advances in Deep Concealed Scene Understanding},
  author={Fan, Deng-Ping and Ji, Ge-Peng and Xu, Peng and Cheng, Ming-Ming and Sakaridis, Christos and Van Gool, Luc},
  journal={Visual Intelligence (VI)},
  year={2023}
}