This repository is a case study and proof-of-concept for leveraging Copy-and-Paste Augmentation (CPA) to perform object detection and instance segmentation on insect collection boxes (e.g. for integration into Inselect). It was created as a university project in the lab "Intelligent Vision Systems" during the 2021 summer term at the University of Bonn.
Instances are obtained from annotated full-sized or pre-cropped images that are recombined in front of a realistic background (in this case different empty collection boxes).
Fully random instance placements (R-CPA, left) and a placement pattern imitating real collection boxes (CB-CPA, right) were implemented.
*Example augmented images: R-CPA (left) vs. CB-CPA (right).*
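To illustrate the R-CPA idea, here is a minimal sketch of pasting masked instance crops onto an empty-box background at fully random positions. The function name, signature, and data layout are illustrative assumptions, not the repository's actual API; a CB-CPA variant would instead place the instances on a grid imitating a real collection box.

```python
import random
import numpy as np

def paste_instances(background, instances, rng=None):
    """Minimal R-CPA sketch: paste masked instance crops onto an empty-box
    background at fully random positions.

    `background` is an HxWx3 uint8 array; `instances` is a list of
    (crop, mask) pairs, where `crop` is hxwx3 and `mask` is an hxw boolean
    foreground mask. Returns the composite image and one full-size binary
    mask per pasted instance (usable as segmentation ground truth).
    """
    rng = rng or random.Random(0)
    canvas = background.copy()
    out_masks = []
    H, W = canvas.shape[:2]
    for crop, mask in instances:
        h, w = mask.shape
        y = rng.randint(0, H - h)      # top-left corner of the paste region
        x = rng.randint(0, W - w)
        region = canvas[y:y + h, x:x + w]
        region[mask] = crop[mask]      # copy only the foreground pixels
        full_mask = np.zeros((H, W), dtype=bool)
        full_mask[y:y + h, x:x + w] = mask
        out_masks.append(full_mask)
    return canvas, out_masks
```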
The project was built on FAIR's detectron2 and uses the TensorMask sliding-window instance segmentation model.
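As rough orientation, the following is a minimal sketch of how a TensorMask model is typically configured and trained with detectron2 (TensorMask ships as a detectron2 project). The config path and dataset names are placeholders, and only the base learning rate of 0.001 is taken from the results table below; this is not the repository's exact training script.

```python
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer
from tensormask import add_tensormask_config  # from detectron2/projects/TensorMask

cfg = get_cfg()
add_tensormask_config(cfg)                      # register TensorMask-specific options
cfg.merge_from_file("configs/tensormask_R_50_FPN_1x.yaml")
cfg.DATASETS.TRAIN = ("cpa_bugs_train",)        # CPA-generated training set (placeholder name)
cfg.DATASETS.TEST = ("collection_boxes_val",)   # partially annotated boxes (placeholder name)
cfg.SOLVER.BASE_LR = 0.001                      # "base" setting from the table below

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```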
The original dataset consisted of only 3 unlabeled images of collection boxes containing bugs. As manual labeling turned out to be unfeasibly labor-intensive, the public dataset of Hansen et al., 2019, containing > 60,000 cropped bug images distributed across a variety of species, was used instead. This data could be annotated using a fairly simple intensity-based thresholding pipeline and a train-test iteration of TensorMask (overkill, but the code was already there...). The created annotations are available here. These instances were used for training, whereas the partially annotated collection-box images were used for validation.
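As an illustration of such an intensity-based annotation step, the sketch below derives a binary instance mask from a single cropped bug image with OpenCV. The threshold and morphology choices are assumptions for a dark insect on a bright, roughly uniform background, not the project's exact pipeline.

```python
import cv2
import numpy as np

def threshold_mask(crop_bgr):
    """Sketch of an intensity-based instance mask for one cropped bug image."""
    gray = cv2.cvtColor(crop_bgr, cv2.COLOR_BGR2GRAY)
    # Otsu's threshold separates the dark foreground from the bright background.
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # Close small holes and specks.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # Keep only the largest connected component as the instance mask.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    if n > 1:
        largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])
        mask = (labels == largest).astype(np.uint8) * 255
    return mask.astype(bool)
```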
The validation accuracies obtained with different CPA and training settings, evaluated on the annotated crops (crop) and on images imitating full-sized collection-box images (stitch) from the 3 validation images, are as follows:
setting | CB-CPA prob | R-CPA prob | LR | BN momentum | poolsize | additional modification | segm. AP crop | bbox AP crop | segm. AP stitch | bbox AP stitch |
---|---|---|---|---|---|---|---|---|---|---|
base | 0.5 | 0.5 | 0.001 | 0.9 | 15 | | 56.7 | 70.5 | 27.1 | 36.3 |
lower LR | 0.5 | 0.5 | 0.0002 | 0.99 | 15 | reduced LR (0.0002) and increased BN momentum (0.99) | 60.5 | 69.4 | 28.2 | 34.6 |
R-CPA only | 0 | 1 | 0.001 | 0.9 | 30 | | 56.7 | 67.2 | 27.6 | 34.6 |
CB-CPA only | 1 | 0 | 0.001 | 0.9 | 30 | | 55.6 | 68.7 | 27.3 | 37.0 |
alpha blending | 0.5 | 0.5 | 0.001 | 0.9 | 15 | alpha blending added to CB-CPA | 58.3 | 70.9 | 27.3 | 36.5 |
scale augm. | 0.5 | 0.5 | 0.001 | 0.9 | 15 | scale CB- and R-CPA instances by a factor drawn from [0.85, 1.175] and [0.6, 1.66], respectively | 58.5 | 70.2 | 27.2 | 36.0 |
small train set (1) | 0.5 | 0.5 | 0.001 | 0.9 | 3 | | 57.5 | 69.2 | 27.5 | 36.2 |
small train set (2) | 0.5 | 0.5 | 0.001 | 0.9 | 3 | 5-fold increased count of augmentations per instance | 58.4 | 69.9 | 27.4 | 36.2 |
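The segm./bbox AP values are COCO-style average precisions. A minimal sketch of how they can be computed with detectron2's COCOEvaluator, reusing `cfg` and `trainer` from the setup sketch above and the same placeholder validation dataset name:

```python
from detectron2.data import build_detection_test_loader
from detectron2.evaluation import COCOEvaluator, inference_on_dataset

# `cfg` and `trainer` come from the TensorMask setup sketch above.
evaluator = COCOEvaluator("collection_boxes_val", output_dir="./eval")
val_loader = build_detection_test_loader(cfg, "collection_boxes_val")
results = inference_on_dataset(trainer.model, val_loader, evaluator)
print(results["bbox"]["AP"], results["segm"]["AP"])  # bbox AP / segm. AP as in the table
```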
For more detailed explanations and results, see the report.
All obtained models generalized very well to the beetle data published alongside Inselect; however, no ground truth or quantitative scores of Inselect's current object detection methods are available, so the comparison remains qualitative.
*Qualitative comparison: CB-CPA-trained TensorMask model (left) vs. Inselect (right).*
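A minimal sketch of running a trained model on a new collection-box image (such as the Inselect beetle data) with detectron2's DefaultPredictor; the config, weight, and image paths are placeholders.

```python
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from tensormask import add_tensormask_config  # from detectron2/projects/TensorMask

cfg = get_cfg()
add_tensormask_config(cfg)
cfg.merge_from_file("configs/tensormask_R_50_FPN_1x.yaml")
cfg.MODEL.WEIGHTS = "output/model_final.pth"   # trained CB-CPA model (placeholder path)

predictor = DefaultPredictor(cfg)
image = cv2.imread("inselect_beetle_box.jpg")  # BGR image, as detectron2 expects
outputs = predictor(image)
instances = outputs["instances"].to("cpu")
print(len(instances), "detected specimens")
print(instances.pred_boxes)                    # one box (and mask) per detected bug
```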