This repository contains materials accompanying the “Bird by Bird Tech” article series published on Medium.
The series tackles a well-established problem in computer vision: fine-grained classification of bird species. The first part of the tutorials demonstrates how to use CNN models in PyTorch to classify bird images from the Caltech-UCSD Birds-200-2011 (CUB-200-2011) dataset. By the end of these tutorials, you will be able to:
- Understand the basics of image classification as applied to bird species.
- Determine a data-driven image pre-processing strategy.
- Create your own deep learning pipeline for image classification.
- Build, train and evaluate a ResNet-50 model to predict bird species.
- Enhance the CNN's performance with techniques such as transfer learning, multi-task learning and an attention module.
The series is organized as follows:
- Part 1: “Advancing CNN model for fine-grained classification of birds” (notebook, article).
- Part 2: “Finite automata simulation for leveraging AI-assisted systems” (notebook, article, tutorial).
- Part 3: “Optimizing AI-based systems on object detection using Monte-Carlo” [TBA].
- Part 4: “Interpretable deep learning for computer vision” [TBA].
- Part 5: “Multimodal data fusion approach for bird classification” [TBA].
Part 1 demonstrates how to perform data-driven image pre-processing, build a baseline ResNet-based classifier, and further improve its performance for bird classification using different approaches. Results indicate that the final variant of the ResNet-50 model, which combines transfer learning, multi-task learning and an attention module, produces markedly more accurate bird predictions. Part 2 focuses on simulation modelling with finite state machines to improve the efficiency of AI-assisted computer vision systems for bird detection. More information on the experimental design and results can be found in the notebooks and articles.
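To give a flavour of the finite state machine idea behind Part 2, here is a minimal, self-contained sketch. The states (`idle`, `detecting`, `classifying`), the events, and the `simulate` helper are all hypothetical and chosen only for illustration; the actual model is described in the Part 2 notebook and article.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical states and events, NOT taken from the notebooks.
# Each (state, event) pair maps to the next state.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("idle", "motion"): "detecting",
    ("detecting", "bird_found"): "classifying",
    ("detecting", "no_bird"): "idle",
    ("classifying", "done"): "idle",
}


def simulate(events: Iterable[str], start: str = "idle") -> List[str]:
    """Return the sequence of visited states, beginning with `start`."""
    state = start
    path = [state]
    for event in events:
        # Events with no defined transition leave the state unchanged.
        state = TRANSITIONS.get((state, event), state)
        path.append(state)
    return path


path = simulate(["motion", "no_bird", "motion", "bird_found", "done"])
print(path)
# ['idle', 'detecting', 'idle', 'detecting', 'classifying', 'idle']
```

In a setup like this, the more expensive classification step runs only after a detection event, which is one way a state machine could reduce unnecessary computation.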
Before running the code, make sure to install the project dependencies listed in the requirements file.
Except as otherwise noted, the content of this repository is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License, and code samples are licensed under the Apache 2.0 License. All materials can be freely used, distributed and adapted for non-commercial purposes only, provided that appropriate attribution is given to the licensor and/or this repository is referenced.
SPDX-License-Identifier: CC-BY-NC-4.0 AND Apache-2.0