Automatically mine stroke rates from underwater video of swimmers over time windows of varying resolution.
This project was realized as a thesis to obtain a Master of Engineering in Computer Science at Ghent University. For a comprehensive explanation of methods and results, please consult the academic summary.
Advances in the field of human pose estimation have significantly improved performance across complex datasets. However, current solutions that were designed and trained to recognize the human body across a wide range of contexts, e.g. MS COCO, often do not reach their full potential in very specific and challenging environments, which impedes subsequent analysis of the results. Underwater footage of competitive swimmers is an example of this, due to frequent self-occlusion of body parts and the presence of noise in the water. This work aims to improve the performance of pose estimation in this context in order to enable an automatic analysis of kinematics. To that end, we propose a framework that limits the search space for human pose estimation by using a set of anchor poses. More specifically, the problem is reduced to finding the best matching anchor pose and the optimal transformation thereof. To find this best match, we devise a method of assessing similarity between two poses and use the Viterbi algorithm to find the most likely sequence of anchor poses, thereby effectively exploiting the cyclic character of the swimming motion. This not only improves pose estimation performance but also provides a method to reliably extract the stroke frequency, outperforming manual timings by a human observer.
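The pose-similarity assessment mentioned above could look like the following sketch. The thesis's exact metric is not spelled out here, so this is a minimal illustration under assumed conventions: each pose is a (13, 2) array of keypoint coordinates, and similarity is computed after normalizing out position and scale.

```python
import numpy as np

def pose_similarity(pose_a, pose_b):
    """Score how alike two poses are (illustrative, not the thesis's exact metric).

    Each pose is a (13, 2) array of keypoint coordinates. Both poses are
    translated to their centroid and scaled to unit Frobenius norm before
    comparison, making the score invariant to position and body size.
    Returns a value in (0, 1]; 1 means identical normalized poses.
    """
    def normalize(pose):
        centered = pose - pose.mean(axis=0)
        scale = np.linalg.norm(centered)
        return centered / scale if scale > 0 else centered

    # Mean per-keypoint distance between the normalized poses,
    # mapped to a similarity score in (0, 1].
    dist = np.linalg.norm(normalize(pose_a) - normalize(pose_b), axis=1).mean()
    return 1.0 / (1.0 + dist)
```

With this normalization, a translated and uniformly scaled copy of a pose scores a perfect 1.0, while genuinely different keypoint configurations score lower.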
The proposed framework consists of three main steps:
- Baseline Prediction: estimate 13 keypoints with an existing human pose estimation model (preferably fine-tuned on a relevant dataset).
- Pose Matching: match the estimated pose to the most similar pose from a set of anchor poses.
- Most Likely Sequence of Anchor Poses: use the Viterbi algorithm to obtain the most likely sequence of anchor poses given a series of consecutive pose predictions.
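The third step is a standard Viterbi decode over anchor-pose indices, which could be sketched as below. The log-score `emission` matrix (how well each anchor matches the predicted pose at each frame) and the cyclic `transition` structure are assumptions for illustration, not the thesis's exact formulation.

```python
import numpy as np

def most_likely_anchor_sequence(emission, transition):
    """Viterbi decoding of anchor-pose indices (illustrative sketch).

    emission:   (T, K) array; emission[t, k] is the log-score of anchor
                pose k matching the predicted pose at frame t.
    transition: (K, K) array; transition[i, j] is the log-score of moving
                from anchor i to anchor j between consecutive frames. For
                a cyclic motion such as swimming, this would favor staying
                at the current anchor or advancing to the next in the cycle.
    Returns the highest-scoring sequence of anchor indices, length T.
    """
    T, K = emission.shape
    score = np.zeros((T, K))
    backptr = np.zeros((T, K), dtype=int)
    score[0] = emission[0]
    for t in range(1, T):
        # candidates[i, j]: best score ending in anchor i, then moving to j.
        candidates = score[t - 1][:, None] + transition
        backptr[t] = candidates.argmax(axis=0)
        score[t] = candidates.max(axis=0) + emission[t]
    # Backtrack from the best final state.
    path = [int(score[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t][path[-1]]))
    return path[::-1]
```

Because the transition scores penalize jumps that skip phases of the stroke cycle, the decoded sequence stays consistent with the cyclic swimming motion even when individual frame predictions are noisy.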
- Python 3.6+
- pip
- virtualenv
- cuDNN 7+
- Docker (optional)
- Create and activate the environment (Linux/Mac)

  ```sh
  virtualenv -p python3 venv
  source venv/bin/activate
  ```

- Install requirements

  ```sh
  pip install -r requirements.txt
  ```
The dataset was annotated with Supervise.ly and exported as JSON. See lib/dataset/PoseDataset.py for the expected format and naming conventions.
Distributed under the MIT License.
Please review the academic summary for a full list of references and acknowledgements.