# Sequential Image Classification of Human-Robot Walking Environments using Temporal Neural Networks

By Bohdan Ivaniuk-Skulskyi, Andrew Garrett Kurbis, Alex Mihailidis, and Brokoslaw Laschowski
05/05/2023
- Our abstract was accepted to the ICRA 2023 Computer Vision for Wearable Robotics Workshop
19/04/2023
- Our poster was accepted to ICAIR 2023
Create a virtual environment and install the requirements:

```shell
python3 -m venv venv
source venv/bin/activate
pip install git+https://github.com/Atze00/MoViNet-pytorch.git
pip install -r requirements.txt
```
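As a quick sanity check after installation, the sketch below reports any dependencies that failed to install. The package names `torch`, `timm`, and `movinets` are assumptions based on the dependencies this repo uses; adjust the list to match your `requirements.txt`.

```python
from importlib.util import find_spec

def missing(packages):
    """Return the subset of packages that cannot be imported."""
    return [p for p in packages if find_spec(p) is None]

if __name__ == "__main__":
    # torch/timm/movinets are assumed names for this repo's dependencies
    gaps = missing(["torch", "timm", "movinets"])
    print("Missing:", gaps if gaps else "none")
```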
Download the StairNet dataset and run the preprocessing script:

```shell
python data_preprocessing/dataset_preprocessing.py --data_folder /path-to-dataset-dir/
```

Define an OS environment variable pointing to the preprocessed dataset:

```shell
export DATASET=/path-to-preprocessed-dataset-dir/
```
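A minimal sketch of how a script might resolve this variable at runtime (the exact lookup in this repo may differ; the helper name is hypothetical):

```python
import os
from pathlib import Path

def dataset_root() -> Path:
    """Resolve the preprocessed dataset directory from the DATASET env var."""
    root = os.environ.get("DATASET")
    if root is None:
        raise RuntimeError(
            "Set DATASET, e.g. export DATASET=/path-to-preprocessed-dataset-dir/"
        )
    return Path(root)
```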
- Unzip the `data_splits/train.txt` file.
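For reference, a hedged sketch of loading one of the split files, assuming the format is one sample identifier per line (an assumption; check the actual files in `data_splits/`):

```python
from pathlib import Path

def read_split(split_file):
    """Read sample identifiers from a split file, one per line (assumed format)."""
    lines = Path(split_file).read_text().splitlines()
    # Drop blank lines and stray whitespace
    return [ln.strip() for ln in lines if ln.strip()]
```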
| Name | Parameters | GFLOPs | Resolution | Accuracy | F1-score | Download | Config |
|---|---|---|---|---|---|---|---|
| MoViNet | 4.03M | 2.5 | 5x3x224x224 | 0.983 | 0.982 | model | config |
| MobileViT-LSTM | 3.36M | 9.84 | 5x3x224x224 | 0.970 | 0.968 | model | config |
| MobileNet-LSTM | 6.08M | 53.96 | 5x3x224x224 | 0.973 | 0.970 | model | config |
| MobileNet-LSTM (seq2seq) | 5.93M | 50.97 | 5x3x224x224 | 0.707 | 0.799 | model | config |
| Baseline (Kurbis et al.) | 2.26M | 0.61 | 3x224x224 | 0.972 | 0.972 | - | - |
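The numbers above can also be compared programmatically. The sketch below re-encodes the table (values copied from it) and picks the most accurate model; the selection heuristic is illustrative, not part of this repo.

```python
# Benchmark results copied from the table above
MODELS = [
    {"name": "MoViNet", "params_m": 4.03, "gflops": 2.5, "accuracy": 0.983},
    {"name": "MobileViT-LSTM", "params_m": 3.36, "gflops": 9.84, "accuracy": 0.970},
    {"name": "MobileNet-LSTM", "params_m": 6.08, "gflops": 53.96, "accuracy": 0.973},
    {"name": "MobileNet-LSTM (seq2seq)", "params_m": 5.93, "gflops": 50.97, "accuracy": 0.707},
    {"name": "Baseline (Kurbis et al.)", "params_m": 2.26, "gflops": 0.61, "accuracy": 0.972},
]

def best_by_accuracy(models):
    """Return the entry with the highest reported accuracy."""
    return max(models, key=lambda m: m["accuracy"])
```

Note that MoViNet achieves the best accuracy while using far fewer GFLOPs than the LSTM-based models.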
Download one checkpoint with its configuration file and run the following command:

```shell
python test.py --experiment_cfg CONFIG.yaml \
    --dataset_folder $DATASET \
    --val_samples_file data_splits/validation.txt \
    --test_samples_file data_splits/test.txt \
    --checkpoint_path CHECKPOINT.pth
```
To train a model from scratch, run:

```shell
python train.py --experiment_cfg CONFIG.yaml \
    --dataset_folder $DATASET \
    --train_samples_file data_splits/train.txt \
    --val_samples_file data_splits/validation.txt \
    --test_samples_file data_splits/test.txt
```
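The temporal models consume 5-frame sequences (the 5x3x224x224 inputs in the table above). A minimal sketch of grouping consecutive frames into such windows; the window/stride handling here is an assumption, not this repo's exact sampler.

```python
def frame_windows(frame_ids, window=5, stride=1):
    """Group an ordered list of frame ids into overlapping temporal windows.

    Each window of `window` frames becomes one input sequence; stacking the
    corresponding 3x224x224 images yields a 5x3x224x224 tensor.
    """
    return [frame_ids[i:i + window]
            for i in range(0, len(frame_ids) - window + 1, stride)]
```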
If you use this work, please cite:

```bibtex
@misc{ivaniukskulskyi2023sequential,
  author = {Ivaniuk-Skulskyi, Bohdan and Kurbis, A. Garrett and Mihailidis, Alex and Laschowski, Brokoslaw},
  year = {2023},
  month = {05},
  title = {Sequential Image Classification of Human-Robot Walking Environments using Temporal Neural Networks}
}
```
The visual encoder models are taken from the `timm` library, and the MoViNet model is based on this implementation.