To participate and submit to this challenge, register at the EPIC-SOUNDS Audio-Based Interaction Recognition Codalab Challenge.
The labelled train/val annotations, along with the recognition test set timestamps, are available on the EPIC-Sounds annotations repo. The baseline models can also be found there, where the inference script `src/tools/test_net.py` can be used as a template to correctly format model scores for the `create_submission.py` and `evaluate.py` scripts.
This repo is a modified version of the existing Action Recognition Challenge.
NOTE: For this version of the challenge (version "0.1"), the class "background" (class_id=13) has been redacted from the test set. The argument `--redact_background` is supported in `evaluate.py` to remove background labels from your validation set evaluation.
We support two formats for model results.
- List format:

```
[
    {
        'interaction_output': Iterable of float, shape [44],
        'annotation_id': str, e.g. 'P01_101_1'
    }, ... # repeated for all segments in the val/test set.
]
```

- Dict format:

```
{
    'interaction_output': np.ndarray of float32, shape [N, 44],
    'annotation_id': np.ndarray of str, shape [N,]
}
```
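
As a rough illustration, the snippet below sketches how per-segment scores could be packed into either layout. The helper names, and the assumption that `scores` is an `[N, 44]` array with a matching list of `annotation_ids`, are hypothetical and not part of the codebase.

```python
import numpy as np

# Hypothetical inputs: `scores` is an [N, 44] array of per-class model outputs,
# `annotation_ids` is a length-N list of segment ids such as 'P01_101_1'.

def to_list_format(scores, annotation_ids):
    """Pack results in the list layout: one dict per segment."""
    return [
        {"interaction_output": s.astype(float), "annotation_id": a}
        for s, a in zip(scores, annotation_ids)
    ]

def to_dict_format(scores, annotation_ids):
    """Pack results in the dict layout: stacked arrays over all N segments."""
    return {
        "interaction_output": np.asarray(scores, dtype=np.float32),  # [N, 44]
        "annotation_id": np.asarray(annotation_ids),                 # [N,]
    }
```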
Either of these formats can be saved via `torch.save` with a `.pt` or `.pyth` suffix, or with `pickle.dump` with a `.pkl` suffix.
Note that either of these layouts can be stored in a `.pkl`/`.pt` file; the dict format doesn't necessarily have to be in a `.pkl`.
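
For example, a saved results file might be produced like this (a minimal sketch; the `results` variable and the output file names are placeholders):

```python
import pickle
import torch

# `results` is assumed to hold either of the layouts described above.
results = to_dict_format(scores, annotation_ids)

# Save with torch (either supported suffix works):
torch.save(results, "epic_sounds_val_scores.pt")

# ...or equivalently with pickle:
with open("epic_sounds_val_scores.pkl", "wb") as f:
    pickle.dump(results, f)
```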
We provide an evaluation script to compute, on the validation set, the metrics we report in the paper. You will also need to clone the annotations repo.
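
Before running `evaluate.py` or `create_submission.py` on a scores file, a quick sanity check along these lines (again a sketch; the file name is a placeholder) can catch shape or count mismatches early:

```python
import pickle

# Reload the saved scores and confirm they match one of the two layouts.
with open("epic_sounds_val_scores.pkl", "rb") as f:
    results = pickle.load(f)

if isinstance(results, dict):
    # Dict format: stacked arrays with one row per segment and 44 classes.
    assert results["interaction_output"].shape[1] == 44
    assert results["interaction_output"].shape[0] == results["annotation_id"].shape[0]
else:
    # List format: one entry per segment, each with 44 class scores.
    assert all(len(r["interaction_output"]) == 44 for r in results)
```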