
Prepare Data

Download Dataset

Prepare the data by following the instructions below.

  • data directory file tree
─── data
    ├── 50salads
    ├── egtea
    ├── gtea
    └── ...

gtea, 50salads and egtea

The video action segmentation model uses the egtea, 50salads and gtea datasets.

Download the I3D features from the ms-tcn repo.

  • Dataset tree example
─── gtea
    ├── Videos
    │   ├── S1_Cheese_C1.mp4
    │   ├── S1_Coffee_C1.mp4
    │   ├── S1_CofHoney_C1.mp4
    │   └── ...
    ├── groundTruth
    │   ├── S1_Cheese_C1.txt
    │   ├── S1_Coffee_C1.txt
    │   ├── S1_CofHoney_C1.txt
    │   └── ...
    ├── splits
    │   ├── test.split1.bundle
    │   ├── test.split2.bundle
    │   ├── test.split3.bundle
    │   └── ...
    └── mapping.txt
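
In this layout, mapping.txt maps class indices to action names, each groundTruth file lists one action label per frame, and the bundle files list which videos belong to each split. A minimal reader sketch, assuming the usual MS-TCN conventions ("index action_name" lines in mapping.txt, one label per line in groundTruth files):

# Minimal reader for this MS-TCN-style layout, assuming the usual
# conventions: "index action_name" lines in mapping.txt and one action
# label per frame in each groundTruth file.
def load_action_map(path):
    id_of = {}
    with open(path) as f:
        for line in f:
            idx, name = line.strip().split()
            id_of[name] = int(idx)
    return id_of

def load_frame_labels(path):
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

id_of = load_action_map("data/gtea/mapping.txt")
labels = load_frame_labels("data/gtea/groundTruth/S1_Cheese_C1.txt")
frame_ids = [id_of[l] for l in labels]  # per-frame class indices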

thumos14 and TVSeries

You can download the thumos14 dataset from its website and the TVSeries dataset from its website. Thumos14 is a temporal action localization dataset.

  • Dataset tree
─── thumos14
    ├── Videos
    │   ├── video_test_0000896.mp4
    │   ├── video_test_0000897.mp4
    │   ├── video_validation_0000482.mp4
    │   └── ...
    ├── groundTruth
    │   ├── video_test_0000896.txt
    │   ├── video_test_0000897.txt
    │   ├── video_validation_0000482.txt
    │   └── ...
    ├── val_list.txt
    ├── test_list.txt
    └── mapping.txt

Extract Optical Flow (Optional)

python tools/extract/extract_flow.py -c config/extract_flow/extract_optical_flow_fastflownet.yaml -o data/gtea
python tools/extract/extract_flow.py -c config/extract_flow/extract_optical_flow_raft.yaml -o data/gtea
python tools/extract/extract_flow.py -c config/extract_flow/extract_optical_flow_liteflownetv3.yaml -o data/gtea
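
If you need flow for several datasets, the same extractor can be scripted; -c picks the optical-flow model config and -o the dataset directory, exactly as in the commands above (the dataset list here is just an example):

import subprocess

# Run the FastFlowNet extractor over several dataset directories in one
# go; swap the config path for the raft/liteflownetv3 variants as needed.
for dataset in ["data/gtea", "data/50salads", "data/egtea"]:
    subprocess.run([
        "python", "tools/extract/extract_flow.py",
        "-c", "config/extract_flow/extract_optical_flow_fastflownet.yaml",
        "-o", dataset,
    ], check=True)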

Extract Feature (Optional)

python tools/extract/extract_features.py -c config/extract_feature/extract_feature_i3d_thumos14.yaml -o data/thumos14
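
After extraction it is worth sanity-checking the output; a minimal sketch assuming the extractor writes one .npy feature file per video (the output path below is a guess, adjust it to wherever your config writes features):

import numpy as np

# Quick sanity check after extraction; the path is an assumption, and
# MS-TCN-style I3D releases are (feature_dim, num_frames) arrays.
feat = np.load("data/thumos14/features/video_test_0000896.npy")
print(feat.shape)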

Dataset Normalization

# compute mean and std from videos
# gtea
python tools/dataset_transform/transform_segmentation_label.py data/gtea data/gtea/groundTruth data/gtea --mode localization --fps 15
python tools/dataset_transform/prepare_video_recognition_data.py data/gtea/label.json data/gtea/Videos data/gtea --negative_sample_num 100 --only_norm True --fps 15 --dataset_type gtea_rgb
python tools/dataset_transform/prepare_video_recognition_data.py data/gtea/label.json data/gtea/flow data/gtea --negative_sample_num 100 --only_norm True --fps 15 --dataset_type gtea_flow

# egtea
python tools/dataset_transform/prepare_video_recognition_data.py data/egtea/egtea.json data/egtea/Videos data/egtea --negative_sample_num 1000 --only_norm True --fps 24 --dataset_type egtea_rgb

# 50salads
python tools/dataset_transform/transform_segmentation_label.py data/50salads data/50salads/groundTruth data/50salads --mode localization --fps 30
python tools/dataset_transform/prepare_video_recognition_data.py data/50salads/label.json data/50salads/Videos data/50salads --negative_sample_num 1000 --only_norm True --fps 30 --dataset_type 50salads_rgb

# thumos14
python tools/dataset_transform/transform_segmentation_label.py data/thumos14/gt.json data/thumos14/Videos data/thumos14 --fps 30
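
For reference, the statistics the normalization pass produces amount to per-channel pixel means and standard deviations over the video frames. A rough illustrative sketch of that computation (not the repo's exact implementation):

import cv2
import numpy as np

# Accumulate per-channel pixel mean/std over sampled frames of a video.
def video_channel_stats(video_path, sample_stride=10):
    cap = cv2.VideoCapture(video_path)
    sums = np.zeros(3)
    sq_sums = np.zeros(3)
    count = 0
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % sample_stride == 0:
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB).astype(np.float64)
            pixels = rgb.reshape(-1, 3)
            sums += pixels.sum(axis=0)
            sq_sums += (pixels ** 2).sum(axis=0)
            count += pixels.shape[0]
        idx += 1
    cap.release()
    mean = sums / count
    std = np.sqrt(sq_sums / count - mean ** 2)
    return mean, std  # on the 0-255 RGB scale, as in the tables below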

The released dataset mean and std values:

  • gtea:
# rgb
mean RGB: [140.39158961711036, 108.18022223151027, 45.72351736766547]
std RGB: [33.94421369129452, 35.93603536756186, 31.508484434367805]
# flows
mean RGB: [0.9686297051020777, 0.9706158002294017, 0.972493270804535] * 255
std RGB: [0.039060756165796726, 0.03689212641350189, 0.03209093941013171] * 255
  • egtea:
mean RGB: [0.47882690412518875, 0.30667687330914223, 0.1764174579795214] * 255
std RGB: [0.26380785444954574, 0.20396220265286277, 0.16305419562005563] * 255
  • 50salads:
mean RGB: [0.5139909998345553, 0.5117725498677757, 0.4798814301515671] * 255
std RGB: [0.23608918491478523, 0.23385714300069754, 0.23755006337414028] * 255
  • breakfast:
mean RGB: [0.4245283568405083, 0.3904851168609079, 0.33709139617292494] * 255
std RGB: [0.26207845745959846, 0.26008439810422, 0.24623600365905168] * 255
  • thumos14:
mean RGB: [0.384953972862144, 0.38326867429930167, 0.3525199505706894] * 255
std RGB: [0.258450710004705, 0.2544892750057763, 0.24812118173426492] * 255
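
These values are meant to be used as per-channel normalization constants. For example, with the gtea RGB statistics (already on the 0-255 scale; entries marked "* 255" above must first be multiplied by 255):

import numpy as np

# Normalize a frame with the released gtea RGB statistics.
mean = np.array([140.39158961711036, 108.18022223151027, 45.72351736766547])
std = np.array([33.94421369129452, 35.93603536756186, 31.508484434367805])

def normalize(frame_rgb):
    # frame_rgb: HxWx3 uint8 RGB frame
    return (frame_rgb.astype(np.float32) - mean) / std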

Convert Localization Label to Segmentation Label

# egtea
python tools/dataset_transform/transform_egtea_label.py data/egtea/splits_label data/egtea/verb_idx.txt data/egtea
python tools/dataset_transform/transform_segmentation_label.py data/egtea/egtea.json data/egtea/Videos data/egtea --mode segmentation --fps 24
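
Conceptually, this conversion expands (start, end, action) segments into one label per frame at the dataset fps. A toy sketch of that idea (field names and the background label are assumptions for the example):

# Expand (start_sec, end_sec, action) segments into per-frame labels.
def segments_to_frames(segments, num_frames, fps, background="background"):
    labels = [background] * num_frames
    for start_sec, end_sec, action in segments:
        start = int(start_sec * fps)
        end = min(int(end_sec * fps), num_frames)
        for t in range(start, end):
            labels[t] = action
    return labels

# segments_to_frames([(0.0, 1.0, "take_cup")], num_frames=48, fps=24)
# -> 24 "take_cup" frames followed by 24 background frames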

For EGTEA, we manually split the test set.

split1 test

OP01-R01-PastaSalad.mp4
OP01-R02-TurkeySandwich.mp4
OP01-R03-BaconAndEggs.mp4
OP01-R04-ContinentalBreakfast.mp4
OP01-R05-Cheeseburger.mp4
OP01-R06-GreekSalad.mp4
OP01-R07-Pizza.mp4
OP02-R01-PastaSalad.mp4
OP02-R02-TurkeySandwich.mp4
OP02-R03-BaconAndEggs.mp4
OP02-R04-ContinentalBreakfast.mp4
OP02-R05-Cheeseburger.mp4
OP02-R06-GreekSalad.mp4
OP02-R07-Pizza.mp4
P01-R01-PastaSalad.mp4
P01-R02-TurkeySandwich.mp4
P02-R01-PastaSalad.mp4
P02-R03-BaconAndEggs.mp4
P02-R04-ContinentalBreakfast.mp4
P02-R05-Cheeseburger.mp4
P02-R06-GreekSalad.mp4
P03-R01-PastaSalad.mp4
P04-R01-PastaSalad.mp4
P04-R05-Cheeseburger.mp4
P04-R06-GreekSalad.mp4
P05-R01-PastaSalad.mp4
P05-R02-TurkeySandwich.mp4
P06-R01-PastaSalad.mp4
P06-R02-TurkeySandwich.mp4
P07-R01-PastaSalad.mp4

split2 test

OP03-R01-PastaSalad.mp4
OP03-R02-TurkeySandwich.mp4
OP03-R03-BaconAndEggs.mp4
OP03-R04-ContinentalBreakfast.mp4
OP03-R05-Cheeseburger.mp4
OP03-R06-GreekSalad.mp4
OP03-R07-Pizza.mp4
OP04-R01-PastaSalad.mp4
OP04-R02-TurkeySandwich.mp4
OP04-R03-BaconAndEggs.mp4
OP04-R04-ContinentalBreakfast.mp4
OP04-R05-Cheeseburger.mp4
OP04-R06-GreekSalad.mp4
OP04-R07-Pizza.mp4
P08-R01-PastaSalad.mp4
P09-R01-PastaSalad.mp4
P09-R02-TurkeySandwich.mp4
P10-R01-PastaSalad.mp4
P10-R02-TurkeySandwich.mp4
P10-R05-Cheeseburger.mp4
P10-R06-GreekSalad.mp4
P11-R01-PastaSalad.mp4
P11-R02-TurkeySandwich.mp4
P12-R01-PastaSalad.mp4
P12-R02-TurkeySandwich.mp4
P13-R01-PastaSalad.mp4
P14-R01-PastaSalad.mp4
P14-R02-TurkeySandwich.mp4

split3 test

OP05-R03-BaconAndEggs.mp4
OP05-R04-ContinentalBreakfast.mp4
OP05-R07-Pizza.mp4
OP06-R02-TurkeySandwich.mp4
OP06-R03-BaconAndEggs.mp4
OP06-R04-ContinentalBreakfast.mp4
OP06-R05-Cheeseburger.mp4
OP06-R06-GreekSalad.mp4
OP06-R07-Pizza.mp4
P15-R01-PastaSalad.mp4
P16-R03-BaconAndEggs.mp4
P17-R03-BaconAndEggs.mp4
P17-R04-ContinentalBreakfast.mp4
P18-R03-BaconAndEggs.mp4
P18-R04-ContinentalBreakfast.mp4
P19-R03-BaconAndEggs.mp4
P19-R04-ContinentalBreakfast.mp4
P20-R03-BaconAndEggs.mp4
P20-R04-ContinentalBreakfast.mp4
P21-R03-BaconAndEggs.mp4
P21-R04-ContinentalBreakfast.mp4
P21-R05-Cheeseburger.mp4
P21-R06-GreekSalad.mp4
P22-R03-BaconAndEggs.mp4
P23-R03-BaconAndEggs.mp4
P24-R03-BaconAndEggs.mp4
P25-R06-GreekSalad.mp4
P26-R05-Cheeseburger.mp4
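
To materialize these manual splits, you can write them into bundle files; a sketch assuming the MS-TCN convention that bundles list groundTruth .txt filenames, one per line (check the existing gtea bundles for the exact format):

# Write the split1 test list above into an MS-TCN-style bundle file.
split1_test = [
    "OP01-R01-PastaSalad.mp4",
    "OP01-R02-TurkeySandwich.mp4",
    # ... remaining videos from the split1 list above ...
]

with open("data/egtea/splits/test.split1.bundle", "w") as f:
    for name in split1_test:
        f.write(name.replace(".mp4", ".txt") + "\n")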

Customized Dataset

The easiest way to create a customized dataset is to reuse an existing dataset class: align your data with the format that class expects, then change the file paths in the config to point at it.

For example:

  • Dataset tree example
─── your_dataset
    ├── Videos
    │   ├── S1_Cheese_C1.mp4
    │   ├── S1_Coffee_C1.mp4
    │   ├── S1_CofHoney_C1.mp4
    │   └── ...
    ├── groundTruth
    │   ├── S1_Cheese_C1.txt
    │   ├── S1_Coffee_C1.txt
    │   ├── S1_CofHoney_C1.txt
    │   └── ...
    ├── splits
    │   ├── test.split1.bundle
    │   ├── test.split2.bundle
    │   ├── test.split3.bundle
    │   └── ...
    ├── file_list.txt
    └── mapping.txt
  • Config
dict(
    name = "FeatureStreamSegmentationDataset",
    data_prefix = "./",
    file_path = "./path/to/your_dataset/file_list.txt",
    feature_path = "./path/to/your_dataset/feature_files.txt",
    gt_path = "./path/to/your_dataset/ground_truth.txt",
    actions_map_file_path = "./path/to/your_dataset/mapping.txt",
    dataset_type = "50salads",
    train_mode = True,
    sliding_window = sliding_window
)
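
Note that sliding_window on the last line refers to a variable defined elsewhere in the same config file, and dataset_type appears to select which existing dataset convention the loader follows; set both to match the format you aligned your data with.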

Of course, you can also build a new dataset class to fully customize the data processing you need.

Follow these steps:

  • Step 1: The new dataset class must inherit from the BaseDataset class
  • Step 2: Modify the DATASETPIPLINE of the model configuration file to match the required data processing flow
  • Step 3: After inheriting from the BaseDataset class, register your dataset class
  • Step 4: Modify the config file to use it

Details of Step 3

  • Step 3.1
from svtas.utils import AbstractBuildFactory

@AbstractBuildFactory.register('dataset')
class CustomizedDataset(BaseDataset):
    ...

# or
@AbstractBuildFactory.register('dataset')
class CustomizedDataset(ItemDataset):
    ...

# or
@AbstractBuildFactory.register('dataset')
class CustomizedDataset(StreamDataset):
    ...
  • Step 3.2: add the import to svtas/loader/dataset/__init__.py
from .your_customized_dataset import CustomizedDataset

__all__ = [
    "CustomizedDataset"
]
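
Step 4 then amounts to referencing the new class by its registered name in the config; a sketch reusing the placeholder paths from the earlier example (keep or drop fields to match what your class actually reads):

dict(
    name = "CustomizedDataset",  # the name registered in Step 3.1
    data_prefix = "./",
    file_path = "./path/to/your_dataset/file_list.txt",
    actions_map_file_path = "./path/to/your_dataset/mapping.txt",
    train_mode = True
)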