Finetuning TVLT on Downstream Tasks

After preparing the data and before running a training script, set the data_root variable in that script to your dataset location, e.g.,

data_root='./dataset/cmumosei'
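
If you prefer not to edit each script by hand, the assignment can be rewritten from the command line. The sketch below is only an illustration: `set_data_root` is a hypothetical helper, not part of the repository, and it assumes each script contains a line beginning with `data_root=`, as in the example above.

```shell
# Hypothetical helper (not part of the repository): point a finetuning
# script at a new dataset location by rewriting its data_root line.
# Assumes the script has a line starting with "data_root=" and that the
# new path contains no '|', '&', or backslash characters.
set_data_root() {
    script="$1"
    new_root="$2"
    tmp="$(mktemp)" || return 1
    # '|' is the sed delimiter, so slashes in the path need no escaping.
    # Writing back with cat keeps the script's original permissions.
    sed "s|^data_root=.*|data_root='${new_root}'|" "$script" > "$tmp" \
        && cat "$tmp" > "$script"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}
```

For example, `set_data_root scripts/finetune_mosei.sh /path/to/your/data` would replace the existing assignment with the given path; check the script afterwards to confirm the line looks as expected.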

CMU-MOSEI

Download the CMU-MOSEI [link] meta files and videos, then organize the data as follows:

Dataset
│
├── CMU_MOSEI            
│   ├── train
│   │   ├── audio
│   │   ├── video
│   │   └── text
│   ├── valid
│   ├── test
│   ├── labels
│   └── labels_emotion
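
Before launching a long run, it can help to confirm that the layout above is in place. The following is a minimal sketch under stated assumptions: `check_layout` and `check_cmu_mosei` are hypothetical helpers, and the entry names are copied from the tree above. It only tests that each entry exists, since the tree does not say whether `labels` and `labels_emotion` are files or directories.

```shell
# Hypothetical helper: report which expected entries are missing under a
# dataset root. Returns non-zero if anything is missing.
check_layout() {
    root="$1"
    shift
    missing=0
    for rel in "$@"; do
        # -e accepts files and directories alike.
        if [ ! -e "$root/$rel" ]; then
            echo "missing: $root/$rel" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Entry names taken from the CMU_MOSEI tree above.
check_cmu_mosei() {
    check_layout "$1" \
        train/audio train/video train/text \
        valid test labels labels_emotion
}
```

Call `check_cmu_mosei` with the directory that holds `train`, `valid`, and `test`; missing entries are printed to standard error.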

Sentiment analysis

bash scripts/finetune_mosei.sh

Emotion analysis

bash scripts/finetune_moseiemo.sh

VQAv2

Download the VQAv2 [link] meta files, the audio files [link], and the images, then organize the data as follows:

Dataset
│
├── VQA      
│   ├── audios
│   ├── train2014
│   │   ├── 0.jpg
│   │   └── ...
│   ├── val2014
│   ├── test2014
│   ├── train.jsonl
│   ├── dev.jsonl
│   └── test.jsonl
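
A quick sanity check on the split files can catch an incomplete download early. This sketch assumes one record per line, as the `.jsonl` extension suggests; `count_vqa_records` is a hypothetical helper, not a script shipped with the repository.

```shell
# Hypothetical helper: print the number of newline-terminated lines in
# each VQA split file, and fail if a file is missing or empty.
count_vqa_records() {
    root="$1"
    rc=0
    for split in train dev test; do
        file="$root/$split.jsonl"
        if [ -s "$file" ]; then
            # Reading from stdin makes wc print only the count.
            n="$(wc -l < "$file" | tr -d ' ')"
            echo "$split: $n"
        else
            echo "$split: missing or empty ($file)" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

Pass the directory that contains the three `.jsonl` files, i.e. the `VQA` folder in the tree above.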

Then start finetuning:

bash scripts/finetune_vqa.sh

MSRVTT

Download the meta files from frozen-in-time or directly from [link], download the videos, and organize the data as follows:

Dataset
│
├── MSRVTT   
│   ├── high-quality
│   ├── structured-symlinks
│   ├── annotation
│   │   └── MSR_VTT.json
│   ├── videos
│   └── raw_audios
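
If `structured-symlinks` holds symbolic links into the video folders, as its name suggests (an assumption; the layout above does not say so), links can break when the dataset is moved. A minimal sketch for spotting them, using a hypothetical helper named `find_broken_links`:

```shell
# Hypothetical helper: list symbolic links under a directory whose
# targets do not exist. Prints nothing when every link resolves.
find_broken_links() {
    # -type l selects symlinks; "test -e" follows the link, so it fails
    # exactly when the target is missing.
    find "$1" -type l ! -exec test -e {} \; -print
}
```

Running `find_broken_links` on the `structured-symlinks` directory should print nothing if every link resolves.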

Then start finetuning:

bash scripts/finetune_msrvtt.sh