Releases · chengtan9907/OpenSTL
Weather-5-625-Visualization
We provide visualization figures of various weather prediction methods on WeatherBench (single variable). You can plot your own visualizations from tested results (e.g., `work_dirs/exp_name/saved`) with `vis_video.py`. Note that `--vis_dirs` denotes visualizing all experimental folders under the given path, and `--vis_channel` selects the channel to visualize. For example, run plotting with the script:
python tools/visualizations/vis_video.py -d weather_t2m_5_625 -w work_dirs/exp_name --index 0 --save_dirs fig_w_t2m_5_625_vis
Video-Visualization
We provide visualization figures of various video prediction methods on various benchmarks. You can plot your own visualizations from tested results (e.g., `work_dirs/exp_name/saved`) with `vis_video.py`. Note that `--vis_dirs` denotes visualizing all experimental folders under the given path, and `--vis_channel` selects the channel to visualize. For example, run plotting with the script:
python tools/visualizations/vis_video.py -d mmnist -w work_dirs/exp_name --index 0 --save_dirs fig_mmnist_vis
- We provide GIF visualizations of experiments in configs/mmnist for MMNIST (64x64 resolution).
- We provide GIF visualizations of experiments in configs/mfmnist for Moving FMNIST (64x64 resolution).
- We provide GIF visualizations of experiments in configs/mmnist_cifar for MMNIST-CIFAR (64x64 resolution).
- We provide GIF visualizations of experiments in configs/kitticaltech for KittiCaltech (128x160 resolution).
- We provide GIF visualizations of experiments in configs/kth for KTH Action (128x128 resolution).
- We provide GIF visualizations of experiments in configs/human for Human 3.6M (256x256 resolution).
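If you would rather assemble GIFs yourself from saved prediction frames, the minimal sketch below stacks frames into a GIF with `imageio`; the `.npy` file name and array layout under `work_dirs/exp_name/saved` are assumptions for illustration, not OpenSTL's documented output format.

```python
# Minimal sketch: stack saved prediction frames into a GIF with imageio.
# The .npy file name and array layout are assumptions for illustration,
# not OpenSTL's documented output format.
import os
import numpy as np
import imageio.v2 as imageio

preds = np.load("work_dirs/exp_name/saved/preds.npy")  # assumed shape: (T, H, W), values in [0, 1]
frames = [(np.clip(f, 0.0, 1.0) * 255).astype(np.uint8) for f in preds]

os.makedirs("fig_mmnist_vis", exist_ok=True)
imageio.mimsave(os.path.join("fig_mmnist_vis", "pred.gif"), frames)
```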
Traffic-Visualization
We provide visualization figures of various traffic prediction methods on various benchmarks. You can plot your own visualizations from tested results (e.g., `work_dirs/exp_name/saved`) with `vis_video.py`. Note that `--vis_dirs` denotes visualizing all experimental folders under the given path, and `--vis_channel` selects the channel to visualize. For example, plot the first channel of TaxiBJ with the script:
python tools/visualizations/vis_video.py -d taxibj -w work_dirs/exp_name --vis_channel 0 --index 0 --save_dirs fig_taxibj_vis
- We provide GIF visualizations of experiments in configs/taxibj for TaxiBJ (32x32 resolution).
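As a rough illustration of what `--vis_channel` does, the sketch below loads saved predictions and plots a single channel per predicted frame with matplotlib; the file name and the (T, C, H, W) array shape are assumptions, not the exact format written by OpenSTL.

```python
# Sketch of channel selection for visualization (the idea behind --vis_channel).
# The file name and the (T, C, H, W) array shape are assumptions for illustration.
import os
import numpy as np
import matplotlib.pyplot as plt

preds = np.load("work_dirs/exp_name/saved/preds.npy")  # assumed shape: (T, C, H, W)
channel = 0                                            # e.g., the first TaxiBJ traffic channel

fig, axes = plt.subplots(1, preds.shape[0], figsize=(3 * preds.shape[0], 3))
for t, ax in enumerate(np.atleast_1d(axes)):
    ax.imshow(preds[t, channel], cmap="viridis")
    ax.set_title(f"t+{t + 1}")
    ax.axis("off")

os.makedirs("fig_taxibj_vis", exist_ok=True)
fig.savefig(os.path.join("fig_taxibj_vis", "channel0.png"), bbox_inches="tight")
```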
V0.3.0-Weather-5-625-Weights
We provide temperature prediction benchmark results on the popular WeatherBench dataset (temperature prediction, t2m) using the $12\rightarrow 12$ frames prediction setting. Metrics (MSE, MAE, RMSE) of the best models are reported over three trials. Parameters (M), FLOPs (G), and V100 inference FPS are also reported for all methods. All methods are trained with the Adam optimizer and a Cosine Annealing scheduler (no warmup; minimum learning rate 1e-6).
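For reference, the reported error metrics can be approximated from saved predictions and targets as in the sketch below; the file names and the exact averaging convention are assumptions and should be checked against OpenSTL's own metric code.

```python
# Sketch of the reported error metrics (MSE, MAE, RMSE). File names and the
# averaging convention are assumptions, not OpenSTL's metric implementation.
import numpy as np

preds = np.load("work_dirs/exp_name/saved/preds.npy")  # assumed shape: (N, T, C, H, W)
trues = np.load("work_dirs/exp_name/saved/trues.npy")

diff = preds - trues
mse = np.mean(diff ** 2)
mae = np.mean(np.abs(diff))
rmse = np.sqrt(mse)
print(f"MSE={mse:.4f}  MAE={mae:.4f}  RMSE={rmse:.4f}")
```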
STL Benchmarks on Temperature (t2m)
| Method | Setting | Params | FLOPs | FPS | MSE | MAE | RMSE | Download |
|---|---|---|---|---|---|---|---|---|
| ConvLSTM | 50 epoch | 14.98M | 136G | 46 | 1.521 | 0.7949 | 1.233 | model \| log |
| E3D-LSTM | 50 epoch | 51.09M | 169G | 35 | 1.592 | 0.8059 | 1.262 | model \| log |
| PhyDNet | 50 epoch | 3.09M | 36.8G | 177 | 285.9 | 8.7370 | 16.91 | model \| log |
| PredRNN | 50 epoch | 23.57M | 278G | 22 | 1.331 | 0.7246 | 1.154 | model \| log |
| PredRNN++ | 50 epoch | 38.31M | 413G | 15 | 1.634 | 0.7883 | 1.278 | model \| log |
| MIM | 50 epoch | 37.75M | 109G | 126 | 1.784 | 0.8716 | 1.336 | model \| log |
| MAU | 50 epoch | 5.46M | 39.6G | 237 | 1.251 | 0.7036 | 1.119 | model \| log |
| PredRNNv2 | 50 epoch | 23.59M | 279G | 22 | 1.545 | 0.7986 | 1.243 | model \| log |
| IncepU (SimVPv1) | 50 epoch | 14.67M | 8.03G | 160 | 1.238 | 0.7037 | 1.113 | model \| log |
| gSTA (SimVPv2) | 50 epoch | 12.76M | 7.01G | 504 | 1.105 | 0.6567 | 1.051 | model \| log |
| ViT | 50 epoch | 12.41M | 7.99G | 432 | 1.146 | 0.6712 | 1.070 | model \| log |
| Swin Transformer | 50 epoch | 12.42M | 6.88G | 581 | 1.143 | 0.6735 | 1.069 | model \| log |
| Uniformer | 50 epoch | 12.02M | 7.45G | 465 | 1.204 | 0.6885 | 1.097 | model \| log |
| MLP-Mixer | 50 epoch | 11.10M | 5.92G | 713 | 1.255 | 0.7011 | 1.119 | model \| log |
| ConvMixer | 50 epoch | 1.13M | 0.95G | 1705 | 1.267 | 0.7073 | 1.126 | model \| log |
| Poolformer | 50 epoch | 9.98M | 5.61G | 722 | 1.156 | 0.6715 | 1.075 | model \| log |
| ConvNeXt | 50 epoch | 10.09M | 5.66G | 689 | 1.277 | 0.7220 | 1.130 | model \| log |
| VAN | 50 epoch | 12.15M | 6.70G | 523 | 1.150 | 0.6803 | 1.072 | model \| log |
| HorNet | 50 epoch | 12.42M | 6.84G | 517 | 1.201 | 0.6906 | 1.096 | model \| log |
| MogaNet | 50 epoch | 12.76M | 7.01G | 416 | 1.152 | 0.6665 | 1.073 | model \| log |
| TAU | 50 epoch | 12.22M | 6.70G | 511 | 1.162 | 0.6707 | 1.078 | model \| log |
STL Benchmarks on Temperature (r)
V0.3.0-TaxiBJ-Weights
We provide traffic benchmark results on the popular TaxiBJ dataset using the $4\rightarrow 4$ frames prediction setting. Metrics (MSE, MAE, SSIM, PSNR) of the best models are reported over three trials. Parameters (M), FLOPs (G), and V100 inference FPS are also reported for all methods. All methods are trained with the Adam optimizer and a Cosine Annealing scheduler (5-epoch warmup; minimum learning rate 1e-6) on a single GPU.
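SSIM and PSNR in the tables below can be approximated per frame with `scikit-image`, as in the sketch; the data range, file layout, and per-frame averaging are assumptions and may differ from the exact metric code used to produce these numbers.

```python
# Sketch of per-frame SSIM / PSNR with scikit-image. The data range, file
# layout, and averaging are assumptions and may differ from OpenSTL's metrics.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

preds = np.load("work_dirs/exp_name/saved/preds.npy")  # assumed: (N, T, C, H, W), values in [0, 1]
trues = np.load("work_dirs/exp_name/saved/trues.npy")

ssim_vals, psnr_vals = [], []
for pred_seq, true_seq in zip(preds, trues):
    for pred, true in zip(pred_seq, true_seq):          # iterate over predicted frames
        ssim_vals.append(structural_similarity(true, pred, channel_axis=0, data_range=1.0))
        psnr_vals.append(peak_signal_noise_ratio(true, pred, data_range=1.0))
print(f"SSIM={np.mean(ssim_vals):.4f}  PSNR={np.mean(psnr_vals):.2f}")
```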
STL Benchmarks on TaxiBJ
| Method | Setting | Params | FLOPs | FPS | MSE | MAE | SSIM | PSNR | Download |
|---|---|---|---|---|---|---|---|---|---|
| ConvLSTM-S | 50 epoch | 14.98M | 20.74G | 815 | 0.3358 | 15.32 | 0.9836 | 39.45 | model \| log |
| E3D-LSTM* | 50 epoch | 50.99M | 98.19G | 60 | 0.3427 | 14.98 | 0.9842 | 39.64 | model \| log |
| PhyDNet | 50 epoch | 3.09M | 5.60G | 982 | 0.3622 | 15.53 | 0.9828 | 39.46 | model \| log |
| PredNet | 50 epoch | 12.5M | 0.85G | 5031 | 0.3516 | 15.91 | 0.9828 | 39.29 | model \| log |
| PredRNN | 50 epoch | 23.66M | 42.40G | 416 | 0.3194 | 15.31 | 0.9838 | 39.51 | model \| log |
| MIM | 50 epoch | 37.86M | 64.10G | 275 | 0.3110 | 14.96 | 0.9847 | 39.65 | model \| log |
| MAU | 50 epoch | 4.41M | 6.02G | 540 | 0.3268 | 15.26 | 0.9834 | 39.52 | model \| log |
| PredRNN++ | 50 epoch | 38.40M | 62.95G | 301 | 0.3348 | 15.37 | 0.9834 | 39.47 | model \| log |
| PredRNN.V2 | 50 epoch | 23.67M | 42.63G | 378 | 0.3834 | 15.55 | 0.9826 | 39.49 | model \| log |
| DMVFN | 50 epoch | 3.54M | 0.057G | 6347 | 3.3954 | 45.52 | 0.8321 | 31.14 | model \| log |
| SimVP+IncepU | 50 epoch | 13.79M | 3.61G | 533 | 0.3282 | 15.45 | 0.9835 | 39.45 | model \| log |
| SimVP+gSTA-S | 50 epoch | 9.96M | 2.62G | 1217 | 0.3246 | 15.03 | 0.9844 | 39.71 | model \| log |
| TAU | 50 epoch | 9.55M | 2.49G | 1268 | 0.3108 | 14.93 | 0.9848 | 39.74 | model \| log |
Benchmark of MetaFormers on SimVP (MetaVP)
| MetaFormer | Setting | Params | FLOPs | FPS | MSE | MAE | SSIM | PSNR | Download |
|---|---|---|---|---|---|---|---|---|---|
| SimVP+IncepU | 50 epoch | 13.79M | 3.61G | 533 | 0.3282 | 15.45 | 0.9835 | 39.45 | model \| log |
| SimVP+gSTA-S | 50 epoch | 9.96M | 2.62G | 1217 | 0.3246 | 15.03 | 0.9844 | 39.71 | model \| log |
| ViT | 50 epoch | 9.66M | 2.80G | 1301 | 0.3171 | 15.15 | 0.9841 | 39.64 | model \| log |
| Swin Transformer | 50 epoch | 9.66M | 2.56G | 1506 | 0.3128 | 15.07 | 0.9847 | 39.65 | model \| log |
| Uniformer | 50 epoch | 9.52M | 2.71G | 1333 | 0.3268 | 15.16 | 0.9844 | 39.64 | model \| log |
| MLP-Mixer | 50 epoch | 8.24M | 2.18G | 1974 | 0.3206 | 15.37 | 0.9841 | 39.49 | model \| log |
| ConvMixer | 50 epoch | 0.84M | 0.23G | 4793 | 0.3634 | 15.63 | 0.9831 | 39.41 | model \| log |
| Poolformer | 50 epoch | 7.75M | 2.06G | 1827 | 0.3273 | 15.39 | 0.9840 | 39.46 | model \| log |
| ConvNeXt | 50 epoch | 7.84M | 2.08G | 1918 | 0.3106 | 14.90 | 0.9845 | 39.76 | model \| log |
| VAN | 50 epoch | 9.48M | 2.49G | 1273 | 0.3125 | 14.96 | 0.9848 | 39.72 | model \| log |
| HorNet | 50 epoch | 9.68M | 2.54G | 1350 | 0.3186 | 15.01 | 0.9843 | 39.66 | model \| log |
| MogaNet | 50 epoch | 9.96M | 2.61G | 1005 | 0.3114 | 15.06 | 0.9847 | 39.70 | model \| log |
| TAU | 50 epoch | 9.55M | 2.49G | 1268 | 0.3108 | 14.93 | 0.9848 | 39.74 | model \| log |
V0.3.0-MMNIST-Weights
We provide benchmark results on the popular Moving MNIST dataset using the $10\rightarrow 10$ frames prediction setting following PredRNN. Metrics (MSE, MAE, SSIM, PSNR) of the best models are reported over three trials. Parameters (M), FLOPs (G), and V100 inference FPS are also reported for all methods. All methods are trained with the Adam optimizer and a OneCycle scheduler on a single GPU; a plain-PyTorch sketch of this recipe is shown after the list below.
- For a fair comparison of different methods, we provide config files in configs/mmnist.
- We also benchmark popular MetaFormer architectures on SimVP with training lengths of 200 and 2000 epochs. We provide config files in configs/mmnist/simvp.
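In plain PyTorch terms, the "Adam + OneCycle" recipe described above looks roughly like the sketch below; the model, learning rate, and step counts are placeholders rather than values taken from OpenSTL's configs.

```python
# Plain-PyTorch sketch of the "Adam + OneCycle" recipe described above.
# The model, learning rate, and step counts are placeholders, not values
# taken from OpenSTL's configs (the benchmark trains for 200 or 2000 epochs).
import torch
import torch.nn as nn

model = nn.Conv2d(10, 10, kernel_size=3, padding=1)       # stand-in for a video prediction model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
steps_per_epoch, epochs = 10, 3                            # tiny demo values
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-3, epochs=epochs, steps_per_epoch=steps_per_epoch
)

for _ in range(epochs):
    for _ in range(steps_per_epoch):
        x = torch.randn(4, 10, 64, 64)                     # dummy batch: 10 frames at 64x64
        loss = nn.functional.mse_loss(model(x), x)         # dummy objective
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        scheduler.step()                                   # OneCycle steps once per iteration
```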
STL Benchmarks on MMNIST
| Method | Setting | Params | FLOPs | FPS | MSE | MAE | SSIM | PSNR | Download |
|---|---|---|---|---|---|---|---|---|---|
| ConvLSTM-S | 200 epoch | 15.0M | 56.8G | 113 | 29.80 | 90.64 | 0.9288 | 22.10 | model \| log |
| ConvLSTM-L | 200 epoch | 33.8M | 127.0G | 50 | 27.78 | 86.14 | 0.9343 | 22.44 | model \| log |
| PredNet | 200 epoch | 12.5M | 8.6G | 659 | 161.38 | 201.16 | 0.7783 | 14.33 | model \| log |
| PhyDNet | 200 epoch | 3.1M | 15.3G | 182 | 28.19 | 78.64 | 0.9374 | 22.62 | model \| log |
| PredRNN | 200 epoch | 23.8M | 116.0G | 54 | 23.97 | 72.82 | 0.9462 | 23.28 | model \| log |
| PredRNN++ | 200 epoch | 38.6M | 171.7G | 38 | 22.06 | 69.58 | 0.9509 | 23.65 | model \| log |
| MIM | 200 epoch | 38.0M | 179.2G | 37 | 22.55 | 69.97 | 0.9498 | 23.56 | model \| log |
| MAU | 200 epoch | 4.5M | 17.8G | 201 | 26.86 | 78.22 | 0.9398 | 22.76 | model \| log |
| E3D-LSTM | 200 epoch | 51.0M | 298.9G | 18 | 35.97 | 78.28 | 0.9320 | 21.11 | model \| log |
| CrevNet | 200 epoch | 5.0M | 270.7G | 10 | 30.15 | 86.28 | 0.9350 | | model \| log |
| PredRNN.V2 | 200 epoch | 23.9M | 116.6G | 52 | 24.13 | 73.73 | 0.9453 | 23.21 | model \| log |
| DMVFN | 200 epoch | 3.5M | 0.2G | 1145 | 123.67 | 179.96 | 0.8140 | 16.15 | model \| log |
| SimVP+IncepU | 200 epoch | 58.0M | 19.4G | 209 | 32.15 | 89.05 | 0.9268 | 37.97 | model \| log |
| SimVP+gSTA-S | 200 epoch | 46.8M | 16.5G | 282 | 26.69 | 77.19 | 0.9402 | 38.35 | model \| log |
| TAU | 200 epoch | 44.7M | 16.0G | 283 | 24.60 | 71.93 | 0.9454 | 23.19 | model \| log |
| ConvLSTM-S | 2000 epoch | 15.0M | 56.8G | 113 | 22.41 | 73.07 | 0.9480 | 23.54 | model \| log |
| PredNet | 2000 epoch | 12.5M | 8.6G | 659 | 31.85 | 90.01 | 0.9273 | 21.85 | model \| log |
| PhyDNet | 2000 epoch | 3.1M | 15.3G | 182 | 20.35 | 61.47 | 0.9559 | 24.21 | model \| log |
| PredRNN | 2000 epoch | 23.8M | 116.0G | 54 | 26.43 | 77.52 | 0.9411 | 22.90 | model \| log |
| PredRNN++ | 2000 epoch | 38.6M | 171.7G | 38 | 14.07 | 48.91 | 0.9698 | 26.37 | model \| log |
| MIM | 2000 epoch | 38.0M | 179.2G | 37 | 14.73 | 52.31 | 0.9678 | 25.99 | model \| log |
| MAU | 2000 epoch | 4.5M | 17.8G | 201 | 22.25 | 67.96 | 0.9511 | 23.68 | model \| log |
| E3D-LSTM | 2000 epoch | 51.0M | 298.9G | 18 | 24.07 | 77.49 | 0.9436 | 23.19 | model \| log |
| PredRNN.V2 | 2000 epoch | 23.9M | 116.6G | 52 | 17.26 | 57.22 | 0.9624 | 25.01 | model \| log |
| SimVP+IncepU | 2000 epoch | 58.0M | 19.4G | 209 | 21.15 | 64.15 | 0.9536 | 23.99 | model \| log |
| SimVP+gSTA-S | 2000 epoch | 46.8M | 16.5G | 282 | 15.05 | 49.80 | 0.9675 | 25.97 | model \| log |
| TAU | 2000 epoch | 44.7M | 16.0G | 283 | 15.69 | 51.46 | 0.9661 | 25.71 | model \| log |
Benchmark of MetaFormers Based on SimVP (MetaVP)
| MetaVP | Setting | Params | FLOPs | FPS | MSE | MAE | SSIM | PSNR | Download |
|---|---|---|---|---|---|---|---|---|---|
| IncepU (SimVPv1) | 200 epoch | 58.0... | | | | | | | |
V0.3.0-MMNIST-CIFAR-Weights
Similar to Moving MNIST, we further design an advanced version of the benchmark with complex backgrounds from CIFAR-10, i.e., the MMNIST-CIFAR benchmark, using the $10\rightarrow 10$ frames prediction setting following PredRNN. Metrics (MSE, MAE, SSIM, PSNR) of the best models are reported over three trials. Parameters (M), FLOPs (G), and V100 inference FPS are also reported for all methods. All methods are trained with the Adam optimizer and a OneCycle scheduler on a single GPU.
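Conceptually, each MMNIST-CIFAR frame overlays moving grayscale digits on a CIFAR-10 background; the toy sketch below illustrates that compositing idea with random arrays standing in for real data, and it is not the dataset-generation code used by OpenSTL.

```python
# Toy sketch of the MMNIST-CIFAR idea: overlay a grayscale digit frame on a
# CIFAR-10-style background. Random arrays stand in for real data; this is
# not the dataset-generation code used by OpenSTL.
import numpy as np

rng = np.random.default_rng(0)
digit_frame = rng.random((64, 64))        # stand-in for a rendered Moving MNIST frame in [0, 1]
background = rng.random((64, 64, 3))      # stand-in for a CIFAR-10 image upsampled to 64x64

alpha = digit_frame[..., None]            # use digit intensity as an alpha mask
composite = alpha * 1.0 + (1.0 - alpha) * background   # white digits over the background
print(composite.shape)                    # (64, 64, 3)
```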
STL Benchmarks on MMNIST-CIFAR
| Method | Setting | Params | FLOPs | FPS | MSE | MAE | SSIM | PSNR | Download |
|---|---|---|---|---|---|---|---|---|---|
| ConvLSTM-S | 200 epoch | 15.5M | 58.8G | 113 | 73.31 | 338.56 | 0.9204 | 23.09 | model \| log |
| ConvLSTM-L | 200 epoch | 34.4M | 130.0G | 50 | 62.86 | 291.05 | 0.9337 | 23.83 | model \| log |
| PredNet | 200 epoch | 12.5M | 8.6G | 945 | 286.70 | 514.14 | 0.8139 | 17.49 | model \| log |
| PhyDNet | 200 epoch | 3.1M | 15.3G | 182 | 142.54 | 700.37 | 0.8276 | 19.92 | model \| log |
| PredRNN | 200 epoch | 23.8M | 116.0G | 54 | 50.09 | 225.04 | 0.9499 | 24.90 | model \| log |
| PredRNN++ | 200 epoch | 38.6M | 171.7G | 38 | 44.19 | 198.27 | 0.9567 | 25.60 | model \| log |
| MIM | 200 epoch | 38.8M | 183.0G | 37 | 48.63 | 213.44 | 0.9521 | 25.08 | model \| log |
| MAU | 200 epoch | 4.5M | 17.8G | 201 | 58.84 | 255.76 | 0.9408 | 24.19 | model \| log |
| E3D-LSTM | 200 epoch | 52.8M | 306.0G | 18 | 80.79 | 214.86 | 0.9314 | 22.89 | model \| log |
| PredRNN.V2 | 200 epoch | 23.9M | 116.6G | 52 | 57.27 | 252.29 | 0.9419 | 24.24 | model \| log |
| DMVFN | 200 epoch | 3.6M | 0.2G | 960 | 298.73 | 606.92 | 0.7765 | 17.07 | model \| log |
| SimVP+IncepU | 200 epoch | 58.0M | 19.4G | 209 | 59.83 | 214.54 | 0.9414 | 24.15 | model \| log |
| SimVP+gSTA-S | 200 epoch | 46.8M | 16.5G | 282 | 51.13 | 185.13 | 0.9512 | 24.93 | model \| log |
| TAU | 200 epoch | 44.7M | 16.0G | 275 | 48.17 | 177.35 | 0.9539 | 25.21 | model \| log |
Benchmark of MetaFormers Based on SimVP (MetaVP)
| MetaFormer | Setting | Params | FLOPs | FPS | MSE | MAE | SSIM | PSNR | Download |
|---|---|---|---|---|---|---|---|---|---|
| IncepU (SimVPv1) | 200 epoch | 58.0M | 19.4G | 209 | 59.83 | 214.54 | 0.9414 | 24.15 | model \| log |
| gSTA (SimVPv2) | 200 epoch | 46.8M | 16.5G | 282 | 51.13 | 185.13 | 0.9512 | 24.93 | model \| log |
| ViT | 200 epoch | 46.1M | 16.9G | 290 | 64.94 | 234.01 | 0.9354 | 23.90 | model \| log |
| Swin Transformer | 200 epoch | 46.1M | 16.4G | 294 | 57.11 | 207.45 | 0.9443 | 24.34 | model \| log |
| Uniformer | 200 epoch | 44.8M | 16.5G | 296 | 56.96 | 207.51 | 0.9442 | 24.38 | model \| log |
| MLP-Mixer | 200 epoch | 38.2M | 14.7G | 334 | 57.03 | 206.46 | 0.9446 | 24.34 | model \| log |
| ConvMixer | 200 epoch | 3.9M | 5.5G | 658 | 59.29 | 219.76 | 0.9403 | 24.17 | model \| log |
| Poolformer | 200 epoch | 37.1M | 14.1G | 341 | 60.98 | 219.50 | 0.9399 | 24.16 | model \| log |
| ConvNeXt | 200 epoch | 37.3M | 14.1G | 344 | 51.39 | 187.17 | 0.9503 | 24.89 | model \| log |
| VAN | 200 epoch | 44.5M | 16.0G | 288 | 59.59 | 221.32 | 0.9398 | 25.20 | model \| log |
| HorNet | 200 epoch | 45.7M | 16.3G | 287 | 55.79 | 202.73 | 0.9456 | 24.49 | model \| log |
| MogaNe... | | | | | | | | | |
V0.3.0-MFMNIST-Weights
Similar to Moving MNIST, we also provide its Fashion-MNIST counterpart, i.e., the Moving FMNIST (MFMNIST) benchmark, using the $10\rightarrow 10$ frames prediction setting following PredRNN. Metrics (MSE, MAE, SSIM, PSNR) of the best models are reported over three trials. Parameters (M), FLOPs (G), and V100 inference FPS are also reported for all methods. All methods are trained with the Adam optimizer and a OneCycle scheduler on a single GPU.
- For a fair comparison of different methods, we provide config files in configs/mfmnist.
- We also benchmark popular MetaFormer architectures on SimVP with a training length of 200 epochs. We provide config files in configs/mfmnist/simvp.
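The $10\rightarrow 10$ setting simply means the first 10 frames of each 20-frame clip are the model input and the last 10 are the prediction target; the sketch below shows that split on a random tensor standing in for a real batch.

```python
# Sketch of the 10 -> 10 frame split used by this benchmark. A random tensor
# stands in for a real Moving FMNIST batch of shape (B, T, C, H, W).
import torch

clips = torch.rand(8, 20, 1, 64, 64)        # 8 clips of 20 frames, 1 channel, 64x64
inputs, targets = clips[:, :10], clips[:, 10:]
print(inputs.shape, targets.shape)          # torch.Size([8, 10, 1, 64, 64]) twice
```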
STL Benchmarks on MFMNIST
| Method | Setting | Params | FLOPs | FPS | MSE | MAE | SSIM | PSNR | Download |
|---|---|---|---|---|---|---|---|---|---|
| ConvLSTM-S | 200 epoch | 15.0M | 56.8G | 113 | 28.87 | 113.20 | 0.8793 | 22.07 | model \| log |
| ConvLSTM-L | 200 epoch | 33.8M | 127.0G | 50 | 25.51 | 104.85 | 0.8928 | 22.67 | model \| log |
| PredNet | 200 epoch | 12.5M | 8.6G | 659 | 185.94 | 318.30 | 0.6713 | 14.83 | model \| log |
| PhyDNet | 200 epoch | 3.1M | 15.3G | 182 | 34.75 | 125.66 | 0.8567 | 22.03 | model \| log |
| PredRNN | 200 epoch | 23.8M | 116.0G | 54 | 22.01 | 91.74 | 0.9091 | 23.42 | model \| log |
| PredRNN++ | 200 epoch | 38.6M | 171.7G | 38 | 21.71 | 91.97 | 0.9097 | 23.45 | model \| log |
| MIM | 200 epoch | 38.0M | 179.2G | 37 | 23.09 | 96.37 | 0.9043 | 23.13 | model \| log |
| MAU | 200 epoch | 4.5M | 17.8G | 201 | 26.56 | 104.39 | 0.8916 | 22.51 | model \| log |
| E3D-LSTM | 200 epoch | 51.0M | 298.9G | 18 | 35.35 | 110.09 | 0.8722 | 21.27 | model \| log |
| PredRNN.V2 | 200 epoch | 23.9M | 116.6G | 52 | 24.13 | 97.46 | 0.9004 | 22.96 | model \| log |
| DMVFN | 200 epoch | 3.5M | 0.2G | 1145 | 118.32 | 220.02 | 0.7572 | 16.76 | model \| log |
| SimVP+IncepU | 200 epoch | 58.0M | 19.4G | 209 | 30.77 | 113.94 | 0.8740 | 21.81 | model \| log |
| SimVP+gSTA-S | 200 epoch | 46.8M | 16.5G | 282 | 25.86 | 101.22 | 0.8933 | 22.61 | model \| log |
| TAU | 200 epoch | 44.7M | 16.0G | 283 | 24.24 | 96.72 | 0.8995 | 22.87 | model \| log |
Benchmark of MetaFormers Based on SimVP (MetaVP)
| MetaFormer | Setting | Params | FLOPs | FPS | MSE | MAE | SSIM | PSNR | Download |
|---|---|---|---|---|---|---|---|---|---|
| IncepU (SimVPv1) | 200 epoch | 58.0M | 19.4G | 209 | 30.77 | 113.94 | 0.8740 | 21.81 | model \| log |
| gSTA (SimVPv2) | 200 epoch | 46.8M | 16.5G | 282 | 25.86 | 101.22 | 0.8933 | 22.61 | model \| log |
| ViT | 200 epoch | 46.1M | 16.9G | 290 | 31.05 | 115.59 | 0.8712 | 21.83 | model \| log |
| Swin Transformer | 200 epoch | 46.1M | 16.4G | 294 | 28.66 | 108.93 | 0.8815 | 22.08 | model \| log |
| Uniformer | 200 epoch | 44.8M | 16.5G | 296 | 29.56 | 111.72 | 0.8779 | 21.97 | model \| log |
| MLP-Mixer | 200 epoch | 38.2M | 14.7G | 334 | 28.83 | 109.51 | 0.8803 | 22.01 | model \| log |
| ConvMixer | 200 epoch | 3.9M | 5.5G | 658 | 31.21 | 115.74 | 0.8709 | 21.71 | model \| log |
| Poolformer | 200 epoch | 37.1M | 14.1G | 341 | 30.02 | 113.07 | 0.8750 | 21.95 | model \| log |
| ConvNeXt | 200 epoch | 37.3M | 14.1G | 344 | 26.41 | 102.56 | 0.8908 | 22.49 | model \| log |
| VAN | 200 epoch | 44.5M | 16.0G | 288 | 31.39 | 116.28 | 0.8703 | 22.82 | model \| log |
| HorNet | 200 epoch | 45.7M | 16.3G | 287 | 29.19 | 110.17 | 0.8796 | 22.03 | model \| log |
| MogaNet | 200 epoch | 46.8M | 16.5G | 255 | 25.14 | 99.69 | 0.8960 | 22.73 | model \| log |
| TAU | 200 epoch | 44.7M | 16.0G | 283 | 24.24 | 96.72 | 0.8995 | 22.87 | [model](https://github.com/chengtan9907/OpenSTL/releases/download/m... |
V0.3.0-KTH20-Weights
We provide long-term prediction benchmark results on the KTH Action dataset using the $10\rightarrow 20$ frames prediction setting. Metrics (MSE, MAE, SSIM, PSNR, LPIPS) of the best models are reported over three trials. Parameters (M), FLOPs (G), and V100 inference FPS are also reported for all methods. The default training setup is 100 epochs with the Adam optimizer, a total batch size of 16, and a OneCycle scheduler on a single GPU or 4 GPUs; we report the GPU setup used for each method (also shown in the config).
- For a fair comparison of different methods, we provide config files in configs/kth. Notice that `4xbs4` indicates 4-GPU DDP training with a batch size of 4 on each GPU.
- We provide config files in configs/kth/simvp.
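LPIPS in the tables below is a learned perceptual distance; the sketch uses the `lpips` package with an AlexNet backbone on dummy tensors, and the backbone choice and preprocessing here are assumptions that may differ from the exact setup used for these numbers.

```python
# Sketch of an LPIPS evaluation with the `lpips` package (AlexNet backbone).
# Dummy tensors stand in for real frames; the backbone and preprocessing are
# assumptions and may differ from the setup used for the numbers below.
import torch
import lpips

loss_fn = lpips.LPIPS(net="alex")              # downloads pretrained weights on first use
pred = torch.rand(1, 3, 128, 128) * 2 - 1      # LPIPS expects inputs scaled to [-1, 1]
target = torch.rand(1, 3, 128, 128) * 2 - 1
with torch.no_grad():
    distance = loss_fn(pred, target)
print(float(distance))
```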
STL Benchmarks on KTH
| Method | GPUs | Params | FLOPs | FPS | MSE | MAE | SSIM | PSNR | LPIPS | Download |
|---|---|---|---|---|---|---|---|---|---|---|
| ConvLSTM | 1xbs16 | 14.9M | 1368.0G | 16 | 47.65 | 445.5 | 0.8977 | 26.99 | 0.26686 | model \| log |
| E3D-LSTM | 2xbs8 | 53.5M | 217.0G | 17 | 136.40 | 892.7 | 0.8153 | 21.78 | 0.48358 | model \| log |
| PredNet | 1xbs16 | 12.5M | 3.4G | 399 | 152.11 | 783.1 | 0.8094 | 22.45 | 0.32159 | model \| log |
| PhyDNet | 1xbs16 | 3.1M | 93.6G | 58 | 91.12 | 765.6 | 0.8322 | 23.41 | 0.50155 | model \| log |
| MAU | 1xbs16 | 20.1M | 399.0G | 8 | 51.02 | 471.2 | 0.8945 | 26.73 | 0.25442 | model \| log |
| MIM | 1xbs16 | 39.8M | 1099.0G | 17 | 40.73 | 380.8 | 0.9025 | 27.78 | 0.18808 | model \| log |
| PredRNN | 1xbs16 | 23.6M | 2800.0G | 7 | 41.07 | 380.6 | 0.9097 | 27.95 | 0.21892 | model \| log |
| PredRNN++ | 1xbs16 | 38.3M | 4162.0G | 5 | 39.84 | 370.4 | 0.9124 | 28.13 | 0.19871 | model \| log |
| PredRNN.V2 | 1xbs16 | 23.6M | 2815.0G | 7 | 39.57 | 368.8 | 0.9099 | 28.01 | 0.21478 | model \| log |
| DMVFN | 1xbs16 | 3.5M | 0.88G | 727 | 59.61 | 413.2 | 0.8976 | 26.65 | 0.12842 | model \| log |
| SimVP+IncepU | 2xbs8 | 12.2M | 62.8G | 77 | 41.11 | 397.1 | 0.9065 | 27.46 | 0.26496 | model \| log |
| SimVP+gSTA | 4xbs4 | 15.6M | 76.8G | 53 | 45.02 | 417.8 | 0.9049 | 27.04 | 0.25240 | model \| log |
| TAU | 4xbs4 | 15.0M | 73.8G | 55 | 45.32 | 421.7 | 0.9086 | 27.10 | 0.22856 | model \| log |
Benchmark of MetaFormers Based on SimVP (MetaVP)
| MetaFormer | GPUs | Params | FLOPs | FPS | MSE | MAE | SSIM | PSNR | LPIPS | Download |
|---|---|---|---|---|---|---|---|---|---|---|
| IncepU (SimVPv1) | 2xbs8 | 12.2M | 62.8G | 77 | 41.11 | 397.1 | 0.9065 | 27.46 | 0.26496 | model \| log |
| gSTA (SimVPv2) | 2xbs8 | 15.6M | 76.8G | 53 | 45.02 | 417.8 | 0.9049 | 27.04 | 0.25240 | model \| log |
| ViT | 2xbs8 | 12.7M | 112.0G | 28 | 56.57 | 459.3 | 0.8947 | 26.19 | 0.27494 | model \| log |
| Swin Transformer | 2xbs8 | 15.3M | 75.9G | 65 | 45.72 | 405.7 | 0.9039 | 27.01 | 0.25178 | model \| log |
| Uniformer | 2xbs8 | 11.8M | 78.3G | 43 | 44.71 | 404.6 | 0.9058 | 27.16 | 0.24174 | model \| log |
| MLP-Mixer | 2xbs8 | 20.3M | 66.6G | 34 | 57.74 | 517.4 | 0.8886 | 25.72 | 0.28799 | model \| log |
| ConvMixer | 2xbs8 | 1.5M | 18.3G | 175 | 47.31 | 446.1 | 0.8993 | 26.66 | 0.28149 | model \| log |
| Poolformer | 2xbs8 | 12.4M | 63.6G | 67 | 45.44 | 400.9 | 0.9065 | 27.22 | 0.24763 | model \| log |
| ConvNeXt | 2xbs8 | 12.5M | 63.9G | 72 | 45.48 | 428.3 | 0.9037 | 26.96 | 0.26253 | model \| log |
| VAN | 2xbs8 | 14.9M | 73.8G | 55 | 45.05 | 409.1 | 0.9074 | 27.07 | 0.23116 | model \| log |
| HorNet | 2xbs8 | 15.3M | 75.3G | 58 | 46.84 | 421.2 | 0.9005 | 26.80 | 0.26921 | model \| log |
| MogaNet | 2xbs8 | 15.6M | 76.7G | 48 | 42.98 | 418.7 | 0.9065 | 27.16 | 0.25146 | model \| log |
| TAU | 2xbs8 | 15.0M | 73.8G | 55 | 45.32 | 421.7 | 0.9086 | 27.10 | 0.22856 | model \| log |
V0.3.0-KITTICaltech-Weights
We provide benchmark results on the KittiCaltech Pedestrian dataset using the $10\rightarrow 1$ frame prediction setting following PredNet. Metrics (MSE, MAE, SSIM, PSNR, LPIPS) of the best models are reported over three trials. Parameters (M), FLOPs (G), and V100 inference FPS are also reported for all methods. The default training setup is 100 epochs with the Adam optimizer and a OneCycle scheduler on a single GPU, while some computationally expensive methods (denoted by *) use 4 GPUs.
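The Params and FLOPs columns can be reproduced for your own model roughly as below, counting parameters directly and FLOPs with `fvcore`; the toy model and input shape are placeholders, and the counting convention may not match the one used for these tables exactly.

```python
# Sketch of parameter and FLOP counting. The toy model and input shape are
# placeholders; the counting convention may differ from the tables below.
import torch
import torch.nn as nn
from fvcore.nn import FlopCountAnalysis

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 3, kernel_size=3, padding=1),
)
dummy_input = torch.randn(1, 3, 128, 160)      # KittiCaltech-like frame resolution

params_m = sum(p.numel() for p in model.parameters()) / 1e6
flops_g = FlopCountAnalysis(model, dummy_input).total() / 1e9
print(f"Params: {params_m:.2f}M  FLOPs: {flops_g:.2f}G")
```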
STL Benchmarks on KittiCaltech
| Method | Setting | Params | FLOPs | FPS | MSE | MAE | SSIM | PSNR | LPIPS | Download |
|---|---|---|---|---|---|---|---|---|---|---|
| ConvLSTM-S | 100 epoch | 15.0M | 595.0G | 33 | 139.6 | 1583.3 | 0.9345 | 27.46 | 0.08575 | model \| log |
| E3D-LSTM* | 100 epoch | 54.9M | 1004G | 10 | 200.6 | 1946.2 | 0.9047 | 25.45 | 0.12602 | model \| log |
| PredNet | 100 epoch | 12.5M | 42.8G | 94 | 159.8 | 1568.9 | 0.9286 | 27.21 | 0.11289 | model \| log |
| PhyDNet | 100 epoch | 3.1M | 40.4G | 117 | 312.2 | 2754.8 | 0.8615 | 23.26 | 0.32194 | model \| log |
| MAU | 100 epoch | 24.3M | 172.0G | 16 | 177.8 | 1800.4 | 0.9176 | 26.14 | 0.09673 | model \| log |
| MIM | 100 epoch | 49.2M | 1858G | 39 | 125.1 | 1464.0 | 0.9409 | 28.10 | 0.06353 | model \| log |
| PredRNN | 100 epoch | 23.7M | 1216G | 17 | 130.4 | 1525.5 | 0.9374 | 27.81 | 0.07395 | model \| log |
| PredRNN++ | 100 epoch | 38.5M | 1803G | 12 | 125.5 | 1453.2 | 0.9433 | 28.02 | 0.13210 | model \| log |
| PredRNN.V2 | 100 epoch | 23.8M | 1223G | 52 | 147.8 | 1610.5 | 0.9330 | 27.12 | 0.08920 | model \| log |
| DMVFN | 100 epoch | 3.6M | 1.2G | 557 | 183.9 | 1531.1 | 0.9314 | 26.95 | 0.04942 | model \| log |
| SimVP+IncepU | 100 epoch | 8.6M | 60.6G | 57 | 160.2 | 1690.8 | 0.9338 | 26.81 | 0.06755 | model \| log |
| SimVP+gSTA-S | 100 epoch | 15.6M | 96.3G | 40 | 129.7 | 1507.7 | 0.9454 | 27.89 | 0.05736 | model \| log |
| TAU | 100 epoch | 44.7M | 80.0G | 55 | 131.1 | 1507.8 | 0.9456 | 27.83 | 0.05494 | model \| log |
Benchmark of MetaFormers Based on SimVP (MetaVP)
| MetaFormer | Setting | Params | FLOPs | FPS | MSE | MAE | SSIM | PSNR | LPIPS | Download |
|---|---|---|---|---|---|---|---|---|---|---|
| IncepU (SimVPv1) | 100 epoch | 8.6M | 60.6G | 57 | 160.2 | 1690.8 | 0.9338 | 26.81 | 0.06755 | model \| log |
| gSTA (SimVPv2) | 100 epoch | 15.6M | 96.3G | 40 | 129.7 | 1507.7 | 0.9454 | 27.89 | 0.05736 | model \| log |
| ViT* | 100 epoch | 12.7M | 155.0G | 25 | 146.4 | 1615.8 | 0.9379 | 27.43 | 0.06659 | model \| log |
| Swin Transformer | 100 epoch | 15.3M | 95.2G | 49 | 155.2 | 1588.9 | 0.9299 | 27.25 | 0.08113 | model \| log |
| Uniformer* | 100 epoch | 11.8M | 104.0G | 28 | 135.9 | 1534.2 | 0.9393 | 27.66 | 0.06867 | model \| log |
| MLP-Mixer | 100 epoch | 22.2M | 83.5G | 60 | 207.9 | 1835.9 | 0.9133 | 26.29 | 0.07750 | model \| log |
| ConvMixer | 100 epoch | 1.5M | 23.1G | 129 | 174.7 | 1854.3 | 0.9232 | 26.23 | 0.07758 | model \| log |
| Poolformer | 100 epoch | 12.4M | 79.8G | 51 | 153.4 | 1613.5 | 0.9334 | 27.38 | 0.07000 | model \| log |
| ConvNeXt | 100 epoch | 12.5M | 80.2G | 54 | 146.8 | 1630.0 | 0.9336 | 27.19 | 0.06987 | model \| log |
| VAN | 100 epoch | 14.9M | 92.5G | 41 | 127.5 | 1476.5 | 0.9462 | 27.98 | 0.05500 | model \| log |
| HorNet | 100 epoch | 15.3M | 94.4G | 43 | 152.8 | 1637.9 | 0.9365 | 27.09 | 0.06004 | model \| log |
| MogaNet | 100 epoch | 15.6M | 96.2G | 36 | 131.4 | 1512.1 | 0.9442 | 27.79 | 0.05394 | model \| log |
| TAU | 100 epoch | 44.7M | 80.0G | 55 | 131.1 | 1507.... | | | | |