Skip to content
This repository has been archived by the owner on Jan 24, 2024. It is now read-only.

Commit

Permalink
fix conflicts.
Browse files Browse the repository at this point in the history
  • Loading branch information
pangge committed Jul 3, 2018
2 parents f7383ee + 275c28e commit 7856530
Show file tree
Hide file tree
Showing 8 changed files with 139 additions and 118 deletions.
27 changes: 27 additions & 0 deletions AUTHORS.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
| Github account | name |
|---|---|
| chenjiaoAngel | Jiao Chen |
| cyj1986 | Yujuan Cheng |
| feifei14119 | Fei Wang |
| jackyh | Chengjie He |
| Jayoprell | Xiaocheng Luo |
| jjsbear | Jingsong Ji |
| LittleMaer | Yi Zhuang |
| mengkai94 | Kai Meng |
| micytw | Michael Wu |
| pangge | Chaowen Cui |
| perchbird | Xiaokun Yu |
| PeterJkPeng | Junyi Peng |
| qq332982511 | Junjie Liu |
| Shixiaowei02 | Xiaowei Shi |
| sogalin | Soga Lin |
| throneclay | Shuai Zhang |
| vin-huang | Vin Huang |
| wgy0804 | Guoya Wang |
| xklnono | Kailu Xu |
| xyoungli | Xiaoyang Li |
| yanan1112 | Yanan Liu |
| yao-matrix | Weifeng Yao |
| zdcocnftcp10 | Dachuan Zhao |
| zhouhuan2009 | Huan Zhou |
| zoooooooyuan | Yuan Zu |
73 changes: 49 additions & 24 deletions benchmark/README_CPU.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,9 @@

This time, we only provide benchmark on CPU. In the near future, we will add benchmark on ARM and GPU.

> System: `CentOS 7 in Docker`, for benchmark between Anakin and Tensorflow
> System: `CentOS 6.3`, for benchmark between Anakin and Paddle
## Counterpart of anakin :

The counterpart of **`Anakin`** is `Tensorflow 1.8.0`, which installed by Anaconda 4.5.4, run by Python 3.6
Expand Down Expand Up @@ -202,55 +205,77 @@ We tested them on single-CPU with different thread numbers.
4 | 18074 | 118696
6 | 26607 | 102044

2. **`Anakin`** VS **`PaddlePaddle/Fluid\`**

2. **`Anakin`** VS **`PaddlePaddle/Fluid`**
We use private dataset and different QPS index in this benchmark.
### <span id = '1'>language model in E5-2650 v4 </span>

- Latency (`ms`) of one batch

ThreadNum | Fluid | Anakin
:---: | :---: | :---: |
1 | 42.09 | 1.90
2 | 42.14 | 2.16
6 | 42.15 | 4.21
10 | 42.14 | 9.26
12 | 42.34 | 11.17
1 | 42.7418 | 1.93589
2 | 42.7418 | 2.49537
6 | 42.7734 | 3.14332
10 | 43.0721 | 4.55329
12 | 42.8501 | 5.09893

- Throughput (`sentence/s`)

ThreadNum | Fluid | Anakin
:---: | :---: | :---: |
1 | 23 | 524
2 | 47 | 916
6 | 141 | 1402
10 | 236 | 1063
12 | 282 | 1044
1 | 23 | 504
2 | 46 | 762
6 | 134 | 1393
10 | 218 | 1556
12 | 260 | 1541

### <span id = '2'>Chinese_ner model in E5-2650 v4 </span>

- Latency (`ms`) of one batch

ThreadNum | Fluid | Anakin
:---: | :---: | :---: |
1 | 0.47 | 0.17
4 | 0.26 | 0.17
6 | 0.36 | 0.17
10 | 0.59 | 0.17
12 | 0.72 | 0.17
1 | 0.380475 | 0.17034
4 | 0.380475 | 0.171143
6 | 0.380475 | 0.172688
10 | 0.380475 | 0.173269
12 | 0.380475 | 0.17668

- Throughput (`sentence/s`)

ThreadNum | Fluid | Anakin
:---: | :---: | :---: |
1 | 7844 | 5822
4 | 7844 | 11377
6 | 7844 | 29725
10 | 7844 | 41238
12 | 7844 | 42790

### <span id = '3'>text_classfication model in E5-2650 v4 </span>

- Latency (`ms`) of one batch

ThreadNum | Fluid | Anakin
:---: | :---: | :---: |
1 | 1.48578 | 1.10088
4 | 1.54025 | 1.11258
6 | 1.68529 | 1.1257
10 | 1.9817 | 1.13267
12 | 2.21864 | 1.1429

- Throughput (`sentence/s`)

ThreadNum | Fluid | Anakin
:---: | :---: | :---: |
1 | 2129 | 5819
4 | 3866 | 11182
6 | 8095 | 30948
10 | 8250 | 44093
12 | 8112 | 47185
1 | 673 | 901
4 | 1289 | 1665
6 | 3458 | 4449
10 | 4875 | 6183
12 | 5265 | 6188

## How to run those Benchmark models?

> 1. You can just run `sh benchmark_tensorflow.sh` and `sh benchmark_anakin.sh`
> 2. Get the model of caffe or fluid, convert model to anakin model, use net_test_*** to test your model.
> 1. You can just run `sh benchmark_tensorflow.sh` and `sh benchmark_anakin.sh`
> 2. Get the model of caffe or fluid, convert model to anakin model, use net_test_*** to test your model.

110 changes: 55 additions & 55 deletions benchmark/README_GPU.md
Original file line number Diff line number Diff line change
Expand Up @@ -35,21 +35,21 @@ We tested them on single-GPU with single-thread.

BatchSize | TensorRT | Anakin
:---: | :---: | :---: |
1 | 8.8690 | 8.2815
2 | 15.5344 | 13.9116
4 | 26.6000 | 21.8747
8 | 49.8279 | 40.4076
32 | 188.6270 | 163.7660
1 | 8.85176 | 8.15362
2 | 15.6517 | 13.8716
4 | 26.5303 | 21.8478
8 | 48.2286 | 40.496
32 | 183.994 | 163.035

- GPU Memory Used (`MB`)

BatchSize | TensorRT | Anakin
:---: | :---: | :---: |
1 | 963 | 997
2 | 965 | 1039
4 | 991 | 1115
8 | 1067 | 1269
32 | 1715 | 2193
1 | 887 | 648
2 | 965 | 733
4 | 991 | 810
8 | 1067 | 911
32 | 1715 | 1325


### <span id = '2'>Yolo </span>
Expand All @@ -58,100 +58,100 @@ We tested them on single-GPU with single-thread.

BatchSize | TensorRT | Anakin
:---: | :---: | :---: |
1 | 16.4596| 15.2124
2 | 26.6347| 25.0442
4 | 43.3695| 43.5017
8 | 80.9139 | 80.9880
32 | 293.8080| 310.8810
1 | 16.4623| 15.3214
2 | 26.7082| 25.0305
4 | 43.2129| 43.4758
8 | 80.0053 | 80.7645
32 | 283.352| 311.152

- GPU Memory Used (`MB`)


BatchSize | TensorRT | Anakin
:---: | :---: | :---: |
1 | 1569 | 1775
2 | 1649 | 1815
4 | 1709 | 1887
8 | 1731 | 2031
32 | 2253 | 2907
1 | 1226 | 1192
2 | 1326 | 1269
4 | 1435 | 1356
8 | 1563 | 1434
32 | 2150 | 1633

### <span id = '3'> Resnet50 </span>

- Latency (`ms`) of different batch

BatchSize | TensorRT | Anakin
:---: | :---: | :---: |
1 | 4.2459 | 4.1061
2 | 6.2627 | 6.5159
4 | 10.1277 | 11.3327
8 | 17.8209 | 20.6680
32 | 65.8582 | 77.8858
1 | 4.26834 | 3.25853
2 | 6.2811 | 6.12156
4 | 10.1183 | 10.9219
8 | 18.1395 | 20.323
32 | 66.4728 | 83.9934

- GPU Memory Used (`MB`)

BatchSize | TensorRT | Anakin
:---: | :---: | :---: |
1 | 531 | 503
2 | 543 | 517
4 | 583 | 541
8 | 611 | 589
32 | 809 | 879
1 | 932 | 272
2 | 936 | 318
4 | 720 | 376
8 | 697 | 480
32 | 842 | 835

### <span id = '4'> Resnet101 </span>

- Latency (`ms`) of different batch

BatchSize | TensorRT | Anakin
:---: | :---: | :---: |
1 | 7.5562 | 7.0837
2 | 11.6023 | 11.4079
4 | 18.3650 | 20.0493
8 | 32.7632 | 36.0648
32 | 123.2550 | 135.4880
1 | 7.58234 | 5.66457
2 | 11.6014 | 10.9213
4 | 18.3298 | 19.3987
8 | 32.6523 | 37.5575
32 | 123.114 | 149.089

- GPU Memory Used (`MB)`

BatchSize | TensorRT | Anakin
:---: | :---: | :---: |
1 | 701 | 683
2 | 713 | 697
4 | 793 | 721
8 | 819 | 769
32 | 1043 | 1059
1 | 1020 | 420
2 | 961 | 467
4 | 943 | 503
8 | 885 | 606
32 | 1048 | 1077

### <span id = '5'> MobileNet V1 </span>

- Latency (`ms`) of different batch

BatchSize | TensorRT | Anakin
:---: | :---: | :---: |
1 | 45.5156 | 1.3947
2 | 46.5585 | 2.5483
4 | 48.4242 | 4.3404
8 | 52.7957 | 8.1513
32 | 83.2519 | 31.3178
1 | 45.2189 | 1.39566
2 | 46.4538 | 2.50698
4 | 47.8918 | 4.38727
8 | 52.3636 | 8.21416
32 | 83.0503 | 31.33

- GPU Memory Used (`MB`)

BatchSize | TensorRT | Anakin
:---: | :---: | :---: |
1 | 329 | 283
2 | 345 | 289
4 | 371 | 299
8 | 393 | 319
32 | 531 | 433
1 | 516 | 176
2 | 524 | 166
4 | 497 | 165
8 | 508 | 239
32 | 628 | 388

### <span id = '6'> MobileNet V2</span>

- Latency (`ms`) of different batch

BatchSize | TensorRT | Anakin
:---: | :---: | :---: |
1 | 65.6861 | 2.9842
2 | 66.6814 | 4.7472
4 | 69.7114 | 7.4163
8 | 76.1092 | 12.8779
32 | 124.9810 | 47.2142
1 | 65.4277 | 1.80542
2 | 66.2048 | 3.85568
4 | 68.8045 | 6.80921
8 | 75.64 | 12.6038
32 | 124.09 | 47.6079

- GPU Memory Used (`MB`)

Expand Down
4 changes: 4 additions & 0 deletions saber/funcs/impl/cuda/base/cuda_c/saber_scale.cu
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,10 @@ SaberStatus SaberScale<NV, AK_FLOAT, AK_FLOAT, AK_FLOAT, \
auto in_data = inputs[0]->data();
auto out_data = outputs[0]->mutable_data();
const int count = inputs[0]->valid_size();
if (inputs.size() > 1) {
_scale_dim = inputs[1]->valid_size();
_inner_dim = count / _scale_dim;
}
if (_scale_dim > 1 || inputs.size() > 1) {
auto scale_data = inputs.size() > 1 ? inputs[1]->data() : _weight.data();
auto bias_data = param.bias_term ? _bias.data() : NULL;
Expand Down
3 changes: 1 addition & 2 deletions test/framework/graph/graph_parser_from_model_test.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,7 @@
using namespace anakin;
using namespace anakin::graph;

//std::string model_path = "/home/chaowen/anakin_v2/model_v2/google_net/googlenet.anakin.bin";
std::string model_path = "/home/chaowen/anakin_v2/model_v2/yolo/yolo.anakin.bin";
std::string model_path = "/path/to/name.anakin.bin";


TEST(GraphTest, graph_load_model) {
Expand Down
2 changes: 1 addition & 1 deletion test/framework/net/net_exec_multi_thread_test.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ using Target_H = X86;
using Target = ARM;
using Target_H = ARM;
#endif
std::string model_path = "/home/chaowen/anakin_v2/model_v2/anakin-models/adu/anakin_models/diepsie_light_head/yolo_lane_v2.anakin.bin";
std::string model_path = "../benchmark/CNN/models/vgg16.anakin.bin";

#ifdef USE_CUDA
#if 0
Expand Down
36 changes: 1 addition & 35 deletions test/framework/net/net_exec_test.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -16,42 +16,8 @@ using Target_H = ARM;

//#define USE_DIEPSE

//std::string model_path = "/home/chaowen/anakin_v2/model_v2/anakin-models/adu/anakin_models/diepsie_light_head/diepsie_light_head.anakin.bin";

//std::string model_path = "/home/chaowen/anakin_v2/model_v2/anakin-models/adu/anakin_models/diepsie_light_head/diepsie_light_head_base.anakin.bin";


//std::string model_path = "/home/chaowen/anakin_v2/model_v2/anakin-models/adu/anakin_models/diepsie_light_head/densebox.anakin.bin";

//std::string model_path = "/home/chaowen/anakin_v2/model_v2/anakin-models/adu/anakin_models/diepsie_light_head/cnn_seg.anakin.bin";

//std::string model_path = "/home/chaowen/anakin_v2/model_v2/anakin-models/adu/anakin_models/diepsie_light_head/yolo_camera_detector.anakin.bin";

//std::string model_path = "/home/chaowen/anakin_v2/model_v2/anakin-models/adu/anakin_models/diepsie_light_head/yolo_lane_v2.anakin.bin";

// alignment of face
//std::string model_path = "/home/chaowen/anakin_v2/model_v2/anakin-models/adu/anakin_models/diepsie_light_head/net_deploy_stageI.anakin.bin";

//std::string model_path = "/home/chaowen/anakin_v2/model_v2/anakin-models/adu/anakin_models/diepsie_light_head/net_deploy_stageII.anakin.bin";

// residual 7 patch of face
//std::string model_path = "/home/chaowen/anakin_v2/model_v2/anakin-models/adu/anakin_models/diepsie_light_head/residual_net_7patch_3hc.anakin.bin";

// resnet 50
// std::string model_path = "/home/cuichaowen/anakin2/anakin2/benchmark/CNN/mobilenet_v2.anakin.bin";

// vgg16
// std::string model_path = "/home/cuichaowen/anakin2/anakin2/benchmark/CNN/models/vgg16.anakin.bin";

// resnet 101
// std::string model_path = "/home/cuichaowen/parsing/external_converter_v2/output/ResNet-101.anakin.bin";

// animal
// std::string model_path = "/home/cuichaowen/parsing/external_converter_v2/output/animal.anakin.bin";

//std::string model_path = "/home/cuichaowen/github_anakin/icode_model/anakin-models/vis/anakin-models/mainbody/mainbody.anakin2.bin";

std::string model_path = "/home/cuichaowen/github_anakin/icode_model/anakin-models/vis/anakin-models/MobileNetSSD/mobilenet-ssd_fluid.anakin2.bin";
std::string model_path = "../benchmark/CNN/models/vgg16.anakin.bin";

#ifdef USE_CUDA
#if 1
Expand Down
Loading

0 comments on commit 7856530

Please sign in to comment.