Releases · PaddlePaddle/Serving
Release v0.9.0
New Features
- Integrate Paddle 2.3 Inference: #1781
- C++ Serving asynchronous framework auto-batching: #1685
- C++ Serving adapted to JetPack 4.6: #1700
- C++ Serving asynchronous framework supports 2-D LoD padding: #1713
- Distributed inference for large models: #1753, #1783
- C++ Serving supports TensorRT dynamic shape: #1759
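The 2-D LoD padding feature above concerns batching variable-length inputs. As a rough illustration only (a pure-Python sketch with a hypothetical `pad_lod_batch` helper, not the actual C++ Serving implementation), variable-length sequences can be described by LoD offsets and padded to a uniform length:

```python
def pad_lod_batch(seqs, pad_value=0.0):
    """Pad variable-length sequences (a 1-level LoD batch) to the max length.

    Returns the padded 2-D batch plus the LoD offsets recording the
    original lengths, mirroring how a serving framework might batch
    variable-length inputs before inference.
    """
    lod = [0]
    for s in seqs:
        lod.append(lod[-1] + len(s))          # cumulative offsets
    max_len = max(len(s) for s in seqs)
    padded = [list(s) + [pad_value] * (max_len - len(s)) for s in seqs]
    return padded, lod

padded, lod = pad_lod_batch([[1, 2, 3], [4], [5, 6]])
# padded -> [[1, 2, 3], [4, 0.0, 0.0], [5, 6, 0.0]]; lod -> [0, 3, 4, 6]
```

The LoD offsets let the server recover each sequence's true length after padding.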
Enhancements
- Update the C++ Serving OCR deployment example: #1759
- Add automatic generation of TensorRT dynamic shapes in Python Pipeline: #1778
- Add a Python Pipeline low-precision deployment example: #1753
- Add offline wheel installation: #1792
- Upgrade the protobuf Response structure shared by the front end and back end: #1783
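Auto-generating TensorRT dynamic shapes boils down to recording the shape ranges seen in warm-up data. A minimal sketch of that idea (the function name and the choice of `opt = max` are assumptions, not Paddle Serving's actual logic):

```python
def collect_dynamic_shapes(sample_shapes):
    """Derive per-tensor min/opt/max shape ranges from observed sample
    shapes, in the spirit of auto-generated TensorRT dynamic-shape config.

    sample_shapes: iterable of (tensor_name, shape_tuple) pairs.
    """
    ranges = {}
    for name, shape in sample_shapes:
        if name not in ranges:
            ranges[name] = {"min": list(shape), "max": list(shape)}
        else:
            r = ranges[name]
            r["min"] = [min(a, b) for a, b in zip(r["min"], shape)]
            r["max"] = [max(a, b) for a, b in zip(r["max"], shape)]
    for r in ranges.values():
        # a conservative default: optimize for the largest observed shape
        r["opt"] = list(r["max"])
    return ranges
```

The resulting min/opt/max triples correspond to what a TensorRT optimization profile expects per input tensor.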
Documentation and Example Changes
- Add AIStudio OCR hands-on tutorial (home page)
- Add government Q&A solution (home page)
- Add intelligent Q&A solution (home page)
- Add semantic indexing solution (home page)
- Add PaddleNLP example: #1773
- Add doc/Install_Linux_Env_CN.md: #1788
- Add doc/Python_Pipeline/Pipeline_Int_CN.md: #1788
- Add doc/Python_Pipeline/Pipeline_Features_CN.md: #1788
- Add doc/Python_Pipeline/Pipeline_Optimize_CN.md: #1788
- Update README.md: #1788
- Update README_CN.md: #1788
- Update doc/C++_Serving/ABTest_CN.md: #1788
- Update doc/C++_Serving/Asynchronous_Framwork_CN.md: #1788
- Update doc/C++_Serving/Encryption_CN.md: #1788
- Update doc/C++_Serving/Hot_Loading_CN.md: #1788
- Update doc/C++_Serving/Inference_Protocols_CN.md: #1788
- Update doc/C++_Serving/Model_Ensemble_CN.md: #1788
- Update doc/C++_Serving/OP_CN.md: #1788
- Update doc/C++_Serving/Performance_Tuning_CN.md: #1788
- Update doc/C++_Serving/Request_Cache_CN.md: #1788
- Update doc/Compile_CN.md: #1788
- Update doc/Compile_EN.md: #1788
- Update doc/Docker_Images_CN.md: #1788
- Update doc/Docker_Images_EN.md: #1788
- Update doc/FAQ_CN.md: #1788
- Update doc/Install_CN.md: #1788
- Update doc/Install_EN.md: #1788
- Update doc/Java_SDK_CN.md: #1788
- Update doc/Java_SDK_EN.md: #1788
- Update doc/Latest_Packages_CN.md: #1788
- Update doc/Latest_Packages_EN.md: #1788
- Update doc/Model_Zoo_CN.md: #1788
- Update doc/Python_Pipeline/Pipeline_Benchmark_CN.md: #1788
- Update doc/Python_Pipeline/Pipeline_Design_CN.md: #1788
- Update doc/Python_Pipeline/Pipeline_Design_EN.md: #1788
- Update doc/Run_On_Kubernetes_CN.md: #1788
- Update doc/Save_CN.md: #1788
- Update doc/Serving_Auth_Docker_CN.md: #1788
- Update doc/Serving_Configure_CN.md: #1788
- Update doc/Serving_Configure_EN.md: #1788
Bug Fixes
Release v0.8.3
New Features
- Add a compilation environment check for C++ Serving and Pipeline Serving #1584
- C++ Serving supports customizing the log output path #1592
- Add dynamic shape configuration and examples for TensorRT #1590
- Add Prometheus monitoring for Python Pipeline Serving #1586
- Add Prometheus monitoring for C++ Serving #1568 #1576 #1577
- Support heterogeneous hardware, including x86 + DCU, ARM + Ascend310, and ARM + Ascend910 #1544
- Support Python 3.9
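The Prometheus monitoring added above ultimately exposes metrics in the Prometheus text exposition format. As a minimal sketch only (the `render_prometheus_metrics` helper is hypothetical, not part of Paddle Serving's API), a `/metrics` payload for counters looks like this:

```python
def render_prometheus_metrics(counters):
    """Render counters in the Prometheus text exposition format, the kind
    of payload a /metrics endpoint returns.

    counters: dict mapping metric name -> (help_text, value).
    """
    lines = []
    for name, (help_text, value) in sorted(counters.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"

print(render_prometheus_metrics({"requests_total": ("Total requests served.", 42)}))
```

A Prometheus server scrapes this plain-text output on a schedule; each metric carries its HELP and TYPE annotations.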
Performance Optimization
- C++ Serving adds request result caching; identical requests are answered directly from the cache #1585, #1588
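The idea behind the request result cache can be sketched in a few lines of pure Python (an illustrative analogue with an assumed hashing scheme, not the actual C++ Serving implementation):

```python
import hashlib
import json

class RequestCache:
    """A minimal request-result cache: identical requests are answered
    from the cache instead of re-running inference."""

    def __init__(self):
        self._store = {}
        self.hits = 0

    def _key(self, request):
        # Canonical JSON so semantically equal requests hash identically.
        return hashlib.sha256(
            json.dumps(request, sort_keys=True).encode()
        ).hexdigest()

    def get_or_compute(self, request, infer_fn):
        key = self._key(request)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        result = infer_fn(request)
        self._store[key] = result
        return result
```

A real serving cache would also bound its size and handle eviction, which this sketch omits.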
Enhancements
- A more convenient way to chain multiple models in C++ Serving #1546
- Upgrade Dockerfiles and add a CentOS Dockerfile #1618 #1594
- Add bf16 low-precision support to Pipeline Serving #1594 #1554
Documentation and Example Changes
- Add PP-ShiTu example #1572
- Add PaddleNLP example #1609
- Add environment check document #1643
- Add dynamic TensorRT usage document #1643
- Add heterogeneous hardware usage documents #1641, #1654
- Add request cache usage document #1641, #1588
Bug Fixes
Release v0.7.0
New Features
- Integrate Intel MKL-DNN accelerated inference #1264, #1266, #1277
- C++ Serving supports HTTP requests #1321
- C++ Serving supports gRPC and HTTP + Proto requests #1345
- Add a C++ Client SDK #1370
Performance Optimization
- C++ Serving optimizes the pybind data transfer method #1268, #1269
- C++ Serving adds GPU multi-stream and an asynchronous task queue, and removes redundant locking #1289
- The C++ Serving web server uses a connection pool and data compression #1348
- The C++ Serving framework adds asynchronous batch merging and supports variable-length LoD input #1366
- C++ Serving stages execute concurrently #1376
- C++ Serving logs the processing time of each stage #1390
Function Changes
- Rewrite the model saving scheme and naming rules while staying compatible with older versions #1354, #1358
- Support more data types: float64, int16, float16, uint16, uint8, int8, bool, complex64, complex128 #1338
- Rewrite the logic of mapping GPU ids to devices #1303
- Specify a fetch list to return partial inference results #1359
- Support setting the XPU ID #1436
- Graceful service shutdown #1470
- C++ Serving client pybind supports uint8 and int8 data #1378
- C++ Serving client pybind supports uint16 and int16 data #1420
- C++ Serving supports asynchronous parameter settings #1483
- Python Pipeline adds a While OP control loop #1338
- Python Pipelines can interact with each other via gRPC #1358
- Python Pipeline supports Proto-structured Tensor data interaction #1369, #1384
- Python Pipeline fetches only the result of the fastest preceding OP #1380
- Python Pipeline supports LoD-type input #1472
- The Cube service adds a Python HTTP request sample #1399
- The Cube service adds a tool for reading RecordFiles #1336
- Optimize online deployment of Cube-server and Cube-transfer #1337
- Remove multi-lang related code #1321
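The "fetch list" change above lets a client ask for only some of a model's outputs. A minimal sketch of the filtering a server might do (the `select_fetch` helper is hypothetical, not Paddle Serving's API):

```python
def select_fetch(results, fetch_list=None):
    """Return only the requested output tensors, as when a client passes a
    fetch list to receive partial inference results.

    results: dict mapping output variable name -> tensor data.
    fetch_list: names to keep; None/empty means return everything.
    """
    if not fetch_list:
        return dict(results)
    # Silently skip unknown names; a real server might raise instead.
    return {name: results[name] for name in fetch_list if name in results}

outputs = {"prob": [0.9, 0.1], "feature": [1.0, 2.0], "logits": [3.0]}
partial = select_fetch(outputs, ["prob"])  # only "prob" is returned
```

Returning a subset reduces serialization and network cost when clients only need a few outputs.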
Documentation and Example Changes
- Modify the doc directory structure and add subdirectories #1473, #1475
- Move Serving/python/examples to Serving/examples and redesign the directory layout #1487
- Rename doc files #1487
- Add C++ Serving benchmark #1176
- Add a PaddleClas/DarkNet encrypted model deployment example #1352
- Add Model Zoo doc #1492
- Add Install doc #1473
- Add Quick Start doc #1473
- Add Serving Configure doc #1495
- Add C++_Serving/Inference_Protocols_CN.md #1500
- Add C++_Serving/Introduction_CN.md #1497
- Add C++_Serving/Performance_Tuning_CN.md #1497
- Add Python_Pipeline/Performance_Tuning_CN.md #1503
- Update Java SDK doc #1357
- Update Compile doc #1502
- Update Readme doc #1473
- Update Latest_Package_CN.md #1513
- Update Run_On_Kubernetes_CN.md #1520
Bug Fixes
- Fix a memory pool usage problem #1283
- Fix incorrect locking in multi-threaded code #1289
- Fix a failure to load the second model in C++ Serving multi-model combination scenarios #1294
- Fix an out-of-bounds error on large request payloads #1308
- Fix inaccurate prediction results of the Detection model #1413
- Fix an incorrect use_calib setting #1414
- Fix incorrect results in the C++ OCR example #1415
- Fix a core dump in parallel inference #1417
Release v0.6.0
Paddle Serving v0.6.0 Release note:
- New Features:
- Feature Improvements:
- Merge paddle_serving_server and paddle_serving_server_gpu into a unified paddle_serving_server, #1082
- Pipeline supports mini-batch inference, #1186
- Pipeline supports log file rotation, #1238
- Pipeline optimizes data passed to eval for processing and adds channel tracking logs, #1209
- C++ Serving refactors the inference library invocation method, #1080
- C++ Serving supports linear combination of multiple models, #1124
- C++ Serving interface adds direct String-type input, #1124
- C++ Serving resource management and optimization, #1143
- C++ Serving performance optimization: replace for-loop copies with memcpy, #1124
- C++ Serving adds a GDB compilation option, #1124
- Add a benchmark script and update GPU benchmark data, #1197, #1175
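Mini-batch inference in a pipeline amounts to grouping queued requests so one model call serves several of them. A toy sketch of that grouping (the `make_mini_batches` helper is illustrative, not Pipeline's scheduler):

```python
def make_mini_batches(requests, batch_size):
    """Group incoming requests into mini-batches so a single inference
    call can serve several requests at once.

    Returns a list of batches; the last batch may be smaller.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    return [requests[i:i + batch_size]
            for i in range(0, len(requests), batch_size)]

batches = make_mini_batches(list(range(7)), 3)
# -> [[0, 1, 2], [3, 4, 5], [6]]
```

In a real pipeline the batcher also waits up to a timeout so latency stays bounded when traffic is light.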
- Document Updates:
- Add doc/PADDLE_SERVING_ON_KUBERNETES.md
- Add doc/LOD.md
- Add doc/LOD_CN.md
- Add doc/PROCESS_DATA.md
- Modify doc/PIPELINE_SERVING.md
- Modify doc/PIPELINE_SERVING_CN.md
- Modify doc/CREATING.md
- Modify doc/SAVE.md
- Modify doc/SAVE_CN.md
- Modify doc/TENSOR_RT.md
- Modify doc/TENSOR_RT_CN.md
- Modify doc/MULTI_SERVICE_ON_ONE_GPU_CN.md
- Modify doc/ENCRYPTION.md
- Modify doc/ENCRYPTION_CN.md
- Modify doc/DESIGN_DOC.md
- Modify doc/DESIGN_DOC_CN.md
- Modify doc/DOCKER_IMAGES.md
- Modify doc/DOCKER_IMAGES_CN.md
- Modify doc/LATEST_PACKAGES.md
- Modify doc/COMPILE.md
- Modify doc/COMPILE_CN.md
- Modify doc/BERT_10_MINS.md
- Modify doc/BERT_10_MINS_CN.md
- Modify doc/BAIDU_KUNLUN_XPU_SERVING.md
- Modify doc/BAIDU_KUNLUN_XPU_SERVING_CN.md
- Modify README.md
- Modify README_CN.md
- Demo Updates:
- Add python/examples/low_precision/resnet50
- Add python/examples/xpu/bert
- Add python/examples/xpu/ernie
- Add python/examples/xpu/vgg19
- Add python/examples/pipeline/PaddleDetection/faster_rcnn
- Add python/examples/pipeline/PaddleDetection/ppyolo_mbv3
- Add python/examples/pipeline/PaddleDetection/yolov3
- Add python/examples/pipeline/PaddleClas/DarkNet53
- Add python/examples/pipeline/PaddleClas/HRNet_W18_C
- Add python/examples/pipeline/PaddleClas/MobileNetV1
- Add python/examples/pipeline/PaddleClas/MobileNetV2
- Add python/examples/pipeline/PaddleClas/MobileNetV3_large_x1_0
- Add python/examples/pipeline/PaddleClas/ResNeXt101_vd_64x4d
- Add python/examples/pipeline/PaddleClas/ResNet50_vd
- Add python/examples/pipeline/PaddleClas/ResNet50_vd_FPGM
- Add python/examples/pipeline/PaddleClas/ResNet50_vd_KL
- Add python/examples/pipeline/PaddleClas/ResNet50_vd_PACT
- Add python/examples/pipeline/PaddleClas/ResNet_V2_50
- Add python/examples/pipeline/PaddleClas/ShuffleNetV2_x1_0
- Add python/examples/pipeline/bert
- Add python/examples/ocr/ocr_cpp_client.py
- Modify python/examples/bert [benchmark]
- Modify python/examples/pipeline/ocr [benchmark]
- Docker Updates:
- Add runtime dockers (CPU, CUDA10.1, CUDA10.2, CUDA11) (Py36, Py37, Py38)
- Add CUDA 11 develop level docker images
- Add kubernetes demo images
- Bug Fixes:
- Fixed irregular code naming; unified the model parameter naming in infer.h and paddle_engine.h. #1136
- Fixed parts of the C++ framework being bypassed due to the Paddle Inference 2.0 adaptation. #1124
- Fixed a json.load exception under Python 3.5. #1124
- Fixed abnormal prediction results caused by the missing feed_var 'im_shape' in the ssd_vgg16_300_240e_voc example. #1180
- Fixed multiple gRPC errors caused by model path changes. #1147
- Fixed a C++ log printing logic bug in the read OP. #1154
- Fixed a missing thread parameter in WebService and unified the template name in infer.h and paddle_engine.h. #1136
- Fixed compile errors introduced by golang imports. #1101
- Fixed Java gRPC bugs. #1215
Release v0.5.0
- New Features
- Support Paddle 2.0 APIs
- Support converting dynamic graph models to Serving models
- Add a Java pipeline client
- Support model encryption and decryption
- Automatically adapt input shapes for the web server
- Support prediction on XPU and ARM
- Improvements
- Add more Nvidia TensorRT demos
- Add more Docker images on Ubuntu
- Support Python 3.8
- Support batch predict on pipeline serving
- Documents
- Add doc/BAIDU_KUNLUN_XPU_SERVING.md
- Add doc/BAIDU_KUNLUN_XPU_SERVING_CN.md
- Add doc/ENCRYPTION.md
- Add doc/ENCRYPTION_CN.md
- Modify README.md
- Modify README_CN.md
- Modify doc/COMPILE_CN.md
- Modify doc/DOCKER_IMAGES_CN.md
- Modify doc/LATEST_PACKAGES.md
- Modify doc/RUN_IN_DOCKER_CN.md
- Modify doc/SAVE_CN.md
- Modify doc/ABTEST_IN_PADDLE_SERVING.md
- Modify doc/COMPILE.md
- Modify doc/DESIGN_DOC_CN.md
- Modify doc/DESIGN_DOC.md
- Modify doc/GRPC_IMPL_CN.md
- Modify doc/JAVA_SDK.md
- Modify doc/JAVA_SDK_CN.md
- Delete doc/INFERENCE_TO_SERVING_CN.md
- Delete doc/TRAIN_TO_SERVICE.md
- Delete doc/TRAIN_TO_SERVICE_CN.md
- Demos
- Add examples/xpu/fit_a_line_xpu
- Add examples/xpu/resnet_v2_50_xpu
- Add examples/detection/faster_rcnn_r50_fpn_1x_coco
- Add examples/detection/ppyolo_r50vd_dcn_1x_coco
- Add examples/detection/ttfnet_darknet53_1x_coco
- Add examples/detection/yolov3_darknet53_270e_coco
- Add examples/encryption
- Modify examples/bert
- Modify examples/criteo_ctr
- Modify examples/fit_a_line
- Modify examples/grpc_impl_example
- Modify examples/imdb
- Modify examples/ocr
- Modify examples/pipeline/imagenet
- Modify examples/pipeline/imdb_model_ensemble
- Modify examples/pipeline/ocr
- Dockers
- Add Docker: CPU on Ubuntu 16 (GCC 8.2)
- Add Docker: CUDA 9.0 on Ubuntu 16 (GCC 4.8.2)
- Add Docker: CUDA 10.0 on Ubuntu 16 (GCC 4.8.2)
- Add Docker: CUDA 10.1 on Ubuntu 16 (GCC 8.2)
- Add Docker: CUDA 10.2 on Ubuntu 16 (GCC 8.2)
- Add Docker: CUDA 11.0 on Ubuntu 18 (GCC 8.2)
- Add Docker: ARM CPU on CentOS 8 (GCC 7.3)
- Bug Fixes
- Fixed an exception in batched gRPC requests
- Fixed an exception in pipeline batch queries
- Fixed wrong results when predicting YOLOv4 models with the Java client
- Fixed inconsistent codecs between Python 2.7 and Python 3.x
Release v0.4.0
- New Features
- Support Java Client
- Support TensorRT; add a Docker image for CUDA 10.1 and TensorRT 6
- Modify the LocalPredictor interface to align with the RPC interface usage
- Add Pipeline Serving DAG deployment
- Support Windows 10 (web service and local predictor only)
- Add a built-in Serving model converter
- Compatibility Improvements
- Release a CUDA 10.1 version of paddle_serving_server_gpu
- Release a Python 3.5 version of paddle_serving_client
- Remove serving-client-app circular dependencies
- Modify versions of dependencies
- Support LoD Tensors and replace the list type for batch input with a single numpy array
- Framework Improvements
- Pipeline DAG supports multi-GPU
- Lower the RPC thread restriction to 1
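A pipeline DAG executes each OP once all of its upstream OPs have produced results. A toy dependency-driven scheduler makes the idea concrete (an illustrative sketch with hypothetical names, not the actual Pipeline Serving scheduler):

```python
def run_dag(ops, deps, inputs):
    """Execute a tiny pipeline DAG: each op runs once all of its
    dependencies have produced results.

    ops:    dict mapping op name -> callable taking one arg per dependency.
    deps:   dict mapping op name -> list of dependency names.
    inputs: dict of initial data (e.g. the raw request).
    """
    results = dict(inputs)
    pending = {name: list(ds) for name, ds in deps.items()}
    while pending:
        ready = [name for name, ds in pending.items()
                 if all(d in results for d in ds)]
        if not ready:
            raise ValueError("cycle or missing dependency in DAG")
        for name in ready:
            args = [results[d] for d in pending.pop(name)]
            results[name] = ops[name](*args)
    return results

# A two-stage OCR-style chain: detection feeds recognition.
ops = {"det": lambda img: img + "-det", "rec": lambda d: d + "-rec"}
deps = {"det": ["image"], "rec": ["det"]}
out = run_dag(ops, deps, {"image": "img"})  # out["rec"] == "img-det-rec"
```

Independent branches of the DAG could run concurrently; this sketch runs them sequentially for clarity.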
- Documents
- Modify "COMPILE"
- Add "WINDOWS_TUTORIAL"
- Add "PIPELINE_SERVING"
- Modify "BERT_10_MIN"
- New Demo
- pipeline demos
- Java Demo
- Bug Fixes
- Fix a subprocess CUDA ERROR3 bug
- Fix pip install dependencies
- Fix an import error on Windows
- Fix bugs in the web service
Release v0.3.2
- New Features
- Support Paddle v1.8.4
- Support the int64 data type
- Add mem_optim and ir_optim APIs for WebService
- Add preprocess and postprocess APIs for OCR in paddle_serving_app
- Add a Docker image for CUDA 10
- Compatibility Improvements
- Release a CUDA 10 version of paddle_serving_server_gpu
- Framework Improvements
- Optimize the error message in HTTP mode
- Reduce GPU server-side memory usage
- Documents
- Modify "How to optimize performance?", "Compile from source code", and "FAQ (Chinese)"
- New Demo
- yolov4
- ocr
- Bug Fixes
- Add a protobuf version requirement to avoid paddle_serving_client import errors caused by old versions, #728
- Fix compatibility issues for HTTP mode with Python 3
- Fix the ctr_with_cube demo
- Fix the CPU Docker image
- Fix compatibility issues for BlazeFacePostprocess with Python 3
Release v0.3.0
- New Features
- Add ir_optim and use_mkl (CPU version only) arguments
- Support custom DAGs for the prediction service
- HTTP service supports batch prediction
- HTTP service supports startup via uWSGI
- Support model file monitoring, remote pull, and hot loading
- Support A/B testing
- Add image preprocessing, Chinese word segmentation, and Chinese sentiment analysis preprocessing modules, plus image segmentation and image detection postprocessing modules, to paddle-serving-app
- Add pre-trained model and sample code download to paddle-serving-app, with an integrated profiling function
- Release CentOS 6 Docker images for compiling Paddle Serving
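A/B testing for a serving endpoint usually means deterministically splitting traffic across model variants by weight. A minimal sketch of such a dispatcher (the `ab_route` helper and its hashing scheme are assumptions for illustration, not Paddle Serving's A/B mechanism):

```python
import hashlib

def ab_route(request_id, variants):
    """Deterministically route a request to a model variant by traffic
    weight, using a hash of the request id so the same caller always
    lands on the same variant.

    variants: list of (variant_name, integer_weight) pairs.
    """
    total = sum(w for _, w in variants)
    digest = hashlib.md5(str(request_id).encode()).hexdigest()
    bucket = int(digest, 16) % total      # stable bucket in [0, total)
    acc = 0
    for name, weight in variants:
        acc += weight
        if bucket < acc:
            return name
    return variants[-1][0]

variant = ab_route("user-42", [("model_a", 70), ("model_b", 30)])
```

Hash-based routing keeps each client pinned to one variant, which is what makes A/B comparisons meaningful.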
- Bug Fixes
- New documents
- Performance optimization
- Optimized the time consumed by input and output memory copies in numpy.array format. In the ResNet50 ImageNet classification task with a client-side single-concurrency batch size of 1, QPS is 100.38% higher than in v0.2.0.
- Compatibility optimization
- The client side removes the dependency on patchelf
- Released paddle-serving-client for Python 2.7, 3.6, and 3.7
- Server and client can be deployed on CentOS 6/7 and Ubuntu 16/18
- More demos
- Chinese sentiment analysis task: lac + senta
- Image segmentation task: deeplabv3, unet
- Image detection task: faster_rcnn
- Image classification task: mobilenet, resnet_v2_50
Release v0.2.0
Major Features and Improvements
- Support Paddle v1.7.1
- Improve ease of use: install Paddle Serving with pip or Docker, integrate seamlessly with Paddle training, start the server with one command, and develop web services.
- Provide two prediction service methods: RPC and HTTP.
- Add client API support for Python and Go.
- Release CV, NLP, and recommendation serving demos.
- Add Timeline tools for analyzing service performance.
- Performance improvement: with the Python RPC client, throughput improves by over 100% compared with the HTTP service in the previous release.
Thanks to our Contributors
This release contains contributions from many people at Baidu, as well as:
guru4elephant, wangjiawei04, MRXLT, barrierye
Release v0.0.3
- Support PaddlePaddle v1.6.1
- Distributed sparse parameter service: Cube. Cube is a high-performance distributed KV service designed for deep learning workloads, which has been tested and heavily used inside Baidu
- New module added as demo: BERT on GPU
- Elastic-CTR: built around PaddlePaddle, Paddle Serving, and Cube, Elastic-CTR is an end-to-end distributed training and serving solution. It is built entirely on Kubernetes, so users can easily deploy the solution on private clusters. See the Elastic CTR solution deployment guide
- Build optimization: Now Paddle inference libraries will be downloaded from PaddlePaddle official website, instead of built from source
- Build optimization: Dockerfiles are provided for building CPU and GPU Serving binaries. See the INSTALL instructions