FedML 0.8.3
FedML-AI-admin released this on 23 Apr, 09:28
What's Changed
New Features
- [CoreEngine/MLOps] Introduced the FedML OTA (Over-the-Air) upgrade mechanism for the training and serving platforms.
- [Documents] Added guidance for the OTA mechanism in the user guide document.
Bug Fixes
- [Serving] Fixed an issue where exceptions occurred when activating the model inference.
- [CoreEngine] Fixed an issue where aggregator exceptions occurred when running MPI scripts.
- [Documents] Fixed broken links in the user guide document.
- [CoreEngine] Added a check for an empty current job in the get_current_job_status API.
- [CoreEngine] Fixed a high CPU usage issue when the reload option was enabled in the client API.
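The empty-job check in get_current_job_status can be pictured roughly as follows. This is only a minimal sketch under stated assumptions, not FedML's actual implementation: the `Job` and `JobStore` classes, their fields, and the `"IDLE"` and `"RUNNING"` status strings are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    """Hypothetical job record; field names are assumptions."""
    job_id: str
    status: str


class JobStore:
    """Hypothetical stand-in for wherever the current job is tracked."""

    def __init__(self) -> None:
        # No job has been started yet.
        self.current_job: Optional[Job] = None

    def get_current_job_status(self) -> str:
        # Guard against the "no current job" case instead of
        # reading an attribute on None.
        if self.current_job is None:
            return "IDLE"  # assumed placeholder status
        return self.current_job.status


store = JobStore()
idle_status = store.get_current_job_status()

store.current_job = Job(job_id="job-1", status="RUNNING")
running_status = store.get_current_job_status()
```

Returning a placeholder status is one possible design; raising a dedicated exception or returning `None` would work equally well, depending on what callers expect.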
Enhancements
- [Serving] Improved data syncing between the Redis server and the SQLite database.
- [Serving] Each inference API request is now identified by a triple: endpoint name, model name, and model version.
- [DevOps] Updated the Jenkinsfile to automate building the model serving Docker image and deploying it to the K8s cluster.
- [Serving] The model monitor is now stopped when a model deployment is deactivated or deleted.
- [Serving] Added an endpoint status check when recovering on startup.
- [CoreEngine] Refactored the OTA upgrade process for improved robustness.
- [CoreEngine] Attached logs to the new Run ID when initiating a new run or deploying a model.
- [CoreEngine] Refined upgrade status messages for enhanced clarity.
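To illustrate the Serving change that identifies each inference API request by endpoint name, model name, and model version, here is a minimal sketch of a composite key. It assumes nothing about FedML's internals: `InferenceTarget`, `record_request`, and `request_counts` are hypothetical names, and the sample values are made up.

```python
from collections import Counter
from typing import NamedTuple


class InferenceTarget(NamedTuple):
    """Hypothetical composite key; field names are assumptions."""
    endpoint_name: str
    model_name: str
    model_version: str


# Per-target request counter, keyed by the full triple.
request_counts: Counter = Counter()


def record_request(endpoint_name: str, model_name: str,
                   model_version: str) -> InferenceTarget:
    """Build the triple for a request and count it."""
    target = InferenceTarget(endpoint_name, model_name, model_version)
    request_counts[target] += 1
    return target


# Two versions of the same model behind one endpoint stay distinct.
record_request("ep-a", "resnet", "v1")
record_request("ep-a", "resnet", "v1")
record_request("ep-a", "resnet", "v2")
```

Because the tuple is hashable, it can serve directly as a dictionary or cache key, so requests to different versions of the same model are never merged.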