FedML 0.8.3

FedML-AI-admin released this 23 Apr 09:28

What's Changed

New Features

[CoreEngine/MLOps] Introduced the FedML OTA (Over-the-Air) upgrade mechanism for the training platform and the serving platform.
[Documents] Added guidance for the OTA mechanism in the user guide document.

Bug Fixes

[Serving] Fixed an issue where exceptions occurred when activating model inference.
[CoreEngine] Fixed an issue where aggregator exceptions occurred when running MPI scripts.
[Documents] Fixed broken links in the user guide document.
[CoreEngine] Added a check for an empty current job in the get_current_job_status API.
[CoreEngine] Fixed a high CPU usage issue when the reload option was enabled in the client API.

Enhancements

[Serving] Improved data syncing between the Redis server and the SQLite database.
[Serving] Each inference API request is now identified by a triple: endpoint name, model name, and model version.
[DevOps] Updated the Jenkinsfile to automate building the model serving Docker image and deploying it to the K8s cluster.
[Serving] The model monitor is now stopped when a model deployment is deactivated or deleted.
[Serving] The endpoint status is now checked when recovering on startup.
[CoreEngine] Refactored the OTA upgrade process for improved robustness.
[CoreEngine] Logs are now attached to the new Run ID when initiating a new run or deploying a model.
[CoreEngine] Refined upgrade status messages for enhanced clarity.
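
The triple-based request identification noted under Enhancements can be pictured with a minimal sketch. This is an illustration only, not FedML's actual implementation: `EndpointKey`, `request_counts`, and `record_request` are hypothetical names, and the in-memory dictionary stands in for whatever store the serving platform really uses.

```python
from typing import Dict, NamedTuple


class EndpointKey(NamedTuple):
    """Hypothetical composite key; not FedML's actual data structure."""
    endpoint_name: str
    model_name: str
    model_version: str


# Hypothetical in-memory counter, keyed by the full triple.
request_counts: Dict[EndpointKey, int] = {}


def record_request(endpoint_name: str, model_name: str, model_version: str) -> int:
    """Count one inference request and return the running total for its triple."""
    key = EndpointKey(endpoint_name, model_name, model_version)
    request_counts[key] = request_counts.get(key, 0) + 1
    return request_counts[key]
```

Because all three fields take part in the key, two versions of the same model behind one endpoint are tracked separately rather than merged.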
