Merge pull request #5642 from FederatedAI/dev-2.1.1

Merge 2.1.1's update into Main Branch
FederatedAI · Jun 28, 2024 · d152756
2 parents 6731122 + 62bbd58
commit d152756

Showing 42 changed files with 1,544 additions and 60 deletions.
diff --git a/.readthedocs.yml b/.readthedocs.yml
@@ -0,0 +1,15 @@
+version: 2
+
+mkdocs:
+  # configuration: .readthedocs.mkdocs.yml
+  fail_on_warning: false
+
+formats: all
+
+build:
+  os: ubuntu-22.04
+  tools:
+    python: "3.8"
+python:
+  install:
+    - requirements: doc/mkdocs/requirements.txt
diff --git a/README.md b/README.md
@@ -35,8 +35,12 @@ Deploying FATE to multiple nodes to achieve scalability, reliability and managea
 - [Cluster deployment by CLI](./deploy/cluster-deploy): Using CLI to deploy a FATE cluster.
 
 ### Quick Start
-- [Training Demo With Installing FATE Only From Pypi](doc/2.0/fate/ml)
-- [Training Demo With Installing FATE AND FATE-Flow From Pypi](doc/2.0/fate/quick_start.md)
+- [Training Demo with Only FATE Installed from PyPI](doc/2.0/fate/ml)
+- [Training Demo with Both FATE and FATE-Flow Installed from PyPI](doc/2.0/fate/quick_start.md)
+
+### Advanced Use
+- [Train & Predict for Homo Mode](./doc/2.0/fate/homo_quick_start.md)
+- [Run ML Launchers](./doc/README.md#run-ml-modulessince-v200)
 
 ### More examples
 - [ML examples](examples/launchers)
@@ -53,6 +57,11 @@ Deploying FATE to multiple nodes to achieve scalability, reliability and managea
 - [RoadMap](./doc/images/roadmap.png)
 - [Paper & Conference](./doc/resources/README.md)
 
+### Develop Guide
+- [Dag Usage Guide](./doc/2.0/fate/dag.md)
+- [Component Develop Guide](./doc/develop_guide/component_guide.md) 
+
+
 ## Related Repositories (Projects)
 - [KubeFATE](https://github.com/FederatedAI/KubeFATE): An operational tool for the FATE platform using cloud native technologies such as containers and Kubernetes.
 - [FATE-Flow](https://github.com/FederatedAI/FATE-Flow): A multi-party secure task scheduling platform for federated learning pipeline.
@@ -65,6 +74,7 @@ Deploying FATE to multiple nodes to achieve scalability, reliability and managea
 - [FATE-Client](https://github.com/FederatedAI/FATE-Client): A tool to enable fast federated modeling tasks for FATE.
 - [FATE-Test](https://github.com/FederatedAI/FATE-Test): An automated testing tool for FATE, including tests and benchmark comparisons.
 - [FATE-LLM](https://github.com/FederatedAI/FATE-LLM/blob/main/README.md) : A framework to support federated learning for large language models(LLMs).
+
 ## Governance 
 
 [FATE-Community](https://github.com/FederatedAI/FATE-Community) contains all the documents about how the community members cooperate with each other. 

diff --git a/RELEASE.md b/RELEASE.md
@@ -1,3 +1,12 @@
+## Release 2.1.1
+### Major Features and Improvements
+> Component
+* Support server model saving in Homo-NN
+
+> ML
+* Aggregator supports aggregation of the torch.bfloat16 data type
+
+
 ## Release 2.1.0
 ### Major Features and Improvements
 > Arch
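
Note on the 2.1.1 ML item above: the release note does not say how the aggregator handles `torch.bfloat16`, so the following is only a minimal standard-library sketch of one common approach, not FATE's implementation. It assumes finite inputs, accumulates the weighted average in higher precision, and rounds the result back to bfloat16 (the top 16 bits of the float32 pattern). The names `to_bfloat16` and `aggregate` are hypothetical.

```python
import struct


def to_bfloat16(x: float) -> float:
    """Round a finite x to the nearest bfloat16 value (ties to even), as a float."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 keeps the top 16 bits of the float32 pattern; the bias makes the
    # dropped low 16 bits round to nearest, with ties going to the even value.
    bias = 0x7FFF + ((bits >> 16) & 1)
    rounded = (bits + bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", rounded))[0]


def aggregate(updates, weights):
    """Weighted average of equally sized parameter lists, stored back as bfloat16."""
    total = float(sum(weights))
    averaged = []
    for column in zip(*updates):
        # Accumulate in Python floats (double precision), then round once.
        acc = sum(w * v for w, v in zip(weights, column)) / total
        averaged.append(to_bfloat16(acc))
    return averaged


if __name__ == "__main__":
    # Hypothetical client updates and sample counts, for illustration only.
    client_a = [to_bfloat16(v) for v in (0.1, -1.5, 3.0)]
    client_b = [to_bfloat16(v) for v in (0.3, 0.5, 1.0)]
    print(aggregate([client_a, client_b], weights=[100, 300]))
```

Rounding only once, after the weighted sum, avoids compounding bfloat16's coarse 8-bit significand error across clients.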

diff --git a/deploy/standalone-deploy/README.md b/deploy/standalone-deploy/README.md
@@ -36,7 +36,7 @@ FATE-Flow provides federated job life cycle management, includes scheduling, dat
 
 ##### 2.2.1.1 Installing FATE, FATE-Flow, FATE-Client
 ```shell
-pip install fate_client[fate,fate_flow]==2.1.0
+pip install fate_client[fate,fate_flow]==2.1.1
 ```
 #### 2.2.1.2 Service Initialization
 ```shell
@@ -71,7 +71,7 @@ users can directly import fate and use built-in algorithms and secure protocols
 
 #### 2.2.2.1 Installing FATE
 ```shell
-pip install pyfate==2.1.0
+pip install pyfate==2.1.1
 ```
 #### 2.2.2.2 Using Guides
 Refer to [examples](../../doc/2.0/fate/ml)
@@ -90,13 +90,13 @@ Refer to [examples](../../doc/2.0/fate/ml)
 Set the necessary environment variables for deployment (note that environment variables set in this way are only valid for the current terminal session. If you open a new terminal session, such as logging in again or opening a new window, you will need to reset them).
 
 ```bash
-export version={FATE version number for this deployment, e.g., 2.1.0}
+export version={FATE version number for this deployment, e.g., 2.1.1}
 ```
 
 Example:
 
 ```bash
-export version=2.1.0
+export version=2.1.1
 ```
 
 ### 3.2 Pull Docker Images
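
Note on the version bumps above: after running the `pip install` commands, it can be useful to confirm that the pinned versions are what actually got installed. The sketch below uses only `importlib.metadata` from the standard library; the helper names `installed_version` and `check_pins` are hypothetical, and the distribution names are taken from the install commands in this README.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(expected):
    """Return {name: (expected, found)} for every package that does not match."""
    mismatches = {}
    for name, want in expected.items():
        found = installed_version(name)
        if found != want:
            mismatches[name] = (want, found)
    return mismatches


if __name__ == "__main__":
    # An empty dict means every pinned package matches.
    print(check_pins({"pyfate": "2.1.1", "fate_client": "2.1.1"}))
```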

diff --git a/deploy/standalone-deploy/README.zh.md b/deploy/standalone-deploy/README.zh.md
@@ -35,7 +35,7 @@ FATE-Flow provides federated job life cycle management, including scheduling, data management
 
 ##### 2.2.1.1 Installing FATE, FATE-Flow, FATE-Client
 ```shell
-pip install fate_client[fate,fate_flow]==2.1.0
+pip install fate_client[fate,fate_flow]==2.1.1
 ```
 
 #### 2.2.1.2 Service Initialization
@@ -70,7 +70,7 @@ FATE provides a variety of federated algorithms and secure protocols; after installing FATE, users can directly
 
 #### 2.2.1.1 Installing FATE
 ```shell
-pip install pyfate==2.1.0
+pip install pyfate==2.1.1
 ```
 
 #### 2.2.2.2 Using Guides
@@ -92,13 +92,13 @@ pip install pyfate==2.1.0
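 Set the necessary environment variables for deployment (note that environment variables set this way are only valid for the current terminal session. If you open a new terminal session, such as logging in again or opening a new window, you will need to set them again).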
 
 ```bash
-export version={FATE version number for this deployment, e.g., 2.1.0}
+export version={FATE version number for this deployment, e.g., 2.1.1}
 ```
 
 Example:
 
 ```bash
-export version=2.1.0
+export version=2.1.1
 ```
 
 ### 3.2 Pull Docker Images

diff --git a/doc/2.0/fate/components/hetero_nn.md b/doc/2.0/fate/components/hetero_nn.md
@@ -4,9 +4,7 @@ In FATE-2.0, we introduce our new Hetero-NN framework which allows you to quickl
 
 The architecture of the Hetero-NN framework is depicted in the figure below. In this structure, all submodels from guests and hosts are encapsulated within the HeteroNNModel, enabling independent forwards and backwards. Both guest and host trainers are developed based on the HuggingFace trainer, allowing for rapid configuration of heterogeneous federated learning tasks with your existing datasets and models. These tasks can be run independently, without the need for FATEFlow. The FATE-pipeline Hetero-NN components are built upon this foundational framework.
 
-<div align="center">
- <img src="../../images/hetero_nn.png" width="800" height="480" alt="Figure 2 (FedPass)">
-</div>
+![Figure 1 (HeteroNN)](../../images/hetero_nn.png)
 
 Besides the new framework, we also introduce two new privacy-preserving strategies for federated learning: SSHE and FedPass. These strategies can be configured in the aggregate layer configuration. For more information on these strategies, refer to the [SSHE](#sshe) and [FedPass](#fedpass) sections below.
 
@@ -16,24 +14,20 @@ SSHENN is a privacy-preserving strategy that uses homomorphic encryption and sec
 Secure Large-Scale Sparse Logistic Regression and Applications
 in Risk Control](https://arxiv.org/pdf/2008.08753.pdf).
 
-![Figure 1 (SSHE)](../../images/sshe.png)
+![Figure 2 (SSHE)](../../images/sshe.png)
 
 
 
 ## FedPass
 
 FedPass works by embedding private passports into a neural network to enhance privacy and obfuscation. It utilizes the DNN passport technique for adaptive obfuscation, which involves inserting a passport layer into the network. This layer adjusts the scale factor and bias term using model parameters and private passports, followed by an autoencoder and averaging. The picture below illustrates
 the process of FedPass.
-<div align="center">
- <img src="../../images/fedpass_1.png" alt="Figure 2 (FedPass)">
-</div>
 
+![Figure 3 (FedPass)](../../images/fedpass_1.png)
 
 In FATE-2.0, you can specify the FedPass strategy for the guest top model and the host bottom model. The picture below shows the architecture of FedPass when running a Hetero-NN task.
 
-<div align="center">
- <img src="../../images/fedpass_0.png" width="500" height="400" alt="Figure 2 (FedPass)">
-</div>
+![Figure 4 (FedPass)](../../images/fedpass_0.png)
 
 For more details on FedPass, please refer to the [paper](https://arxiv.org/pdf/2301.12623.pdf).
 

diff --git a/doc/2.0/fate/dag.md b/doc/2.0/fate/dag.md
@@ -90,7 +90,7 @@ dag:
  max_depth: 3
  num_trees: 2
  ...
-schema_version: 2.1.0
+schema_version: 2.1.1
 kind: fate
 ```
 
@@ -309,7 +309,7 @@ dag:
  max_depth: 3
  num_trees: 2
  ...
-schema_version: 2.1.0
+schema_version: 2.1.1
 ```
 
 - Step1: Change the job-level stage in DAG to "predict"
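
Note on the prediction steps in this guide: switching the job-level stage to "predict" and dropping unused tasks can be scripted once the DAG YAML has been loaded into a dictionary. The sketch below is a rough illustration under stated assumptions, not FATE-Flow's own tooling: the key names `stage`, `tasks`, `party_tasks`, and `dependent_tasks` are assumed from the guide's wording, and `to_predict_job` is a hypothetical helper. It only reports downstream tasks whose dependencies were removed; their inputs still need to be fixed by hand.

```python
import copy


def to_predict_job(job, drop_tasks):
    """Return (new_job, dangling): a predict-stage copy and tasks left with removed deps."""
    drop = set(drop_tasks)
    out = copy.deepcopy(job)  # never mutate the training configuration
    dag = out["dag"]
    dag["stage"] = "predict"  # assumed job-level stage key

    tasks = dag.get("tasks", {})
    for name in drop:
        tasks.pop(name, None)
    # Remove the same tasks from every party-specific section, if present.
    for party in dag.get("party_tasks", {}).values():
        for name in drop:
            party.get("tasks", {}).pop(name, None)

    # Remaining tasks that still reference a removed task need manual edits.
    dangling = sorted(
        name for name, spec in tasks.items()
        if drop & set(spec.get("dependent_tasks", []))
    )
    return out, dangling


if __name__ == "__main__":
    # Hypothetical training job, shaped loosely after the YAML in this guide.
    job = {
        "dag": {
            "stage": "train",
            "tasks": {
                "reader_0": {},
                "sbt_0": {"dependent_tasks": ["reader_0"]},
                "evaluation_0": {"dependent_tasks": ["sbt_0"]},
            },
        },
        "schema_version": "2.1.1",
        "kind": "fate",
    }
    predict_job, dangling = to_predict_job(job, ["evaluation_0"])
    print(predict_job["dag"]["stage"], sorted(predict_job["dag"]["tasks"]), dangling)
```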

diff --git a/doc/2.0/fate/dag.zh.md b/doc/2.0/fate/dag.zh.md
@@ -89,7 +89,7 @@ dag:
  max_depth: 3
  num_trees: 2
  ...
-schema_version: 2.1.0
+schema_version: 2.1.1
 kind: fate
 ```
 
@@ -302,7 +302,7 @@ dag:
  max_depth: 3
  num_trees: 2
  ...
-schema_version: 2.1.0
+schema_version: 2.1.1
 ```
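 - Step1: Change the job-level stage under dag to predict
 - Step2: Remove unused components from the tasks and party_tasks under dag. Note that removing a component may change the dependent and inputs of some downstream components, which must be updated accordingly.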
@@ -338,4 +338,4 @@ inputs:
  producer_task: sbt_0
 ```
 
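-After these modifications, the configuration can be used directly for prediction
+After these modifications, the configuration can be used directly for prediction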
diff --git a/doc/2.0/fate/homo_quick_start.md b/doc/2.0/fate/homo_quick_start.md
@@ -0,0 +1,144 @@
+## Quick Start with Homo NN
+
+1. Install `fate_client` with the extra packages `fate` and `fate_flow`
+
+```sh
+python -m pip install -U pip && python -m pip install fate_client[fate,fate_flow]==2.1.1
+```
+After the packages install successfully, initialize the fate_flow service and fate_client:
+
+```sh
+mkdir fate_workspace
+fate_flow init --ip 127.0.0.1 --port 9380 --home $(pwd)/fate_workspace
+pipeline init --ip 127.0.0.1 --port 9380
+
+fate_flow start
+fate_flow status # make sure fate_flow service is started
+```
+
+
+2. Download the example data
+
+```sh
+wget https://raw.githubusercontent.com/wiki/FederatedAI/FATE/example/data/breast_homo_guest.csv && \
+wget https://raw.githubusercontent.com/wiki/FederatedAI/FATE/example/data/breast_homo_host.csv
+```
+
+3. Transform the example data into a dataframe for use in FATE
+```python
+import os
+from fate_client.pipeline import FateFlowPipeline
+
+
+base_path = os.path.abspath(os.path.join(__file__, os.path.pardir))
+guest_data_path = os.path.join(base_path, "breast_homo_guest.csv")
+host_data_path = os.path.join(base_path, "breast_homo_host.csv")
+
+data_pipeline = FateFlowPipeline().set_parties(local="0")
+guest_meta = {
+    "delimiter": ",", "dtype": "float64", "label_type": "int64", "label_name": "y", "match_id_name": "id"
+}
+host_meta = {
+    "delimiter": ",", "dtype": "float64", "label_type": "int64", "label_name": "y", "match_id_name": "id"
+}
+data_pipeline.transform_local_file_to_dataframe(file=guest_data_path, namespace="experiment", name="breast_homo_guest",
+                                                meta=guest_meta, head=True, extend_sid=True)
+data_pipeline.transform_local_file_to_dataframe(file=host_data_path, namespace="experiment", name="breast_homo_host",
+                                                meta=host_meta, head=True, extend_sid=True)
+```
+4. Run the training example and save the pipeline
+
+```python
+from fate_client.pipeline.components.fate import (
+    Reader,
+    Evaluation
+)
+from fate_client.pipeline.components.fate.nn.torch import nn, optim
+from fate_client.pipeline.components.fate.nn.torch.base import Sequential
+from fate_client.pipeline.components.fate.homo_nn import HomoNN, get_config_of_default_runner
+from fate_client.pipeline.components.fate.nn.algo_params import TrainingArguments, FedAVGArguments
+from fate_client.pipeline import FateFlowPipeline
+
+# create pipeline for training, specify corresponding party info
+pipeline = FateFlowPipeline().set_parties(guest="9999", host="10000", arbiter="10000")
+
+# create reader task_desc
+reader_0 = Reader("reader_0", runtime_parties=dict(guest="9999", host="10000"))
+reader_0.guest.task_parameters(namespace="experiment", name="breast_homo_guest")
+reader_0.hosts[0].task_parameters(namespace="experiment", name="breast_homo_host")
+
+# create homo nn component_desc
+epochs = 5
+batch_size = 64
+in_feat = 30
+out_feat = 16
+lr = 0.01
+
+# define nn structure
+conf = get_config_of_default_runner(
+    algo='fedavg',
+    model=Sequential(
+        nn.Linear(in_feat, out_feat),
+        nn.ReLU(),
+        nn.Linear(out_feat, 1),
+        nn.Sigmoid()),
+    loss=nn.BCELoss(),
+    optimizer=optim.Adam(lr=lr),
+    training_args=TrainingArguments(num_train_epochs=epochs, per_device_train_batch_size=batch_size),
+    fed_args=FedAVGArguments(),
+    task_type='binary')
+
+homo_nn_0 = HomoNN("homo_nn_0", runner_conf=conf,
+                   train_data=reader_0.outputs["output_data"],
+                   validate_data=reader_0.outputs["output_data"])
+
+# create evaluation component_desc
+evaluation_0 = Evaluation(
+    'evaluation_0', runtime_parties=dict(guest="9999", host="10000"), metrics=["auc"], input_datas=[homo_nn_0.outputs["train_output_data"]])
+
+# add training task
+pipeline.add_tasks([reader_0, homo_nn_0, evaluation_0])
+
+# compile and train
+pipeline.compile()
+pipeline.fit()
+
+# print metric info
+print(pipeline.get_task_info("evaluation_0").get_output_metric())
+
+# save pipeline for later usage
+pipeline.dump_model("./pipeline.pkl")
+
+```
+
+5. Reload the trained pipeline and run prediction
+```python
+from fate_client.pipeline import FateFlowPipeline
+from fate_client.pipeline.components.fate import Reader
+
+# create pipeline for predicting
+predict_pipeline = FateFlowPipeline()
+
+# reload trained pipeline
+pipeline = FateFlowPipeline.load_model("./pipeline.pkl")
+
+# deploy task for inference
+pipeline.deploy([pipeline.homo_nn_0])
+
+# add input to deployed_pipeline
+deployed_pipeline = pipeline.get_deployed_pipeline()
+reader_1 = Reader("reader_1", runtime_parties=dict(guest="9999", host="10000"))
+reader_1.guest.task_parameters(namespace="experiment", name="breast_homo_guest")
+reader_1.hosts[0].task_parameters(namespace="experiment", name="breast_homo_host")
+deployed_pipeline.homo_nn_0.test_data = reader_1.outputs["output_data"]
+
+# add task to predict pipeline
+predict_pipeline.add_tasks([reader_1, deployed_pipeline])
+
+# compile and predict
+predict_pipeline.compile()
+predict_pipeline.predict()
+```
+
+6. More tutorials
+More pipeline API guides can be found in the [FATE-Client pipeline documentation](https://github.com/FederatedAI/FATE-Client/blob/main/doc/pipeline.md)
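
Note on the quick start above: the training step hard-codes `in_feat = 30` and the metadata names `id` and `y` as the match-id and label columns, so a quick header check on the downloaded CSV files can catch a mismatch before any job is submitted. The following is a minimal standard-library sketch; `feature_columns` and `read_header` are hypothetical helpers, and the assumption that the example files carry exactly 30 feature columns is inferred from the quick-start code rather than verified here.

```python
import csv
import os


def feature_columns(header, id_col="id", label_col="y"):
    """Return the header names that are neither the match id nor the label."""
    missing = [c for c in (id_col, label_col) if c not in header]
    if missing:
        raise ValueError(f"missing required column(s): {missing}")
    return [c for c in header if c not in (id_col, label_col)]


def read_header(path):
    """Read only the first row of a CSV file."""
    with open(path, newline="") as fh:
        return next(csv.reader(fh))


if __name__ == "__main__":
    for path in ("breast_homo_guest.csv", "breast_homo_host.csv"):
        if not os.path.exists(path):
            print(f"{path}: not found, skipping")
            continue
        feats = feature_columns(read_header(path))
        # The quick-start model assumes 30 input features (in_feat = 30).
        print(f"{path}: {len(feats)} feature columns")
```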