update lr examples for online demo

delta-mpc · Oct 21, 2022 · e4c84d7
1 parent 32e8dd6
commit e4c84d7
Showing 2 changed files with 12 additions and 2 deletions.
diff --git a/jupyter/examples/en-horizontal-logistic-regression-task.ipynb b/jupyter/examples/en-horizontal-logistic-regression-task.ipynb
@@ -50,6 +50,11 @@
     "There are several parts in the PPC Task that need to be programmed by the developer:\n",
     "\n",
     "* ***Task Config***: We set the basic task configuration in the ```super().__init__()``` method. The configuration includes the task name (```name```), the minimum client count (```min_clients```), the maximum client count (```max_clients```), the waiting timeout for calculation (```wait_timeout```), and the connection timeout for each step in the procedure (```connection_timeout```).\n",
+    "\n",
+    "  We can also run a zero-knowledge proof step after the task finishes, to verify the convergence of the result and the consistency of the data. To enable this step, set `enable_verify` to True in the `super().__init__()` method. The `verify_timeout` parameter controls the timeout of the zero-knowledge proof step. This step currently takes a long time, and the default value of `verify_timeout` is 300 seconds. If a timeout error occurs in the zero-knowledge proof step, set `verify_timeout` to a larger value.\n",
+    "\n",
+    "  ***The zero-knowledge proof step is disabled on the online demo system because of resource restrictions. Set `enable_verify` to False when running on the online demo system.***\n",
+    "\n",
     "* ***Dataset***: In the ```dataset``` method, you specify the dataset for the task. The return value is a dict whose keys are dataset names and whose values are instances of ```delta.dataset.DataFrame```; the keys must correspond to the parameters of the execute method. For a detailed explanation of the dataset format, please refer to [this document](https://docs.deltampc.com/network-deployment/prepare-data).\n",
     "* ***Preprocess***: In the ```preprocess``` method, you preprocess the dataset and return the x and y for the task. The input parameters should match the keys of the dict returned by the ```dataset``` method. The returned x and y can be ```pandas.DataFrame``` or ```numpy.ndarray```, and y should be a 1-D array of data labels.\n",
     "* ***Options***: This method is optional. In the ```options``` method, you can specify options for the logistic regression. The general options are ```method``` (the fit method for logistic regression; only `newton` is available now) and `maxiter` (the maximum number of iterations for the fit). The newton method has some specific options, including `ord` (the order of the norm applied to the gradient), `tol` (the stopping tolerance) and `ridge_factor` (the ridge regression factor for the Hessian matrices). All these options have default values. You don't need to implement this method unless you have special needs.\n"
@@ -73,7 +78,7 @@
     "            wait_timeout=5,  # Timeout for calculation.\n",
     "            connection_timeout=5,  # Wait timeout for each step.\n",
     "            verify_timeout=500,  # Timeout for the final zero-knowledge verification step\n",
-    "            enable_verify=True  # Whether to enable the final zero-knowledge verification step\n",
+    "            enable_verify=False  # Whether to enable the final zero-knowledge verification step\n",
     "        )\n",
     "\n",
     "    def dataset(self):\n",

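The notebook change above amounts to choosing two verification settings depending on where the task runs. A minimal sketch of that choice follows, assuming a hypothetical helper named `verification_kwargs` (it is not part of the Delta API); only the parameter names `enable_verify` and `verify_timeout` and the 300-second default come from the notebook text.

```python
from typing import Any, Dict


def verification_kwargs(online_demo: bool, verify_timeout: int = 300) -> Dict[str, Any]:
    """Hypothetical helper: pick the verification settings for super().__init__()."""
    if verify_timeout <= 0:
        raise ValueError("verify_timeout must be a positive number of seconds")
    return {
        # The online demo system cannot run the zero-knowledge proof step,
        # so verification is enabled only outside the demo.
        "enable_verify": not online_demo,
        # Timeout in seconds for the zero-knowledge proof step;
        # pass a larger value if the step times out.
        "verify_timeout": verify_timeout,
    }


demo_kwargs = verification_kwargs(online_demo=True, verify_timeout=500)
local_kwargs = verification_kwargs(online_demo=False)
print(demo_kwargs)
print(local_kwargs)
```

The returned dict could be unpacked with `**` next to the other keyword arguments of `super().__init__()`, so switching between the online demo and a self-hosted network changes a single flag.
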
diff --git a/jupyter/examples/zh-horizontal-logistic-regression-task.ipynb b/jupyter/examples/zh-horizontal-logistic-regression-task.ipynb
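@@ -49,6 +49,11 @@
     "When defining a horizontal federated statistics task, several parts must be defined by the user:\n",
     "\n",
     "* ***Task Config***: We configure the task in the ```super().__init__()``` method. The configuration items include the task name (```name```), the minimum number of clients required (```min_clients```), the maximum number of clients (```max_clients```), the waiting timeout (```wait_timeout```, which controls the timeout of one round of calculation), and the connection timeout (```connection_timeout```, which controls the timeout of each stage in the procedure).\n",
+    "\n",
+    "    In addition, a logistic regression task can start a zero-knowledge proof stage after the task completes, to verify the convergence of the final result and the consistency of the data across the nodes' computations. To enable the zero-knowledge proof, set the `enable_verify` parameter in `super().__init__()` to True. The `verify_timeout` parameter controls the timeout of the zero-knowledge proof stage. At present this stage takes a long time, and the default value of `verify_timeout` is 300 seconds; if a timeout occurs during the zero-knowledge proof stage, increase `verify_timeout` accordingly.\n",
+    "\n",
+    "    ***The online demo system currently does not support the zero-knowledge proof stage because of resource limits. Please set the `enable_verify` parameter to False.***\n",
+    "\n",
     "* ***Dataset***: We define the datasets the task needs in the ```dataset``` method. The method returns a dict whose keys are dataset names, which must correspond to the parameter names of the execute method; each value is a ```delta.dataset.DataFrame``` instance, whose ```dataset``` parameter is the name of the required dataset. For details of the dataset format, please refer to [this article](https://docs.deltampc.com/network-deployment/prepare-data).\n",
     "* ***Preprocess***: In the preprocess function, we process the dataset and finally return x and y. The inputs must correspond to the return value of the ```dataset``` method, that is, each input parameter corresponds to one item of the dict returned by ```dataset```. The returned x and y can be `pandas.DataFrame` or `numpy.ndarray`, and y must be a 1-D vector of class labels.\n",
     "* ***Options***: This method is optional. In the `options` method, we can configure some parameters of the logistic regression training. The general parameters include ```method``` (the training method for logistic regression; currently only `newton`, the Newton method, is available) and `maxiter` (the maximum number of training iterations). There are also some parameters specific to the Newton method, including `ord` (the order of the gradient norm), `tol` (the tolerance for stopping training) and `ridge_factor` (the ridge regression factor for the Hessian matrix). All of these options have default values. If you have no special needs, you do not have to implement this method.\n"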
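@@ -72,7 +77,7 @@
     "            wait_timeout=5,  # Waiting timeout, controls the timeout of one round of calculation\n",
     "            connection_timeout=5,  # Connection timeout, controls the timeout of each stage in the procedure\n",
     "            verify_timeout=500,  # Timeout of the zero-knowledge proof step\n",
-    "            enable_verify=True  # Whether to enable the zero-knowledge proof\n",
+    "            enable_verify=False  # Whether to enable the zero-knowledge proof\n",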
     "        )\n",
     "\n",
     "    def dataset(self):\n",