add: Cloud-edge collaborative inference for LLM based on KubeEdge-Ianvs #122
Conversation
The PR needs to add the design of how to implement this feature in Ianvs, especially to ensure that the new interface is consistent with the interface in Sedna.
Thanks to @hsj576 and @MooreZheng for the review. While trying to improve my proposal, I found that the integration of the new algorithm still needs to be discussed. Basically, there are two possible plans for integrating the cloud-edge collaborative strategy for LLMs.
In our last two meetings, we had some discussions about implementation details. From my personal understanding, @MooreZheng appears to prefer Plan A, which integrates Sedna's JointInference interface for alignment, while @hsj576's initial idea aligns more with Plan B, which leaves the collaborative strategy to be implemented in the user-defined modules under `examples/`. However, these two approaches differ significantly, and I would like to seek advice on which integration method should be adopted. If my understanding is incorrect, please feel free to point it out directly.

In the following part, I will present the details of Plan A and Plan B, highlighting their respective advantages and disadvantages. Since we are designing a new algorithm, it is essential to have a thorough understanding of the code of both Sedna and Ianvs, so I will first introduce the implementation logic and existing issues of JointInference in Sedna, followed by a review of how Ianvs integrates a new algorithm paradigm.

## 1 Sedna's JointInference

### 1.1 Implementation Detail

The core class of the Joint Inference feature in Sedna is `class JointInference(JobBase)`, which mainly exposes an `inference()` interface.
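For context, here is a minimal usage sketch of Sedna's current JointInference, assuming the constructor signature discussed later in this thread (an `estimator` plus a `hard_example_mining` config dict); the placeholder `Estimator` class, the sample input, and the threshold values are illustrative only, not a definitive example:

```python
from sedna.core.joint_inference import JointInference

class Estimator:
    """Placeholder edge model wrapper; a real one loads and runs the edge model."""
    def predict(self, data, **kwargs):
        return data  # dummy pass-through

inference_instance = JointInference(
    estimator=Estimator,
    hard_example_mining={
        "method": "IBT",  # a filter registered in ClassFactory under alias "IBT"
        "param": {"threshold_img": 0.9, "threshold_box": 0.9},
    },
)

# inference() runs the edge estimator first, then hard example mining;
# samples judged hard are forwarded to the cloud inference service.
sample = "an input frame or query"  # placeholder input
is_hard, final_res, edge_res, cloud_res = inference_instance.inference(sample)
```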
## 2 How Ianvs Integrates a New Algorithm Paradigm

On the Ianvs side, the benchmarking entry point builds a `BenchmarkingJob` from the user's config file and runs it:

```python
job = BenchmarkingJob(config[str.lower(BenchmarkingJob.__name__)])
job.run()
```
The specific logic of `BenchmarkingJob` is defined in `ianvs/core/cmd/obj/benchmarkingjob.py`. We can see that the test cases are created by the `build_testcases()` method of the test case controller:
`ianvs/core/cmd/obj/benchmarkingjob.py`, lines 91 to 92 in `4de73b2`:

```python
self.testcase_controller.build_testcases(test_env=self.test_env,
                                         test_object=self.test_object)
```
`build_testcases()` constructs the algorithm through the `_parse_algorithms_config()` function:
`ianvs/core/testcasecontroller/testcasecontroller.py`, lines 34 to 44 in `4de73b2`:

```python
def build_testcases(self, test_env, test_object):
    """
    Build multiple test cases by using a test environment and multiple test algorithms.
    """
    test_object_type = test_object.get("type")
    test_object_config = test_object.get(test_object_type)
    if test_object_type == TestObjectType.ALGORITHMS.value:
        algorithms = self._parse_algorithms_config(test_object_config)
        for algorithm in algorithms:
            self.test_cases.append(TestCase(test_env, algorithm))
```
The `_parse_algorithms_config()` function has a line that instantiates the `Algorithm` class:

```python
algorithm = Algorithm(name, config)
```
In `ianvs/core/testcasecontroller/algorithm/algorithm.py`, the `Algorithm` class is declared, and its `paradigm()` function returns the various paradigm instances:

`ianvs/core/testcasecontroller/algorithm/algorithm.py`, lines 95 to 105 in `4de73b2`:
```python
if self.paradigm_type == ParadigmType.SINGLE_TASK_LEARNING.value:
    return SingleTaskLearning(workspace, **config)
if self.paradigm_type == ParadigmType.INCREMENTAL_LEARNING.value:
    return IncrementalLearning(workspace, **config)
if self.paradigm_type == ParadigmType.MULTIEDGE_INFERENCE.value:
    return MultiedgeInference(workspace, **config)
if self.paradigm_type == ParadigmType.LIFELONG_LEARNING.value:
    return LifelongLearning(workspace, **config)
```
These paradigm instances are defined in the files located in various folders under `ianvs/core/testcasecontroller/algorithm/paradigm`. The paradigm classes inherit from the `ParadigmBase` class and are required to have a `run()` function, which serves as the main entry point for training and inference.
Taking Incremental Learning as an example, it is defined by the `IncrementalLearning` class in `ianvs/core/testcasecontroller/algorithm/paradigm/incremental_learning/incremental_learning.py`. For inference tasks, `IncrementalLearning` first uses the `build_paradigm_job()` function to construct a job, and then calls the job's `inference()` interface to complete the inference.
`ianvs/core/testcasecontroller/algorithm/paradigm/incremental_learning/incremental_learning.py`, lines 132 to 139 in `4de73b2`:

```python
job = self.build_paradigm_job(ParadigmType.INCREMENTAL_LEARNING.value)
inference_dataset = self.dataset.load_data(data_index_file, "inference")
inference_dataset_x = inference_dataset.x
inference_results = {}
hard_examples = []
for _, data in enumerate(inference_dataset_x):
    res, _, is_hard_example = job.inference([data])
```
`build_paradigm_job()` is defined by `ParadigmBase` in `ianvs/core/testcasecontroller/algorithm/paradigm/base.py`:

`ianvs/core/testcasecontroller/algorithm/paradigm/base.py`, lines 93 to 129 in `4de73b2`:
```python
if paradigm_type == ParadigmType.SINGLE_TASK_LEARNING.value:
    return self.module_instances.get(ModuleType.BASEMODEL.value)
if paradigm_type == ParadigmType.INCREMENTAL_LEARNING.value:
    return IncrementalLearning(
        estimator=self.module_instances.get(ModuleType.BASEMODEL.value),
        hard_example_mining=self.module_instances.get(
            ModuleType.HARD_EXAMPLE_MINING.value))
if paradigm_type == ParadigmType.LIFELONG_LEARNING.value:
    return LifelongLearning(
        estimator=self.module_instances.get(
            ModuleType.BASEMODEL.value),
        task_definition=self.module_instances.get(
            ModuleType.TASK_DEFINITION.value),
        task_relationship_discovery=self.module_instances.get(
            ModuleType.TASK_RELATIONSHIP_DISCOVERY.value),
        task_allocation=self.module_instances.get(
            ModuleType.TASK_ALLOCATION.value),
        task_remodeling=self.module_instances.get(
            ModuleType.TASK_REMODELING.value),
        inference_integrate=self.module_instances.get(
            ModuleType.INFERENCE_INTEGRATE.value),
        task_update_decision=self.module_instances.get(
            ModuleType.TASK_UPDATE_DECISION.value),
        unseen_task_allocation=self.module_instances.get(
            ModuleType.UNSEEN_TASK_ALLOCATION.value),
        unseen_sample_recognition=self.module_instances.get(
            ModuleType.UNSEEN_SAMPLE_RECOGNITION.value),
        unseen_sample_re_recognition=self.module_instances.get(
            ModuleType.UNSEEN_SAMPLE_RE_RECOGNITION.value)
    )
# pylint: disable=E1101
if paradigm_type == ParadigmType.MULTIEDGE_INFERENCE.value:
    return self.modules_funcs.get(ModuleType.BASEMODEL.value)()
return None
```
It can be seen that the `IncrementalLearning` paradigm class calls the `IncrementalLearning` interface from `sedna.core.incremental_learning` when constructing jobs, and `inference()` is provided by that Sedna algorithm. During the instantiation of Sedna's `IncrementalLearning` class, two parameters, `estimator` and `hard_example_mining`, are passed in. These two are registered and instantiated from the `BaseModel` class in `basemodel.py` and the filter class in `hard_example_mining.py`, both implemented in `examples/pcb-aoi/incremental_learning_bench/fault_detection/testalgorithms/fpn`.
`ianvs/examples/pcb-aoi/incremental_learning_bench/fault_detection/testalgorithms/fpn/basemodel.py`, lines 61 to 62 in `4de73b2`:

```python
@ClassFactory.register(ClassType.GENERAL, alias="FPN")
class BaseModel:
```
`ianvs/examples/pcb-aoi/incremental_learning_bench/fault_detection/testalgorithms/fpn/hard_example_mining.py`, lines 49 to 50 in `4de73b2`:

```python
@ClassFactory.register(ClassType.HEM, alias="IBT")
class IBTFilter(BaseFilter, abc.ABC):
```
Within Sedna's `IncrementalLearning` class, the estimator's inference and the hard example mining are completed sequentially, just like in Sedna's JointInference.
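In other words, the shared pattern can be summarized by the following simplified sketch (assumed control flow for illustration, not code copied from Sedna):

```python
# Sequential "inference-then-mining" pattern shared by IncrementalLearning
# and JointInference: the estimator predicts first, then the hard example
# mining filter inspects the result to decide whether the sample is hard.
def sequential_inference(estimator, hard_example_mining_algorithm, data):
    res = estimator.predict(data)                         # 1) base model inference
    is_hard_example = hard_example_mining_algorithm(res)  # 2) filter the result
    return res, is_hard_example
```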
## 3 Two Plans to Integrate JointInference into Ianvs
Given the current implementation flaws of JointInference in Sedna, and considering Ianvs's own logic for integrating new algorithms, there are two possible ways to introduce JointInference.
### 3.1 Plan A
In this approach, similar to the integration of Incremental Learning and Lifelong Learning, we build jobs based on Sedna's `JointInference` class. Demo code:

```python
if paradigm_type == ParadigmType.JOINT_INFERENCE.value:
    return JointInference(
        estimator=self.module_instances.get(
            ModuleType.BASEMODEL.value),
        cloud=self.module_instances.get(
            ModuleType.APIMODEL.value),
        hard_example_mining=self.module_instances.get(
            ModuleType.HARD_EXAMPLE_MINING.value)
    )
```
This method requires some modifications to Sedna's JointInference to address the two problems we previously mentioned:
- Introduce a new parameter `cloud` to the constructor of the JointInference class, allowing users to pass a self-designed cloud model client instance. This resolves the issue where the LLM currently needs an unnecessary forwarding service. Since `cloud` is an optional parameter, it will not affect existing joint inference examples in Sedna. Demo code:

```python
def __init__(self, estimator=None, cloud=None, hard_example_mining: dict = None):
    super(JointInference, self).__init__(estimator=estimator)
```
- Introduce a parameter `mining_mode` for the `inference()` function of the JointInference class and, based on the different mining modes, construct the corresponding processes, to address the issue that the current collaborative process is not compatible with LLMs. I have implemented a simple demo that builds two modes: "inference-then-mining" for the Object Detection collaborative strategy and "mining-then-inference" for the LLM query routing strategy. Demo code:

```python
mining_mode = kwargs.get("mining_mode", "inference-then-mining")
speculative_decoding = kwargs.get("speculative_decoding", False)
edge_result, cloud_result = None, None

if mining_mode == "inference-then-mining":
    # edge inference first, then mine hard examples from the edge result
    res, edge_result = self._get_edge_result(data, callback_func, **kwargs)
    if self.hard_example_mining_algorithm is None:
        raise ValueError("Hard example mining algorithm is not set.")
    is_hard_example = self.hard_example_mining_algorithm(res)
    if is_hard_example:
        res, cloud_result = self._get_cloud_result(data, post_process=post_process, **kwargs)
elif mining_mode == "mining-then-inference":
    # first conduct hard example mining on the input, then decide whether
    # to execute on the edge or in the cloud
    if self.hard_example_mining_algorithm is None:
        raise ValueError("Hard example mining algorithm is not set.")
    is_hard_example = self.hard_example_mining_algorithm(data)
    if is_hard_example:
        if not speculative_decoding:
            res, cloud_result = self._get_cloud_result(data, post_process=post_process, **kwargs)
        else:
            # do speculative decoding
            pass
    else:
        res, edge_result = self._get_edge_result(data, callback_func, **kwargs)
else:
    raise ValueError(
        "Mining Mode must be in ['mining-then-inference', 'inference-then-mining']"
    )
return [is_hard_example, res, edge_result, cloud_result]
```
#### 3.1.1 Advantages

- Introduces Sedna's JointInference feature into the Ianvs core; the changes may also be contributed back to Sedna to support LLM joint inference.
#### 3.1.2 Drawbacks

- The collaborative strategy will be hard-coded into the core. If users want to implement a new collaborative strategy themselves, they will need to modify the Ianvs core (however, we can consider adding a custom module, similar to BASEMODEL and UNSEEN_TASK_ALLOCATION, allowing users to define it in examples; see the sketch below).
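To make the custom-module idea concrete, here is a hypothetical sketch of a user-defined collaboration strategy registered through `ClassFactory` in `examples/`, the same way the BASEMODEL and hard example mining modules are registered. The class, its alias, and the routing heuristic are assumptions for illustration, not an existing Ianvs interface:

```python
from sedna.common.class_factory import ClassFactory, ClassType

# Registered like other example modules; a paradigm could then look this
# strategy up via module_instances.get() when building the job.
@ClassFactory.register(ClassType.GENERAL, alias="QueryRouting")
class QueryRoutingStrategy:
    """User-defined collaboration strategy living in examples/."""

    def __init__(self, threshold=0.5, **kwargs):
        self.threshold = threshold

    def __call__(self, query, difficulty_score):
        # send hard queries (score above the threshold) to the cloud LLM
        return "cloud" if difficulty_score > self.threshold else "edge"
```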
### 3.2 Plan B
In addition to paradigms like Incremental Learning and Lifelong Learning that directly call Sedna's interfaces, Ianvs also includes paradigms such as Single Task Learning, which do not include any details about training and inference; those details are entirely handled by the `BaseModel` class found in the `examples/` directory:

`ianvs/core/testcasecontroller/algorithm/paradigm/base.py`, lines 93 to 94 in `4de73b2`:

```python
if paradigm_type == ParadigmType.SINGLE_TASK_LEARNING.value:
    return self.module_instances.get(ModuleType.BASEMODEL.value)
```
In Plan B, we can introduce JointInference in a manner similar to Single Task Learning: the paradigm does not include any specific collaborative strategy. The newly added `JointInference` paradigm serves merely as an entry point for calling a collaborative algorithm, and its code is almost identical to that of `SingleTaskLearning` but without the training step. The specific collaborative strategy is written in the `predict()` function of the `BaseModel` found in `examples/`, as in the sketch below.
#### 3.2.1 Advantages

- User-defined collaborative algorithms will be more convenient to build, allowing direct customization of the collaboration process within the example.
#### 3.2.2 Drawbacks

- Since the newly added JointInference paradigm class does not contain any details of collaborative algorithms, whether it can be considered a new feature of Ianvs needs to be discussed;
- Because the details of collaborative algorithms are entirely implemented by users, introducing Sedna's JointInference becomes unnecessary.
Which plan should we adopt? If there are any other options, please feel free to suggest them.
I prefer Plan A. The overall design looks good to me.
Considering that Plan A aligns better with the Sedna interface and existing algorithms, I have updated the proposal based on this plan and created some draft code. You can find more details by clicking these links: proposal and draft code.

/lgtm
The design looks good to me already. A brand new feature of joint inference is added to Ianvs.
It would be further appreciated if there were any thoughts about the previously mentioned "adding a custom module, similar to BASEMODEL and UNSEEN_TASK_ALLOCATION, allowing users to define it in examples".

/lgtm
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: hsj576, MooreZheng. The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files. Approvers can indicate their approval by writing `/approve` in a comment.
Cloud-edge collaborative inference for LLM based on KubeEdge-Ianvs proposal
What type of PR is this?
/kind design
What this PR does / why we need it:
This PR is a proposal to add a cloud-edge collaborative inference algorithm for LLMs, combining the high inference accuracy of cloud LLMs with the strong privacy and fast inference of edge LLMs through a cloud-edge collaboration strategy, so as to better meet the needs of edge users.
Which issue(s) this PR fixes:
#96