
Bug: Significant Cold Start Initialization Delay in Orchestration Framework #126

Open
karanakatle opened this issue Dec 3, 2024 · 8 comments
Labels
bug Something isn't working triage

Comments

@karanakatle

Expected Behaviour

The orchestration framework should initialize with minimal delay during cold starts, ensuring consistent performance across all Lambda invocations, including the first execution.

Current Behaviour

When a new Lambda instance is created, the orchestration framework incurs a significant initialization delay of 5-7 seconds. This delay occurs before any classification or other processing steps, resulting in a total latency of 7-9 seconds for the first invocation.

However, subsequent invocations on the same Lambda instance execute within 1-2 seconds, indicating the issue is specific to cold start initialization. This behavior impacts the overall performance and user experience during the first execution.

Note: Testing with provisioned concurrency (set to 5) does keep 5 instances of the Lambda function warm; however, the issue persists whenever execution shifts to a new Lambda instance outside of the provisioned pool, where the orchestration framework initialization again takes 5-7 seconds.

Code snippet

Logs for the initial invocation:

INIT_START Runtime Version: python:3.12.v38	Runtime Version ARN: arn:aws:lambda:us-east-1::runtime:7515e00d6763496e7a147ffa395ef5b0f0c1ffd6064130abb5ecde5a6d630e86
START RequestId: 29a54971-892b-4748-affc-2e4e9a6139db Version: 96
[INFO]	2024-12-02T16:55:56.989Z	29a54971-892b-4748-affc-2e4e9a6139db	Received event in Lambda
[INFO]	2024-12-02T16:55:56.990Z	29a54971-892b-4748-affc-2e4e9a6139db	LLM Classification
[INFO]	2024-12-02T16:55:57.513Z	29a54971-892b-4748-affc-2e4e9a6139db	Found credentials in environment variables.
[INFO]	2024-12-02T16:56:04.180Z	29a54971-892b-4748-affc-2e4e9a6139db	
** CLASSIFIED INTENT **


Logs for subsequent invocations on the same Lambda instance:

[INFO]	2024-12-02T16:57:00.674Z	b6b59439-3a0c-48b5-b763-1e0a2c30ca51	Received event: 
[INFO]	2024-12-02T16:57:00.674Z	b6b59439-3a0c-48b5-b763-1e0a2c30ca51	LLM Classification
[INFO]	2024-12-02T16:57:01.311Z	b6b59439-3a0c-48b5-b763-1e0a2c30ca51	
** CLASSIFIED INTENT **

Possible Solution

No response

Steps to Reproduce

  1. Deploy the orchestration framework on AWS Lambda.
  2. Trigger a request that leads to the creation of a new Lambda instance.
  3. Measure the total latency for the first invocation (observe 7-9 seconds).
  4. Measure latency for subsequent invocations on the same instance (observe 1-2 seconds).
  5. Test with provisioned concurrency and observe behavior when execution moves to new instances.
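If it helps to reproduce the pattern locally, here is a minimal stdlib-only sketch (hypothetical handler; the real classification and orchestration work is elided) that separates the one-time init cost from the per-invocation cost the way steps 3-4 measure it:

```python
import time

# Module scope runs once per cold start (the Lambda INIT phase); heavy
# imports such as the orchestration framework would be paid here, not on
# every invocation.
_init_start = time.perf_counter()
# from multi_agent_orchestrator.orchestrator import MultiAgentOrchestrator  # illustrative
INIT_SECONDS = time.perf_counter() - _init_start

_cold = True  # flips to False after the first invocation on this instance


def handler(event, context=None):
    """Hypothetical handler: reports whether this call hit a cold start."""
    global _cold
    started = time.perf_counter()
    # ... classification / orchestration would run here ...
    result = {
        "cold_start": _cold,
        "init_seconds": INIT_SECONDS,
        "handler_seconds": time.perf_counter() - started,
    }
    _cold = False
    return result
```

Invoking the handler twice mimics a cold start followed by a warm invocation on the same instance.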
@karanakatle karanakatle added the bug Something isn't working label Dec 3, 2024
@github-actions github-actions bot added the triage label Dec 3, 2024
@brnaba-aws
Contributor

brnaba-aws commented Dec 3, 2024

Hi @karanakatle ,
thanks for submitting this issue.
There are a couple of things we need to understand first:

  • Can you share the code you used for this Lambda?
  • Have you seen that you can now install multi-agent-orchestrator with only the minimum required dependencies? For instance, if you do not use Anthropic or OpenAI, you can simply install it with pip install multi-agent-orchestrator (available from version 0.1.1). This will save you a bit of init time.
  • You can also check out SnapStart for Python, which was released last week.

Let us know if you need further assistance.
regards,
Anthony

@brnaba-aws
Contributor

@karanakatle any updates on this?

@brnaba-aws
Copy link
Contributor

@karanakatle , I'm about to close this since I didn't hear anything from you. Let me know if you need further assistance.

@karanashokraokatle

karanashokraokatle commented Dec 13, 2024

Hello @brnaba-aws
Sorry for the delayed response. I tried enabling SnapStart and installing only the required dependencies, but the issue persists.
On detailed analysis, this is what I found:

  1. The library import takes 2 seconds every time.
  2. We use a custom classifier, where our fine-tuned model provides a response; it responds in 0.6 to 1 second.
  3. If no agent is selected, it goes to the fallback agent, a Bedrock agent, which takes another 2-3 seconds.

Is there any way to reduce the time consumed by the 1st and 3rd points?

Code is attached for reference.
multi_agent_orchestrator.zip

Cloudwatch logs also attached for analysis
Bot Orchestration Logs.txt

@brnaba-aws
Contributor

For:

  1. Can you provide logs for more than a single invocation, so we can see whether this time is always there or only on the very first invocation?
  2. You can't really improve that unless you drop the default agent by setting USE_DEFAULT_AGENT_IF_NONE_IDENTIFIED=False,
    or use a faster model for the default agent, such as Claude 3 Haiku. I see that your bedrock_llm_agent is using Claude 3.5 Sonnet, which is slow.
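To make the trade-off in point 2 concrete, here is a tiny hypothetical sketch of the routing decision (illustrative names only, not the library's actual API): disabling the fallback trades the extra Bedrock round trip for a fast "no agent" result.

```python
# Illustrative routing logic: with the fallback enabled, an unclassified
# request pays an extra slow Bedrock call; with it disabled, it fails fast.
def route(classified_agent, use_default_if_none_identified: bool):
    if classified_agent is not None:
        return classified_agent            # classifier found a match (0.6-1 s)
    if use_default_if_none_identified:
        return "bedrock_fallback_agent"    # extra 2-3 s Bedrock round trip
    return None                            # fast "no agent" outcome
```

Whether failing fast is acceptable depends on how often the classifier misses; if misses are common, a faster fallback model is the better lever.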

@brnaba-aws
Contributor

@karanakatle , one more thing: I don't think you are using the latest Python version, since I don't see this try/except with Anthropic: new import method

One thing I just noticed is that each Lex bot instantiates its own boto3.client('lexv2-runtime', region_name=self.region). We haven't provided a way to reuse a client passed in as a parameter.
I believe this would help. I'll create an issue and provide you with a file to test, OK?

@karanashokraokatle

Sure @brnaba-aws , thanks for the help.

@brnaba-aws
Contributor

@karanashokraokatle,

could you please try this LexBotAgent?
It accepts a client as an option, so you can create a single client and reuse it across all your Lex bots.

Example:

lex_client = boto3.client('lexv2-runtime', region_name=os.getenv('AWS_REGION','us-east-1'))

my_agent = LexBotAgent(LexBotAgentOptions(client=lex_client, bot_id = '', bot_alias_id='', locale_id=''))
my_agent_2 = LexBotAgent(LexBotAgentOptions(client=lex_client, bot_id = '', bot_alias_id='', locale_id=''))
from typing import Any, Dict, List, Optional
from dataclasses import dataclass
import os

import boto3
from botocore.exceptions import BotoCoreError, ClientError
from multi_agent_orchestrator.agents import Agent, AgentOptions
from multi_agent_orchestrator.types import ConversationMessage, ParticipantRole
from multi_agent_orchestrator.utils import Logger

@dataclass
class LexBotAgentOptions(AgentOptions):
    bot_id: Optional[str] = None
    bot_alias_id: Optional[str] = None
    locale_id: Optional[str] = None
    client: Optional[Any] = None

class LexBotAgent(Agent):
    def __init__(self, options: LexBotAgentOptions):
        super().__init__(options)
        if options.region is None:
            self.region = os.environ.get("AWS_REGION", 'us-east-1')
        else:
            self.region = options.region

        # Reuse a caller-provided client so one boto3 client can be shared
        # across all Lex bot agents; otherwise create a new one.
        if options.client:
            self.lex_client = options.client
        else:
            self.lex_client = boto3.client('lexv2-runtime', region_name=self.region)

        self.bot_id = options.bot_id
        self.bot_alias_id = options.bot_alias_id
        self.locale_id = options.locale_id

        if not all([self.bot_id, self.bot_alias_id, self.locale_id]):
            raise ValueError("bot_id, bot_alias_id, and locale_id are required for LexBotAgent")

    async def process_request(self, input_text: str, user_id: str, session_id: str,
                        chat_history: List[ConversationMessage],
                        additional_params: Optional[Dict[str, str]] = None) -> ConversationMessage:
        try:
            params = {
                'botId': self.bot_id,
                'botAliasId': self.bot_alias_id,
                'localeId': self.locale_id,
                'sessionId': session_id,
                'text': input_text,
                'sessionState': {}  # You might want to maintain session state if needed
            }

            response = self.lex_client.recognize_text(**params)

            concatenated_content = ' '.join(
                message.get('content', '') for message in response.get('messages', [])
                if message.get('content')
            )

            return ConversationMessage(
                role=ParticipantRole.ASSISTANT.value,
                content=[{"text": concatenated_content or "No response from Lex bot."}]
            )

        except (BotoCoreError, ClientError) as error:
            Logger.error(f"Error processing request: {str(error)}")
            raise error
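As a quick local check of the shared-client design (a hand-rolled stub standing in for boto3's 'lexv2-runtime' client; everything here is illustrative), the key point is that several agents hold the same object, so the boto3 client is built once:

```python
# Stub exposing the same recognize_text surface the agent calls, so the
# client-sharing behavior can be exercised without AWS access.
class StubLexClient:
    def __init__(self):
        self.calls = 0

    def recognize_text(self, **params):
        self.calls += 1
        return {"messages": [{"content": f"echo: {params.get('text', '')}"}]}

shared_client = StubLexClient()

# Two agents configured with client=shared_client would both route their
# recognize_text calls through this single instance.
shared_client.recognize_text(text="hello")
shared_client.recognize_text(text="world")
```

Passing a stub like this via LexBotAgentOptions(client=...) is also a convenient way to unit-test process_request without AWS credentials.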
