
Script for LLM benchmark - Humaneval #61

Open. bhanvimenghani wants to merge 8 commits into base: master.

Conversation

bhanvimenghani (Author):

This PR contains a WIP script for evaluating an LLM benchmark.


bhanvimenghani marked this pull request as ready for review on July 18, 2024.
bhanvimenghani changed the title from "[WIP] Script for LLM benchmark - Humaneval" to "Script for LLM benchmark - Humaneval" on July 19, 2024.
@@ -0,0 +1,27 @@
# How to Run the LLM benchmark - Inference workload
Contributor:

Can you please add some description of the workload and references to it?


## 1. Using the Workbench
### Pre-requisite
As of now, the user needs to have access to an OpenShift AI cluster, create a Data Science project, and set up a workbench with the following configuration
Contributor:

Can you elaborate on the OpenShift AI cluster? For example, an OCP cluster with GPU nodes and the 'xyz' operator installed.

Towards the end of script B, the user is prompted to enter the run duration, i.e., how long the load should be applied or the model kept running; by default it is set to 1 hour (3600 seconds).
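
For illustration, a minimal sketch of what such a prompt could look like, assuming plain stdin input; the variable names and wording are assumptions, not code from this PR:

```python
# Illustrative sketch only; the actual prompt lives in script B of this PR.
DEFAULT_DURATION_SEC = 3600  # 1 hour, the default stated in the README

raw = input(f"Run duration in seconds [{DEFAULT_DURATION_SEC}]: ").strip()
run_duration_sec = int(raw) if raw else DEFAULT_DURATION_SEC
print(f"Applying load for {run_duration_sec} seconds")
```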

## 2. Automated Job
In this approach we already have a combined script named `script.py`, and a Docker file which is used to create this docker image `quay.io/kruizehub/human-eval-deployment` whcih is used in the `job.yaml` file.
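
As a rough sketch, a Job manifest pointing at that image might look like the following; only the image reference comes from the README, while the job name and namespace are hypothetical:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: human-eval-benchmark   # hypothetical name
  namespace: human-eval        # hypothetical; see the namespace discussion below
spec:
  template:
    spec:
      containers:
        - name: human-eval
          image: quay.io/kruizehub/human-eval-deployment  # image named in the README
      restartPolicy: Never
```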
Contributor:

Fix the `whcih` typo.

Contributor:
Is the namespace created as part of the job, or does the user need to create it?

Author:

In my case it was already created.

Contributor:

You need to update the README with a step to create the namespace.
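
A sketch of what that added README step might look like; the namespace name here is hypothetical and should match whatever `job.yaml` uses:

```sh
# Hypothetical namespace name; substitute the one referenced in job.yaml.
oc create namespace human-eval
oc apply -f job.yaml -n human-eval
```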

Author:

Added! Can you re-review?
