
Script for LLM benchmark - Humaneval #61

Open. bhanvimenghani wants to merge 8 commits into base: master.

Conversation

bhanvimenghani (Author):

This PR contains a WIP script for evaluating an LLM benchmark.


bhanvimenghani marked this pull request as ready for review on July 18, 2024.
bhanvimenghani changed the title from "[WIP] Script for LLM benchmark - Humaneval" to "Script for LLM benchmark - Humaneval" on July 19, 2024.
@@ -0,0 +1,27 @@
# How to Run the LLM benchmark - Inference workload
Contributor:

Can you please add some description of the workload and references to it?


## 1. Using the Workbench
### Pre-requisite
As of now, the user needs to have access to an OpenShift AI cluster, create a Data Science project, and set up a workbench with the following configuration
Contributor:

Can you elaborate on the OpenShift AI cluster? For example, an OCP cluster with GPU nodes and the 'xyz' operator installed.

Towards the end of script B, the user is prompted to enter the run duration, i.e., how long the load should be applied or the model kept running; by default it is set to 1 hour (3600 seconds).
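
For illustration, a minimal sketch of what such a prompt could look like, assuming plain stdin input; the variable names and wording are assumptions, not code from this PR:

```python
# Illustrative sketch only; the actual prompt lives in script B of this PR.
DEFAULT_DURATION_SEC = 3600  # 1 hour, the default stated in the README

raw = input(f"Run duration in seconds [{DEFAULT_DURATION_SEC}]: ").strip()
run_duration_sec = int(raw) if raw else DEFAULT_DURATION_SEC
print(f"Applying load for {run_duration_sec} seconds")
```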

## 2. Automated Job
In this approach we already have a combined script named `script.py`, and a Docker file which is used to create this docker image `quay.io/kruizehub/human-eval-deployment` whcih is used in the `job.yaml` file.
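
As a rough sketch, a Job manifest pointing at that image might look like the following; only the image reference comes from the README, while the job name and namespace are hypothetical:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: human-eval-benchmark   # hypothetical name
  namespace: human-eval        # hypothetical; see the namespace discussion below
spec:
  template:
    spec:
      containers:
        - name: human-eval
          image: quay.io/kruizehub/human-eval-deployment  # image named in the README
      restartPolicy: Never
```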
Contributor:

Fix the `whcih` typo.

Contributor:
Is the namespace created as part of the job, or does the user need to create it?

Author:

In my case it was already created.

Contributor:

You need to update the README with a step to create the namespace.
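
A sketch of what that added README step might look like; the namespace name here is hypothetical and should match whatever `job.yaml` uses:

```sh
# Hypothetical namespace name; substitute the one referenced in job.yaml.
oc create namespace human-eval
oc apply -f job.yaml -n human-eval
```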

Author:

Added! Can you re-review?
