Commit f7c3186

move redundant data preprocess files

ruiyiw committed Nov 6, 2023
2 parents 5b5720c + 6ca3efa
Showing 11 changed files with 598 additions and 18 deletions.
3 changes: 2 additions & 1 deletion README.md
@@ -6,4 +6,5 @@ We split our overall framework into multiple parts
2. Together AI Finetuning --> Input the train and test data / Output model checkpoint
3. LLM Finetuning --> Input the train and test data / Output model checkpoint
4. LLM Deployment --> Input LLM finetuned model checkpoint / Output deployable OpenAI-type API
5. Eval --> Input model checkpoint / Output evaluation scores
6. Generate --> Input None / Output new data on redis
74 changes: 59 additions & 15 deletions data_process/redis_data_filtering/prompt_reverse_engineering.py
@@ -2,7 +2,7 @@
import os
from collections import defaultdict
from typing import Any, Dict, List, Tuple, Union, cast

import transformers
import pandas as pd
import rich
from rich.console import Console
@@ -15,33 +15,52 @@
import enum

#PROMPT_PREFIX = "Prompt after formatting:\n"

MAX_TOKEN = 2048
PROMPT_TEMPLATE="""Prompt after formatting:\nImagine you are {agent}, your task is to act/speak as {agent} would, keeping in mind {agent}'s social goal.
You can find {agent}'s background and goal in the 'Here is the context of the interaction' field.
Note that {agent}'s secret and goal is only visible to you.
You should try your best to achieve {agent}'s goal in a way that align with their character traits.
Additionally, maintaining the conversation's naturalness and realism is essential (e.g., do not repeat what other people has already said before).
{history}.
You are at Turn #{turn_number}. Your available action types are
{action_list}.
Note: You can "leave" this conversation if 1. you have achieved your social goals, 2. this conversation makes you uncomfortable, 3. you find it uninteresting/you lose your patience, 4. or for other reasons you want to leave.
Please only generate a JSON string including the action type and the argument.
Your action should follow the given format:
{format_instructions}
"""
You are at Turn #{turn_number}."""

#PYDANTIC_FORMAT_INSTRUCTIONS.format(schema=schema_str)
FORMAT_TEMPLATE = """\nAs an example, for the schema {\"properties\": {\"foo\": {\"title\": \"Foo\", \"description\": \"a list of strings\", \"type\": \"array\", \"items\": {\"type\": \"string\"}}}, \"required\": [\"foo\"]}
the object {\"foo\": [\"bar\", \"baz\"]} is a well-formatted instance of the schema. The object {\"properties\": {\"foo\": [\"bar\", \"baz\"]}} is not well-formatted.
\nHere is the output schema:\n```\n{\"description\": \"An interface for messages.\\nThere is only one required method: to_natural_language\", \"properties\": {\"action_type\": {\"title\": \"Action Type\", \"description\": \"whether to speak at this turn or choose to not do anything\", \"enum\": [\"none\", \"speak\", \"non-verbal communication\", \"action\", \"leave\"], \"type\": \"string\"}, \"argument\": {\"title\": \"Argument\", \"description\": \"the utterance if choose to speak, the expression or gesture if choose non-verbal communication, or the physical action if choose action\", \"type\": \"string\"}}, \"required\": [\"action_type\", \"argument\"]}\n```\u001b[0m"""


PROMPT_TEMPLATE_W_FORMAT="""Prompt after formatting:\nImagine you are {agent}, your task is to act/speak as {agent} would, keeping in mind {agent}'s social goal.
You can find {agent}'s background and goal in the 'Here is the context of the interaction' field.
Note that {agent}'s secret and goal is only visible to you.
You should try your best to achieve {agent}'s goal in a way that align with their character traits.
Additionally, maintaining the conversation's naturalness and realism is essential (e.g., do not repeat what other people has already said before).
{history}.
You are at Turn #{turn_number}. Your available action types are
"none action speak non-verbal communication leave".
Note: You can "leave" this conversation if 1. you have achieved your social goals, 2. this conversation makes you uncomfortable, 3. you find it uninteresting/you lose your patience, 4. or for other reasons you want to leave.
Please only generate a JSON string including the action type and the argument.
Your action should follow the given format:
\nAs an example, for the schema {\"properties\": {\"foo\": {\"title\": \"Foo\", \"description\": \"a list of strings\", \"type\": \"array\", \"items\": {\"type\": \"string\"}}}, \"required\": [\"foo\"]}
the object {\"foo\": [\"bar\", \"baz\"]} is a well-formatted instance of the schema. The object {\"properties\": {\"foo\": [\"bar\", \"baz\"]}} is not well-formatted.
\nHere is the output schema:\n```\n{\"description\": \"An interface for messages.\\nThere is only one required method: to_natural_language\", \"properties\": {\"action_type\": {\"title\": \"Action Type\", \"description\": \"whether to speak at this turn or choose to not do anything\", \"enum\": [\"none\", \"speak\", \"non-verbal communication\", \"action\", \"leave\"], \"type\": \"string\"}, \"argument\": {\"title\": \"Argument\", \"description\": \"the utterance if choose to speak, the expression or gesture if choose non-verbal communication, or the physical action if choose action\", \"type\": \"string\"}}, \"required\": [\"action_type\", \"argument\"]}\n```\u001b[0m
"""
# static
ACTION_LIST = "none action speak non-verbal communication leave" #" ".join(ActionType)

ACTION_REVERSE_MAP = {"left ": "leave", 'did n': 'none', 'said:': 'speak'}

MODEL_CHECKPOINT = "meta-llama/Llama-2-13b-chat-hf"
HF_TOKEN = "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"


TOKENIZER = transformers.AutoTokenizer.from_pretrained(
    MODEL_CHECKPOINT,
    padding=False,
    truncation=False,
    token=HF_TOKEN,
)

def to_natural_language(self) -> str:
    match self.action_type:
@@ -101,10 +120,27 @@ def generate_result(msg):

    return str_result

def reverse_episode_log(epilog, later_speak=False):
def surpass_max_token_check(string, max_token=MAX_TOKEN, tokenizer=TOKENIZER):
    # how many tokens the prompt is over budget, or 0 if it fits
    prompt_tokens = len(tokenizer(string)['input_ids'])
    return max(prompt_tokens - max_token, 0)

def truncate_prompt_to_length(dia_his, surpass_num, tokenizer=TOKENIZER):
    # drop the oldest dialogue lines until roughly surpass_num tokens are removed
    dia_sen = dia_his.split("\n")
    remove_len = 0
    i = 0
    while remove_len < surpass_num:
        remove_len += len(tokenizer(dia_sen[i])['input_ids'])
        i += 1
    trunc_dia = "\n".join(dia_sen[i:])
    return trunc_dia


def reverse_episode_log(epilog, later_speak=False, include_format=False, max_token=MAX_TOKEN):
    episode_msg = epilog.messages
    # per episode
    agent_model = epilog.models[1]
    prompt_template = PROMPT_TEMPLATE_W_FORMAT if include_format else PROMPT_TEMPLATE

    if len(episode_msg) > 0:
        init_loop = episode_msg[0]
@@ -131,23 +167,31 @@ def reverse_episode_log(epilog, later_speak=False):
                dial_history += "\n" + tpl[2]
            else:
                # for the first context, we don't need \n
                dial_history += tpl[2]
                context = tpl[2]
                dial_history += context

            if tpl[0] == speaker:  # if the speaker is the agent, use what they said as the result
                str_result = generate_result(tpl[2])
            # check if this is the end
            if i % 2 == turn_div:
                # take alternating turns, as we always want to predict one agent, not both
                next_turn = i
                prompt = PROMPT_TEMPLATE.format(
                    agent=speaker, history=dial_history, turn_number=next_turn,
                    action_list=ACTION_LIST, format_instructions=FORMAT_TEMPLATE)
                prompt = prompt_template.format(
                    agent=speaker, history=dial_history, turn_number=next_turn)
                over_tokens = surpass_max_token_check(prompt)
                if over_tokens > 0:
                    # keep the scenario context; truncate only the dialogue history
                    all_dial = dial_history[len(context):]
                    trun_dial = truncate_prompt_to_length(all_dial, over_tokens)
                    prompt = prompt_template.format(
                        agent=speaker, history=context + "\n" + trun_dial, turn_number=next_turn)
                turn_dic["prompt"] = prompt
                turn_dic['result'] = str_result
                prompt_result_instances.append(turn_dic)

    return prompt_result_instances


def parse_prompt_to_json(episode, dir, init_speak):
    prompt_result_instances = reverse_episode_log(episode, init_speak)

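The two new helpers implement an oldest-first truncation policy for over-long prompts. A minimal usage sketch (the module path and the example strings are illustrative assumptions; the tokenizer defaults come from the diff above):

```python
from prompt_reverse_engineering import (
    surpass_max_token_check,
    truncate_prompt_to_length,
)

# an assembled prompt: scenario context first, then dialogue turns
prompt = "Here is the context of the interaction: ...\nTurn #0: ...\nTurn #1: ..."

over_tokens = surpass_max_token_check(prompt)  # tokens beyond MAX_TOKEN, else 0
if over_tokens > 0:
    context, dialogue = prompt.split("\n", 1)
    # drop the oldest dialogue lines until ~over_tokens tokens are removed
    dialogue = truncate_prompt_to_length(dialogue, over_tokens)
    prompt = context + "\n" + dialogue
```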
2 changes: 1 addition & 1 deletion data_process/redis_data_filtering/redis_filtering.py
@@ -95,7 +95,7 @@ def goal_filter_per_env_agent(episodes):
            env_tpls.append((episodes[agent1_rank[i]], 0))
            env_tpls.append((episodes[agent2_rank[i]], 1))
        else:
            if goal_score['agent1'][agent1_rank[i]] >= min(GOAL_KEEP_THRESHOD, agent1_avg) and (goal_score['agent2'][agent2_rank[i]] >= min(KEEP_THRESHOD, agent2_avg)):
            if goal_score['agent1'][agent1_rank[i]] >= min(GOAL_KEEP_THRESHOD, agent1_avg) and (goal_score['agent2'][agent2_rank[i]] >= min(GOAL_KEEP_THRESHOD, agent2_avg)):
                env_tpls.append((episodes[agent1_rank[i]], 0))
                env_tpls.append((episodes[agent1_rank[i]], 1))

170 changes: 169 additions & 1 deletion llm_deploy/README.md
@@ -1,5 +1,173 @@
## Deploy LoRA-finetuned model using a vLLM variant

We need to use an unmerged branch to support deploying a LoRA-finetuned model (the forked repo is https://github.com/troph-team/vllm.git).

Go to the vllm directory and run `pip install -e .`

Note https://github.com/vllm-project/vllm/issues/1283: if you hit a CUDA version error, pin the PyTorch requirement to "== 2.0.1" in the config file.
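A minimal install sketch, assuming the fork's default branch carries the LoRA-serving changes:

```bash
# clone the fork and install it in editable mode
git clone https://github.com/troph-team/vllm.git
cd vllm
pip install -e .  # builds against your local CUDA / PyTorch
```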


## Setting up the Babel server
### Log in with an SSH key
Add your public ed25519 key to the server
```bash
ssh-copy-id -i ~/.ssh/id_ed25519.pub <username>@<mycluster>
```
Configure your SSH config file (`~/.ssh/config`)
```bash
Host <mycluster>
HostName <mycluster>
User <username>
IdentityFile ~/.ssh/id_ed25519
```
Log in to Babel with the SSH key
```bash
ssh <mycluster>
```

### Connecting to a compute node
Jump from the login node to a compute node
```bash
srun --pty bash
```
Check that you can access the `/data` folder
```bash
cd /data/datasets/
```

### Configure the environment on the compute node
Install miniconda
```bash
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
conda init
conda create --name myenv
conda activate myenv
# conda deactivate
```
Install vllm packages
```bash
conda install pip
pip install vllm
```
Install fastchat packages
```bash
conda install pip
git clone https://github.com/lm-sys/FastChat.git
cd FastChat
pip3 install --upgrade pip
pip3 install "fschat[model_worker,webui]"
```
Submit a GPU request and open an interactive terminal
```bash
srun --gres=gpu:1 --time=1-00:00:00 --mem=80G --pty $SHELL
conda activate myenv
```
Some useful commands for checking GPU jobs
```bash
# check slurm status
squeue -l
# check gpu status
nvidia-smi
# check gpu usage
pip install gpustat
watch -n 1 gpustat
# quit slurm jobs
scancel job_id
# connect to compute node directly
ssh -J babel babel-x-xx
```

### Install cuda-toolkit (optional)
Due to the vllm issue https://github.com/vllm-project/vllm/issues/1283, we need cuda-toolkit 11.7.0, which is compatible with PyTorch 2.0.1.
Install cuda-toolkit 11.7.0 in the conda environment
```bash
conda install -c "nvidia/label/cuda-11.7.0" cuda-toolkit
```
Check cuda-toolkit version
```bash
nvcc -V
```

## Deploy models on Babel via FastChat API server
Run the following commands in three separate interactive terminal windows:
```bash
python3 -m fastchat.serve.controller
python3 -m fastchat.serve.model_worker --model-path model-checkpoint
python3 -m fastchat.serve.openai_api_server --host localhost --port 8000
```
Call the model checkpoint API
```bash
curl http://localhost:8000/v1/completions \
-H "Content-Type: application/json" \
-d '{
"model": "model-checkpoint",
"prompt": "San Francisco is a",
"max_tokens": 7,
"temperature": 0
}'
```
*Sample output:*
```JSON
{"id":"cmpl-GGvKBiZFdFLzPq2HdtuxbC","object":"text_completion","created":1698692212,"model":"checkpoint-4525","choices":[{"index":0,"text":"city that is known for its icon","logprobs":null,"finish_reason":"length"}],"usage":{"prompt_tokens":5,"total_tokens":11,"completion_tokens":6}}
```
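You can also call the FastChat server with the OpenAI Python client instead of curl. A sketch assuming the legacy `openai<1.0` interface and the same `model-checkpoint` name as above:

```python
import openai

# point the legacy OpenAI client at the local FastChat server
openai.api_key = "EMPTY"  # FastChat ignores the key, but the field must be set
openai.api_base = "http://localhost:8000/v1"

completion = openai.Completion.create(
    model="model-checkpoint",  # must match the name registered by the model worker
    prompt="San Francisco is a",
    max_tokens=7,
    temperature=0,
)
print(completion.choices[0].text)
```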

## Deploy models on Babel via vllm API server
Start the vLLM server with a model checkpoint
```bash
python -m vllm.entrypoints.openai.api_server --model model_checkpoint/
```
List the served models
```bash
curl http://localhost:8000/v1/models
```
*Sample output:*
```JSON
{"object":"list","data":[{"id":"Mistral-7B-Instruct-v0.1/","object":"model","created":1697599903,"owned_by":"vllm","root":"Mistral-7B-Instruct-v0.1/","parent":null,"permission":[{"id":"modelperm-d415ecf6362a4f818090eb6428e0cac9","object":"model_permission","created":1697599903,"allow_create_engine":false,"allow_sampling":true,"allow_logprobs":true,"allow_search_indices":false,"allow_view":true,"allow_fine_tuning":false,"organization":"*","group":null,"is_blocking":false}]}]}
```
Run inference through the model checkpoint API
```bash
curl http://localhost:8000/v1/completions \
-H "Content-Type: application/json" \
-d '{
"model": "model_checkpoint",
"prompt": "San Francisco is a",
"max_tokens": 7,
"temperature": 0
}'
```
*Sample output:*
```JSON
{"id":"cmpl-bf7552957a8a4bd89186051c40c52de4","object":"text_completion","created":3600699,"model":"Mistral-7B-Instruct-v0.1/","choices":[{"index":0,"text":" city that is known for its icon","logprobs":null,"finish_reason":"length"}],"usage":{"prompt_tokens":5,"total_tokens":12,"completion_tokens":7}}
```

## Access deployed Babel server on a local machine
Construct an SSH tunnel between the Babel login node and the compute node hosting the model
```bash
ssh -N -L 7662:localhost:8000 username@babel-x-xx
```
The above command creates a localhost:7662 server on the Babel login node which connects to localhost:8000 on the compute node.

Construct an SSH tunnel between your local machine and the Babel login node
```bash
ssh -N -L 8001:localhost:7662 username@<mycluster>
```
The above command creates a localhost:8001 server on your local machine which connects to localhost:7662 on the Babel login node.

Call the hosted model from your local machine
```bash
curl http://localhost:8001/v1/models
```
If the above command runs successfully, you should be able to use the REST API on your local machine.

(Optional) If building the SSH tunnel fails, add `-v` to the ssh command to see what went wrong.
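
Alternatively, the two hops can usually be collapsed into a single command with a jump host (a sketch; host names follow the examples above):

```bash
# forward local port 8001 straight to port 8000 on the compute node,
# jumping through the login node in one step
ssh -N -L 8001:localhost:8000 -J <mycluster> username@babel-x-xx
```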




## Useful resource links for Babel
1. https://hpc.lti.cs.cmu.edu/wiki/index.php?title=BABEL#Cluster_Architecture
2. https://hpc.lti.cs.cmu.edu/wiki/index.php?title=VSCode
3. https://hpc.lti.cs.cmu.edu/wiki/index.php?title=Training_Material
4. https://hpc.lti.cs.cmu.edu/wiki/index.php?title=Connecting_to_the_Cluster#Copying_Data_to_Compute_Nodes

9 changes: 9 additions & 0 deletions llm_generate/README.md
@@ -0,0 +1,9 @@
# Data Generation

For the first step, we generate EnvProfiles (including scenario / social goal / relationship restriction) based on inspiring prompts.

For the second step, we put the original AgentProfiles and RelationshipProfiles into our new Redis database.

For the third step, we combine them into env-agent combos via conditional sampling, where the restriction is the relationship (see the sketch below).

All the EnvProfiles (newly generated), AgentProfiles (Sotopia originals), RelationshipProfiles (Sotopia originals), and env-agent combos live in the newly created Redis database.
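
A minimal sketch of the conditional-sampling step using `redis_om`-style models; all class and field names here are illustrative assumptions, not the project's actual schema:

```python
import random

from redis_om import Field, JsonModel, Migrator

# illustrative schemas only -- the real profiles live in the project's database layer
class RelationshipProfile(JsonModel):
    agent_1_id: str
    agent_2_id: str
    relationship: int = Field(index=True)  # e.g. 0 = stranger ... 4 = family

class EnvProfile(JsonModel):
    scenario: str
    agent_goals: str
    relationship: int = Field(index=True)  # relationship level this scenario requires

Migrator().run()  # build the search indices before querying

def sample_combo(env: EnvProfile) -> tuple[str, str]:
    """Conditional sampling: only pair agents whose relationship
    satisfies the env's relationship restriction."""
    candidates = RelationshipProfile.find(
        RelationshipProfile.relationship == env.relationship
    ).all()
    rel = random.choice(candidates)
    return rel.agent_1_id, rel.agent_2_id
```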