Sync private repo with this public repo (#12)
ProKil authored Jan 7, 2024
1 parent 696e07e commit 0c612b2
Showing 34 changed files with 6,010 additions and 194 deletions.
22 changes: 6 additions & 16 deletions .github/workflows/mypy.yml
@@ -1,11 +1,5 @@
name: Mypy
on:
push:
branches:
- main
pull_request:
branches:
- main
on: [push]

jobs:
Static-Type-Checking:
@@ -21,15 +15,11 @@ jobs:
python-version: 3.11.2
- name: Install dependencies
run: |
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .[dev]
curl -sSL https://install.python-poetry.org | python3
poetry install --all-extras
- name: Type-checking package with mypy
run: |
# Manually install mypy in the standard way.
pip --quiet install -U mypy
# Log this mypy version for debuggability.
mypy --version
# Run this mypy instance against our main package.
mypy --install-types --non-interactive sotopia
mypy --strict .
poetry run pip install types-protobuf==4.24.0.4
poetry run mypy --install-types --non-interactive sotopia
poetry run mypy --strict .
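For local debugging, the updated type-checking job can be reproduced outside CI. A minimal sketch, assuming Poetry is available on the PATH and the repository root is the working directory; the commands are taken directly from the workflow diff above:

```bash
# Mirror the CI steps: install Poetry, install the package with all extras, then type-check.
curl -sSL https://install.python-poetry.org | python3
poetry install --all-extras
poetry run pip install types-protobuf==4.24.0.4
poetry run mypy --install-types --non-interactive sotopia
poetry run mypy --strict .
```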
15 changes: 4 additions & 11 deletions .github/workflows/tests.yml
@@ -1,11 +1,5 @@
name: Pytest
on:
push:
branches:
- main
pull_request:
branches:
- main
on: [push]

jobs:
Pytest:
Expand All @@ -21,13 +15,12 @@ jobs:
python-version: 3.11.2
- name: Install dependencies
run: |
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .[dev]
curl -sSL https://install.python-poetry.org | python3
poetry install --all-extras
- name: Test with pytest
env: # Or as an environment variable
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
REDIS_OM_URL: ${{ secrets.REDIS_OM_URL }}
TOGETHER_API_KEY: ${{ secrets.TOGETHER_API_KEY }}
run: |
pytest
poetry run pytest
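Likewise, the test job can be approximated locally. A sketch assuming Poetry is installed and the three secrets used by the workflow are exported as environment variables; all values below are placeholders:

```bash
# Placeholder credentials: substitute the real values for the three CI secrets.
export OPENAI_API_KEY="sk-..."
export REDIS_OM_URL="redis://user:password@host:port"
export TOGETHER_API_KEY="..."
poetry install --all-extras
poetry run pytest
```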
12 changes: 9 additions & 3 deletions README.md
@@ -10,11 +12,9 @@


## Installation
This package supports Python 3.11 and above. In one line,
`pip install sotopia`.

This package supports Python 3.11 and above. We recommend using a virtual environment to install this package, e.g. with anaconda3: `conda create -n sotopia python=3.11; conda activate sotopia; conda install -c conda-forge pip`. Then, install the requirements and this package.
Or, to install from source, use a virtual environment, e.g. with anaconda3: `conda create -n sotopia python=3.11; conda activate sotopia; curl -sSL https://install.python-poetry.org | python3`. Then install the requirements and this package.
```bash
python -m pip install -r requirements.txt # make sure the packages are installed in the specific conda environment
python -m pip install -e .
poetry install
```

An OpenAI API key is required to run the code. Please set the environment variable `OPENAI_API_KEY` to your key. The recommended way is to add the key to the conda environment:
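The command itself is collapsed in this diff; a minimal sketch of what it would look like, assuming the same `conda env config vars` pattern used for `REDIS_OM_URL` below (the key value is a placeholder):

```bash
# Placeholder key: use your actual OpenAI API key.
conda env config vars set OPENAI_API_KEY="sk-..."
```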
Expand All @@ -32,6 +33,11 @@ A redis-stack server is required to run the code. Please follow the [instruction
conda env config vars set REDIS_OM_URL="redis://user:password@host:port"
```

Make a folder to store the logs:
```bash
mkdir logs
```

## Easy Sample Server
You can view an episode demo with default parameters using the following command:
```python
5 changes: 5 additions & 0 deletions docs/all_the_issues.md
@@ -0,0 +1,5 @@
# Come here if you encounter any issues

## Missing episodes

A large batch size may cause some episodes to be skipped, because the server may not be able to handle the load. Try reducing the batch size; you can also use the script `examples/fix_missing_episodes.py` to recover the missing episodes.
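A sketch of how that recovery script would be invoked; its exact options are not documented in this commit, so check its help output first:

```bash
# Inspect the available options before running (flags are not listed in this commit).
python examples/fix_missing_episodes.py --help
# Then run it, assuming it needs no extra arguments for a default recovery pass.
python examples/fix_missing_episodes.py
```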
12 changes: 12 additions & 0 deletions docs/examples.md
@@ -0,0 +1,12 @@
# Example Scripts For Using The Library

## Example 1: Evaluating existing episodes

```bash
python examples/evaluate_existing_episodes.py --tag=<tag to upload to the database> --model=<the model used to re-evaluate the existing episodes> --batch_size=<batch size used for evaluation> --push-to-db
```

Run `python examples/evaluate_existing_episodes.py --help` for more information.

## Example 2: Generate script-like episodes
See `docs/simulation_modes.md` for more information.
6 changes: 6 additions & 0 deletions docs/hyperparameters.md
@@ -0,0 +1,6 @@
# Hyperparameters that are used in the simulation

## Tags

- `TAG`: The tag of the simulation. This tag is used to identify the simulation in the database.
- `TAG_TO_CHECK_EXISTING_EPISODES`: Scripts like `examples/experiment_eval.py` check whether episodes with this tag already exist in the database. If they do, the simulation **will not** be run, to avoid running the same simulation twice. If you want to run the simulation again, change the tag or set `TAG_TO_CHECK_EXISTING_EPISODES` to `None`. A command-line sketch follows below.
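As an illustration, both tags can be set on the command line using the gin flag syntax shown in `docs/simulation_modes.md`; the flag set mirrors that example, and the tag values here are placeholders:

```bash
python examples/experiment_eval.py \
  --gin_file sotopia_conf/generation_utils_conf/generate.gin \
  --gin_file sotopia_conf/server_conf/server.gin \
  --gin_file sotopia_conf/run_async_server_in_batch.gin \
  '--gin.ENV_IDS=[]' \
  '--gin.AGENT1_MODEL="gpt-3.5-turbo"' \
  '--gin.AGENT2_MODEL="gpt-3.5-turbo"' \
  '--gin.BATCH_SIZE=5' \
  '--gin.TAG="my_experiment_v1"' \
  '--gin.TAG_TO_CHECK_EXISTING_EPISODES="my_experiment_v1"' \
  '--gin.PUSH_TO_DB=False' \
  '--gin.OMNISCIENT=False' \
  '--gin.VERBOSE=False'
```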
45 changes: 45 additions & 0 deletions docs/simulation_modes.md
@@ -0,0 +1,45 @@
# Different Modes of Simulation

## Simulation Modes

The simulation can be run in different modes. The mode is specified through the gin configuration flags passed on the command line. The following modes are available:

### Sotopia-lite

- `lite`: The simulation runs with only the characters' names, without their detailed background information. To use this mode, set `lite` to `True` in the gin configuration command.
e.g.,
```bash
python examples/experiment_eval.py \
--gin_file sotopia_conf/generation_utils_conf/generate.gin \
--gin_file sotopia_conf/server_conf/server.gin \
--gin_file sotopia_conf/run_async_server_in_batch.gin \
'--gin.ENV_IDS=[]' \
'--gin.AGENT1_MODEL="gpt-3.5-turbo"' \
'--gin.AGENT2_MODEL="gpt-3.5-turbo"' \
'--gin.BATCH_SIZE=5' \
'--gin.TAG="lite_gpt3.5_gpt3.5"' \
'--gin.TAG_TO_CHECK_EXISTING_EPISODES="lite_gpt3.5_gpt3.5"' \
'--gin.PUSH_TO_DB=False' \
'--gin.OMNISCIENT=False' \
'--gin.VERBOSE=False' \
'--gin.LITE=True'
```

### Sotopia-script

- `script`: The simulation has the LLMs generate the whole interaction in one shot, in a script-writing setting. To use this mode, set `script` to `True` in the gin configuration command.

e.g.,

```bash
python examples/generate_script.py \
--gin_file sotopia_conf/generation_utils_conf/generate.gin \
--gin_file sotopia_conf/run_async_server_in_batch_script.gin \
'--gin.ENV_IDS=[]' \
'--gin.SCRIPT_MODEL="gpt-3.5-turbo"' \
'--gin.BATCH_SIZE=5' \
'--gin.TAG="lite_script_gpt3.5_gpt3.5"' \
'--gin.TAG_TO_CHECK_EXISTING_EPISODES="lite_script_gpt3.5_gpt3.5"' \
'--gin.PUSH_TO_DB=True' \
'--gin.VERBOSE=False'
```
145 changes: 145 additions & 0 deletions examples/evaluate_existing_episode.py
@@ -0,0 +1,145 @@
import asyncio
import logging
import subprocess
import typing
from datetime import datetime
from logging import FileHandler

import gin
import typer
from experiment_eval import _iterate_env_agent_combo_not_in_db
from rich import print
from rich.logging import RichHandler
from tqdm import tqdm
from tqdm.asyncio import tqdm_asyncio

from sotopia.agents.llm_agent import Agents
from sotopia.database.logs import AnnotationForEpisode, EpisodeLog
from sotopia.database.persistent_profile import EnvironmentProfile
from sotopia.generation_utils.generate import LLM_Name, agenerate_script
from sotopia.messages.message_classes import (
    AgentAction,
    Observation,
    ScriptBackground,
)
from sotopia.samplers import (
    BaseSampler,
    ConstraintBasedSampler,
    EnvAgentCombo,
)
from sotopia.server import aevaluate_one_episode, arun_one_script

# date and message only
FORMAT = "%(asctime)s - %(levelname)s - %(name)s - %(message)s"

process = subprocess.Popen(
    ["git", "rev-parse", "HEAD"], shell=False, stdout=subprocess.PIPE
)
git_head_hash = process.communicate()[0].strip()

logging.basicConfig(
    level=15,
    format=FORMAT,
    datefmt="[%X]",
    handlers=[
        RichHandler(),
        FileHandler(
            datetime.now().strftime(
                f"./logs/%H_%M_%d_%m_%Y_{str(git_head_hash.decode('utf-8'))}.log"
            )
        ),
    ],
)
app = typer.Typer()


def run_async_server_in_batch_aevaluate(
    batch_size: int = 10,
    model: LLM_Name = "gpt-4",
    reeval_list: list[str] = [],
    tag: str | None = None,
    push_to_db: bool = False,
    verbose: bool = False,
) -> None:

    if not verbose:
        logger = logging.getLogger()
        logger.setLevel(logging.CRITICAL)
        rich_handler = logger.handlers[0]
        logger.removeHandler(rich_handler)

    episode_batch: list[EpisodeLog] = []

    while True:
        for env_pk in tqdm(reeval_list):
            episode = EpisodeLog.get(env_pk)
            episode_batch.append(episode)
            if len(episode_batch) == batch_size:
                logging.info(
                    f"Running batch of {batch_size} episodes: {episode_batch}"
                )
                episode_futures = [
                    aevaluate_one_episode(
                        episode=episode,
                        model=model,
                        tag=tag,
                        push_to_db=push_to_db,
                    )
                    for episode in episode_batch
                ]
                asyncio.run(
                    tqdm_asyncio.gather(
                        *episode_futures, desc="Running one batch"
                    )
                )

                episode_batch = []
        else:
            if episode_batch:
                logging.info(
                    f"Running batch of {batch_size} episodes: {episode_batch}"
                )
                episode_futures = [
                    aevaluate_one_episode(
                        episode=episode,
                        model=model,
                        tag=tag,
                        push_to_db=push_to_db,
                    )
                    for episode in episode_batch
                ]
                asyncio.run(
                    tqdm_asyncio.gather(
                        *episode_futures, desc="Running one batch"
                    )
                )
            return


@app.command()
def run_server(
    tag: str = "reeval_llama2",
    model: str = "togethercomputer/llama-2-70b-chat",  # Why typer does not accept LLM_Name?
    batch_size: int = 10,
    push_to_db: bool = True,
    verbose: bool = False,
) -> None:
    annotated_episodes_pks = [
        AnnotationForEpisode.get(anno).episode
        for anno in AnnotationForEpisode.all_pks()
    ]
    annotated_episodes_pks = list(set(annotated_episodes_pks))
    model = typing.cast(LLM_Name, model)
    # Call the function with the specified parameters
    run_async_server_in_batch_aevaluate(
        tag=tag,
        model=model,
        batch_size=batch_size,
        push_to_db=push_to_db,
        verbose=verbose,
        reeval_list=annotated_episodes_pks,
    )


if __name__ == "__main__":
    app()
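A usage sketch for the Typer CLI defined above. The flag spellings assume Typer's default conversion of underscores to dashes, and the tag value is a placeholder; the other values simply restate the declared defaults:

```bash
# Re-evaluate all annotated episodes with GPT-4 and write the results back to the database.
python examples/evaluate_existing_episode.py \
  --tag reeval_gpt4 \
  --model gpt-4 \
  --batch-size 10 \
  --push-to-db \
  --verbose
```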