Getting started section in README for icpp_llama2
icppWorld committed Mar 7, 2024
1 parent 3ade641 commit 5dd6c5d
Showing 5 changed files with 89 additions and 48 deletions.
6 changes: 2 additions & 4 deletions README.md
@@ -7,7 +7,9 @@

*The LLMs of this repo run in its back-end canisters.*

# Getting Started

A step-by-step guide to deploy your first LLM to the internet computer is provided in [icpp_llama2/README.md](https://github.com/icppWorld/icpp_llm/blob/main/icpp_llama2/README.md).

# The Benefits of Running LLMs On-Chain

@@ -27,10 +29,6 @@ Coherent English?](https://arxiv.org/pdf/2305.07759.pdf)
Besides the ease of use and the enhanced security, running LLMs directly on-chain also facilitates a seamless integration of tokenomics, eliminating the need to juggle a complex blend of web3 and web2 components, and I believe it will lead to a new category of Generative AI-based dApps.


## Instructions

See the README in the icpp_llama2 folder


## Support

105 changes: 74 additions & 31 deletions icpp_llama2/README.md
@@ -1,8 +1,18 @@
# [karpathy/llama2.c](https://github.com/karpathy/llama2.c) for the Internet Computer

# Instructions
# Getting Started

- Install the C++ development environment for the Internet Computer ([docs](https://docs.icpp.world/installation.html)):
- Create a python environment. (We like MiniConda, but use whatever you like!)
```bash
conda create --name myllama2 python=3.11
conda activate myllama2
```
- Clone this repo and enter the icpp_llama2 folder
```bash
git clone https://github.com/icppWorld/icpp_llm.git
cd icpp_llm/icpp_llama2
```
- Install the required python packages *(icpp-pro & ic-py)*:
```bash
pip install -r requirements.txt
@@ -16,11 +26,71 @@
sh -ci "$(curl -fsSL https://internetcomputer.org/install.sh)"
```
*(Note: On Windows, just install dfx in WSL, and icpp-pro in PowerShell will know where to find it.)*


- Get a model checkpoint, as explained in [karpathy/llama2.c](https://github.com/karpathy/llama2.c):
- Deploy the smallest pre-trained model to canister `llama2_260K`:
- Start the local network:
```bash
dfx start --clean
```
- Compile & link to WebAssembly (wasm), as defined in `icpp.toml`:
```bash
icpp build-wasm
```
- Deploy the wasm to a canister on the local network:
```bash
dfx deploy llama2_260k
```
- Check the health endpoint of the `llama2_260k` canister:
```bash
dfx canister call llama2_260k health
```
- Upload the 260k parameter model & tokenizer:
```bash
python -m scripts.upload --network local --canister llama2_260K --model stories260K/stories260K.bin --tokenizer stories260K/tok512.bin
```
- Check the readiness endpoint, which indicates that the canister can be used for inference:
```bash
dfx canister call llama2_260k ready
```

- Test it with dfx (a sketch combining the deploy & test steps into one script follows this list):
- Generate a new story, 10 tokens at a time, starting with an empty prompt:
```bash
dfx canister call llama2_260k new_chat '()'
dfx canister call llama2_260k inference '(record {prompt = "" : text; steps = 10 : nat64; temperature = 0.9 : float32; topp = 0.9 : float32; rng_seed = 0 : nat64;})'
dfx canister call llama2_260k inference '(record {prompt = "" : text; steps = 10 : nat64; temperature = 0.9 : float32; topp = 0.9 : float32; rng_seed = 0 : nat64;})'
# etc.
```
- Generate a new story, starting with a non-empty prompt:
```bash
dfx canister call llama2_260k new_chat '()'
dfx canister call llama2_260k inference '(record {prompt = "Jenny climbed in a tree" : text; steps = 10 : nat64; temperature = 0.9 : float32; topp = 0.9 : float32; rng_seed = 0 : nat64;})'
dfx canister call llama2_260k inference '(record {prompt = "" : text; steps = 10 : nat64; temperature = 0.9 : float32; topp = 0.9 : float32; rng_seed = 0 : nat64;})'
# etc.
```
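
For convenience, the deploy & test steps above can be chained into a single shell script. This is just a sketch of one possible workflow, reusing only the commands listed above; it assumes the python environment is active and that you are inside the `icpp_llm/icpp_llama2` folder:

```bash
# Sketch: one possible end-to-end local workflow, using the same commands as above.
dfx start --clean --background        # start the local network in the background

icpp build-wasm                       # compile & link to wasm, as defined in icpp.toml
dfx deploy llama2_260k                # deploy the wasm to a canister on the local network
dfx canister call llama2_260k health  # verify the canister is up

# upload the 260K parameter model & tokenizer
python -m scripts.upload --network local --canister llama2_260K \
  --model stories260K/stories260K.bin --tokenizer stories260K/tok512.bin
dfx canister call llama2_260k ready   # verify the canister is ready for inference

# generate a short story, 10 tokens at a time
dfx canister call llama2_260k new_chat '()'
dfx canister call llama2_260k inference '(record {prompt = "" : text; steps = 10 : nat64; temperature = 0.9 : float32; topp = 0.9 : float32; rng_seed = 0 : nat64;})'
```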

# Next steps

As you test the smallest pre-trained model, llama2_260k, you will quickly realize that it is not a very good model: the stories it generates are not comprehensible. This is simply because the model is too small. It serves only to verify that your build, deploy, and test pipeline is functional.

You will also notice that using dfx to generate stories is not very user friendly. We built a small frontend for generating stories, available as an open source project at https://github.com/icppWorld/icgpt and deployed to the IC as [ICGPT](https://icgpt.icpp.world/).

Some ideas for next challenges:
- Deploy the 15M parameter model
- Test out the 15M model at [ICGPT](https://icgpt.icpp.world/)
- Test the influence of `temperature` and `topp` on story generation (see the sketch after this list)
- Build your own frontend
- Train your own model and deploy it
- Study the efficiency of the LLM, and look for improvements
- etc.
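
For the `temperature` & `topp` experiment, a minimal sketch reusing the `llama2_260k` canister deployed above: lower `temperature` values make the sampling more deterministic, higher values make it more random, and `topp` restricts sampling to the smallest set of tokens whose cumulative probability exceeds that value.

```bash
# Sketch: compare story generation at two temperature settings,
# reusing the llama2_260k canister deployed above.
for temp in 0.1 0.9; do
  echo "=== temperature = $temp, topp = 0.9 ==="
  dfx canister call llama2_260k new_chat '()'
  dfx canister call llama2_260k inference "(record {prompt = \"Jenny climbed in a tree\" : text; steps = 20 : nat64; temperature = $temp : float32; topp = 0.9 : float32; rng_seed = 0 : nat64;})"
done
```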

Some further instructions are provided below.

## Deploy the 15M parameter pre-trained model

- You can get other model checkpoints, as explained in [karpathy/llama2.c](https://github.com/karpathy/llama2.c):

This command downloads the 15M parameter model that was trained on the TinyStories dataset (~60MB download) and stores it in a `models` folder:
For example, this command downloads the 15M parameter model that was trained on the TinyStories dataset (~60MB download) and stores it in a `models` folder:

```bash
# on Linux/Mac
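# (a sketch only; based on karpathy/llama2.c, the download is roughly the
#  following, but the exact command and URL here are assumptions)
mkdir -p models
wget -P models https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.bin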
@@ -45,33 +115,6 @@
![icpp_llama2_without_limits](../assets/icpp_llama2_without_limits.png)
# stories260K
The default model is `stories15M.bin`, with `tokenizer.bin`, which contains the default llama2 tokenizer using 32000 tokens.
For testing, it is nice to be able to work with a smaller model & tokenizer:
- Download the model & tokenizer from [huggingface stories260K](https://huggingface.co/karpathy/tinyllamas/tree/main/stories260K) and store them in:
- stories260K/stories260K.bin
- stories260K/tok512.bin
- stories260K/tok512.model
- Deploy the canister:
```bash
icpp build-wasm
dfx deploy
```
- Upload the model & tokenizer:
```bash
python -m scripts.upload --model stories260K/stories260K.bin --tokenizer stories260K/tok512.bin
```
- Inference is now possible with many more tokens before hitting the instruction limit, but of course, the stories are not as good:
```bash
$ dfx canister call llama2 inference '(record {prompt = "Lilly went swimming yesterday " : text; steps = 100 : nat64; temperature = 0.9 : float32; topp = 0.9 : float32; rng_seed = 0 : nat64;})'
(
variant {
ok = "Lilly went swimming yesterday order. She had a great eyes that was closed. One day, she asked her mom why the cloud was close to the pond. \n\"Mommy, I will take clothes away,\" Lila said. \"Th\n"
},
)
```
# Fine tuning
2 changes: 1 addition & 1 deletion icpp_llama2/demo.ps1
@@ -37,7 +37,7 @@ Write-Host $output -ForegroundColor Green
#######################################################################
Write-Host " "
Write-Host "--------------------------------------------------"
Write-Host "Building the wasm with wasi-sdk"
Write-Host "Building the wasm with wasi-sdk, as defined in icpp.toml"
icpp build-wasm --to-compile all
# icpp build-wasm --to-compile mine

2 changes: 1 addition & 1 deletion icpp_llama2/demo.sh
@@ -16,7 +16,7 @@ dfx start --clean --background

#######################################################################
echo "--------------------------------------------------"
echo "Building the wasm with wasi-sdk"
echo "Building the wasm with wasi-sdk, as defined in icpp.toml"
icpp build-wasm --to-compile all
# icpp build-wasm --to-compile mine

22 changes: 11 additions & 11 deletions icpp_llama2/scripts/requirements.txt
@@ -1,16 +1,16 @@
requests
pandas
pandas-stubs
jupyterlab
jupyterlab-lsp
jupyter-black
python-lsp-server[all]
# pandas
# pandas-stubs
# jupyterlab
# jupyterlab-lsp
# jupyter-black
# python-lsp-server[all]
python-dotenv
tabulate
# tabulate
black
mypy
pylint==2.13.9
matplotlib
fastparquet
openpyxl
seaborn
# matplotlib
# fastparquet
# openpyxl
# seaborn
