Commit

add/lamini test
nitya committed Apr 15, 2024
1 parent 2a8340b commit 2c741d7
Showing 6 changed files with 294 additions and 111 deletions.
3 changes: 3 additions & 0 deletions .env.sample
@@ -11,5 +11,8 @@ AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT='<add your embeddings model name here>'
## Hugging Face
HUGGING_FACE_API_KEY='<add your HuggingFace API or token here>'

## Lamini
LAMINI_API_KEY='<add your Lamini API key here>'

## GitHub Personal Access Token
30DAYSOF_PAT='<add your GitHub Personal Access Token here>'
160 changes: 160 additions & 0 deletions notebooks/400/400-00-llm-setup.ipynb
@@ -467,6 +467,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"## 3. Hugging Face\n",
"\n",
"We should use the [latest documentation](https://huggingface.co/docs) and focus initially on the [Text Generation Inference](https://huggingface.co/docs/text-generation-inference/index) capability. We can use this in two ways:\n",
@@ -831,6 +833,164 @@
")\n",
"print(\"Response:\\n \", data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"## 4. Lamini\n",
"\n",
"Lamini is an LLM platform [optimized for enterprise fine tuning](https://lamini-ai.github.io/about/).\n",
" - Explore the [Lamini SDK](https://github.com/lamini-ai/lamini-sdk/)\n",
" - Explore tools for [better inference](https://lamini-ai.github.io/inference/quick_tour/) \n",
" - Exolore tools for [better training](https://lamini-ai.github.io/training/quick_tour/)\n",
"\n",
"To get started:\n",
" - Create an account and get an API key\n",
" - Validate the key works with sample questions as shown\n",
"\n",
"Note: The free account only gives you 200 calls _total_ (no refresh) so use it wisely.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"# Install the lamini package\n",
"# !pip install --upgrade lamini\n",
"\n",
"## Configure the API key\n",
"import lamini\n",
"import os\n",
"lamini.api_key = os.getenv(\"LAMINI_API_KEY\")"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2024-04-15:03:33:11,976 INFO [lamini.py:33] Using 3.10 InferenceQueue Interface\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"status code: 513 https://api.lamini.ai/v1/completions\n"
]
},
{
"ename": "APIError",
"evalue": "API error {'detail': \"error_id: 243549526076307102879929981439376352577: Downloading the 'Intel/neural-chat-7b-v3-1' model. Please try again in a few minutes.\"}",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mHTTPError\u001b[0m Traceback (most recent call last)",
"File \u001b[0;32m~/.python/current/lib/python3.10/site-packages/lamini/api/rest_requests.py:132\u001b[0m, in \u001b[0;36mmake_web_request\u001b[0;34m(key, url, http_method, json)\u001b[0m\n\u001b[1;32m 131\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m--> 132\u001b[0m \u001b[43mresp\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mraise_for_status\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 133\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m requests\u001b[38;5;241m.\u001b[39mexceptions\u001b[38;5;241m.\u001b[39mHTTPError \u001b[38;5;28;01mas\u001b[39;00m e:\n",
"File \u001b[0;32m~/.local/lib/python3.10/site-packages/requests/models.py:1021\u001b[0m, in \u001b[0;36mResponse.raise_for_status\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 1020\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m http_error_msg:\n\u001b[0;32m-> 1021\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m HTTPError(http_error_msg, response\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m)\n",
"\u001b[0;31mHTTPError\u001b[0m: 513 Server Error: for url: https://api.lamini.ai/v1/completions",
"\nDuring handling of the above exception, another exception occurred:\n",
"\u001b[0;31mAPIError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[23], line 22\u001b[0m\n\u001b[1;32m 17\u001b[0m \u001b[38;5;66;03m## Option 1: Use a named model to get an endpoint for requests\u001b[39;00m\n\u001b[1;32m 18\u001b[0m \u001b[38;5;66;03m## Models may not be pre-loaded in HF inference service - you will then see this error, so retry:\u001b[39;00m\n\u001b[1;32m 19\u001b[0m \u001b[38;5;66;03m## Downloading the 'Intel/neural-chat-7b-v3-1' model. \u001b[39;00m\n\u001b[1;32m 20\u001b[0m \u001b[38;5;66;03m## Please try again in a few minutes.\u001b[39;00m\n\u001b[1;32m 21\u001b[0m llm \u001b[38;5;241m=\u001b[39m lamini\u001b[38;5;241m.\u001b[39mLamini(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mIntel/neural-chat-7b-v3-1\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m---> 22\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[43mllm\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mgenerate\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mHow to convert inches to centimeters? Answer in 2 sentences\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m)\n",
"File \u001b[0;32m~/.python/current/lib/python3.10/site-packages/lamini/api/lamini.py:71\u001b[0m, in \u001b[0;36mLamini.generate\u001b[0;34m(self, prompt, model_name, output_type, max_tokens, max_new_tokens, callback, metadata)\u001b[0m\n\u001b[1;32m 63\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(prompt, \u001b[38;5;28mstr\u001b[39m) \u001b[38;5;129;01mor\u001b[39;00m (\u001b[38;5;28misinstance\u001b[39m(prompt, \u001b[38;5;28mlist\u001b[39m) \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(prompt) \u001b[38;5;241m==\u001b[39m \u001b[38;5;241m1\u001b[39m):\n\u001b[1;32m 64\u001b[0m req_data \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mmake_llm_req_map(\n\u001b[1;32m 65\u001b[0m prompt\u001b[38;5;241m=\u001b[39mprompt,\n\u001b[1;32m 66\u001b[0m model_name\u001b[38;5;241m=\u001b[39mmodel_name \u001b[38;5;129;01mor\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mmodel_name,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 69\u001b[0m max_new_tokens\u001b[38;5;241m=\u001b[39mmax_new_tokens,\n\u001b[1;32m 70\u001b[0m )\n\u001b[0;32m---> 71\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mcompletion\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mgenerate\u001b[49m\u001b[43m(\u001b[49m\u001b[43mreq_data\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 72\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m output_type \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m 73\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(prompt, \u001b[38;5;28mlist\u001b[39m) \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(prompt) \u001b[38;5;241m==\u001b[39m \u001b[38;5;241m1\u001b[39m:\n",
"File \u001b[0;32m~/.python/current/lib/python3.10/site-packages/lamini/api/utils/completion.py:15\u001b[0m, in \u001b[0;36mCompletion.generate\u001b[0;34m(self, params)\u001b[0m\n\u001b[1;32m 14\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mgenerate\u001b[39m(\u001b[38;5;28mself\u001b[39m, params):\n\u001b[0;32m---> 15\u001b[0m resp \u001b[38;5;241m=\u001b[39m \u001b[43mmake_web_request\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 16\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mapi_key\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mapi_prefix\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m+\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mcompletions\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mpost\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mparams\u001b[49m\n\u001b[1;32m 17\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 18\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m resp\n",
"File \u001b[0;32m~/.python/current/lib/python3.10/site-packages/lamini/api/rest_requests.py:183\u001b[0m, in \u001b[0;36mmake_web_request\u001b[0;34m(key, url, http_method, json)\u001b[0m\n\u001b[1;32m 181\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m description \u001b[38;5;241m==\u001b[39m {\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mdetail\u001b[39m\u001b[38;5;124m\"\u001b[39m: \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m\"\u001b[39m}:\n\u001b[1;32m 182\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m APIError(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m500 Internal Server Error\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m--> 183\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m APIError(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mAPI error \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mdescription\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 185\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m resp\u001b[38;5;241m.\u001b[39mjson()\n",
"\u001b[0;31mAPIError\u001b[0m: API error {'detail': \"error_id: 243549526076307102879929981439376352577: Downloading the 'Intel/neural-chat-7b-v3-1' model. Please try again in a few minutes.\"}"
]
}
],
"source": [
"## Validate setup with a named model from Hugging Face \n",
"## By default the free-tier user has support for these base models (identified in error message)\n",
"'''\n",
"'hf-internal-testing/tiny-random-gpt2', \n",
"'EleutherAI/pythia-70m', 'EleutherAI/pythia-70m-deduped', 'EleutherAI/pythia-70m-v0', \n",
"'EleutherAI/pythia-70m-deduped-v0', 'EleutherAI/neox-ckpt-pythia-70m-deduped-v0', 'EleutherAI/neox-ckpt-pythia-70m-v1', \n",
"'EleutherAI/neox-ckpt-pythia-70m-deduped-v1', 'EleutherAI/gpt-neo-125m', 'EleutherAI/pythia-160m', \n",
"'EleutherAI/pythia-160m-deduped', 'EleutherAI/pythia-160m-deduped-v0', 'EleutherAI/neox-ckpt-pythia-70m', \n",
"'EleutherAI/neox-ckpt-pythia-160m', 'EleutherAI/neox-ckpt-pythia-160m-deduped-v1', 'EleutherAI/pythia-2.8b', \n",
"'EleutherAI/pythia-410m', 'EleutherAI/pythia-410m-v0', 'EleutherAI/pythia-410m-deduped', \n",
"'EleutherAI/pythia-410m-deduped-v0', 'EleutherAI/neox-ckpt-pythia-410m', 'EleutherAI/neox-ckpt-pythia-410m-deduped-v1', \n",
"'cerebras/Cerebras-GPT-111M', 'cerebras/Cerebras-GPT-256M', 'meta-llama/Llama-2-7b-hf', \n",
"'meta-llama/Llama-2-7b-chat-hf', 'meta-llama/Llama-2-13b-chat-hf', 'meta-llama/Llama-2-70b-chat-hf', \n",
"'Intel/neural-chat-7b-v3-1', 'mistralai/Mistral-7B-Instruct-v0.1', 'microsoft/phi-2'\n",
"'''\n",
"\n",
"## Option 1: Use a named model to get an endpoint for requests\n",
"## Models may not be pre-loaded in HF inference service - you will then see this error, so retry:\n",
"## Downloading the 'cerebras/Cerebras-GPT-111M' model. \n",
"## Please try again in a few minutes.\n",
"llm = lamini.Lamini(\"cerebras/Cerebras-GPT-111M\")\n",
"print(llm.generate(\"How to convert inches to centimeters? Answer in 2 sentences\"))"
]
},
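{
"cell_type": "markdown",
"metadata": {},
"source": [
"The 513 error above is transient: the endpoint returns it while the requested model is still being downloaded. Below is a minimal retry sketch (not part of the Lamini SDK - `generate_with_retry` is a hypothetical helper, and it matches the error loosely on its message text rather than importing a specific exception class):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"## Hypothetical retry helper for the transient 'model is downloading' (513) error\n",
"import time\n",
"\n",
"def generate_with_retry(model_name, prompt, retries=5, wait_secs=60):\n",
"    llm = lamini.Lamini(model_name)\n",
"    for attempt in range(1, retries + 1):\n",
"        try:\n",
"            return llm.generate(prompt)\n",
"        except Exception as e:\n",
"            # Only retry the 'Downloading the ... model. Please try again' case\n",
"            if \"try again\" not in str(e).lower():\n",
"                raise\n",
"            print(f\"Model still loading (attempt {attempt}/{retries}); waiting {wait_secs}s...\")\n",
"            time.sleep(wait_secs)\n",
"    raise TimeoutError(f\"{model_name} was not ready after {retries} attempts\")\n",
"\n",
"# print(generate_with_retry(\"Intel/neural-chat-7b-v3-1\", \"How to convert inches to centimeters? Answer in 2 sentences\"))"
]
},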
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2024-04-15:03:29:20,700 INFO [lamini.py:33] Using 3.10 InferenceQueue Interface\n"
]
},
{
"data": {
"text/plain": [
"' To convert inches to centimeters, you can multiply the number of inches by 2.54. For example, 1 inch is equal to 2.54 centimeters.'"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"## Option 2: Use pre-defined Mistral runner\n",
"llm = lamini.MistralRunner()\n",
"llm(\"How to convert inches to centimeters? Answer in 2 sentences\")"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2024-04-15:03:29:35,899 INFO [lamini.py:33] Using 3.10 InferenceQueue Interface\n"
]
},
{
"data": {
"text/plain": [
"' Of course! To convert inches to centimeters, you can use the following conversion factor: 1 inch = 2.54 centimeters. Therefore, if you want to convert a measurement in inches to centimeters, you can simply multiply it by 2.54.'"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"## Option 3: Use pre-defined LLama-2 runner\n",
"llama = lamini.LlamaV2Runner()\n",
"llama(\"How to convert inches to centimeters? Answer in 2 sentences\")"
]
}
],
"metadata": {
127 changes: 127 additions & 0 deletions notebooks/400/400-02-dl-fine-tuning.ipynb
@@ -0,0 +1,127 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 400.9 | Fine Tuning Large Language Models\n",
"\n",
" **This notebook is for my personal use only** - all sources are cited. If you are following the same learning journey please reference original sources instead. Key Resources used include:\n",
"1. [Fine Tuning Large Language Models](https://www.deeplearning.ai/short-courses/finetuning-large-language-models/), _DeepLearning.AI_ (2024)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"## Learn: Concepts\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### What is Fine Tuning?\n",
"\n",
"Process of turning a general-purpose pre-trained model into a speciliazed version suited for a particular task. Analogy: a general practitioner (GP) vs. a specialist (cardiologist).\n",
"\n",
"Both fine-tuning and prompt-engineering are techniques to improve the quality of a model's response to a user request but differ in cost and context:\n",
" - Prompt engineering is easier to implement, has less upfront cost\n",
" - Prompt engineering has data limitations (fewer examples), more hallucinations\n",
" - Fine-tuning is more effective but has upfront compute & data processing costs\n",
" - Fine-tuning requires high-quality data and more expertise in model training\n",
"\n",
"### Why Fine Tune?\n",
"1. More cost-effective - (per-request) frees up space used by examples, context\n",
"1. More consistent outputs - understands app requirements, response formats\n",
"1. Reduce hallucinations - grounded in relevant data, critical for enterprise\n",
"1. Improve data privacy - reduce breaches, data leakage in training\n",
"1. Better performance - reliability, lower latency, better moderation options"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"## Apply: Tasks\n",
"\n",
"> The following exercises should help walk through the entire process of fine-tuning a large language model using a specific provider and model endpoint."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"### T1: Setup Dev Environment\n",
"\n",
"To explore ideas in practice, we need access to a relevant Large Language Model (LLM) and provider-hosted endpoint (API). Use the [LLM Setup](./400-00-aoia-intro.ipynb) notebook to configure environment variables and validate setup for supported providers including:\n",
" - Open AI\n",
" - Azure Open AI\n",
" - Hugging Face\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"### T2: Lamini Example\n",
"\n",
"The example from the DeepLearning.AI course uses the following libraries:\n",
"- PyTorch (Meta) - lowest level\n",
"- Transformers (Hugging Face) - abstracts PyTorch for easier use\n",
"- Llama (Lamini) - abstracts working with LLama models\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Install the lamini package\n",
"import lamini\n",
"import os\n",
"lamini.api_key = os.getenv(\"LAMINI_API_KEY\")\n",
"\n",
"# Test the installation\n",
"from llama import BasicModelRunner\n",
"non_ft_model = BasicModelRunner(\"meta-llama/LLama-3-7b-hf\")\n",
"print(non_ft_model(\"Oh say can you see\"))"
]
},
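{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see what instruction tuning buys us, we can compare the base model's completion above with a chat-tuned variant. A minimal sketch, assuming `meta-llama/Llama-2-7b-chat-hf` (from the supported-models list in the setup notebook) is available on this account tier:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Compare against an instruction-tuned (chat) variant of the same base model\n",
"# Assumption: this chat model is available on the current account tier\n",
"ft_model = BasicModelRunner(\"meta-llama/Llama-2-7b-chat-hf\")\n",
"print(ft_model(\"Oh say can you see\"))"
]
},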
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
}
},
"nbformat": 4,
"nbformat_minor": 2
}