[Docs] Fix imgur links (#3846)
* Fix imgur links

* Remove unnecessary file

* revert
Michaelvll authored Aug 19, 2024
1 parent fffeacd commit 90de1b2
Showing 10 changed files with 19 additions and 19 deletions.
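The edit is mechanical: each embedded image URL on the bare `imgur.com` host is rewritten to the `i.imgur.com` CDN host, which serves the raw image file rather than an HTML page, so `<img>` tags and Markdown image references render correctly. As a rough sketch (not the command actually used for this commit; it assumes GNU `sed` and illustrative directory paths), such a bulk rewrite could be scripted like this:

```bash
# Hypothetical reproduction of this commit's change: rewrite imgur page
# links to the i.imgur.com image host, in place. Assumes GNU sed (-i);
# the directory list is illustrative.
grep -rl 'https://imgur\.com/' docs/ llm/ \
  | xargs sed -i 's|https://imgur\.com/|https://i.imgur.com/|g'
```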
2 changes: 1 addition & 1 deletion docs/source/examples/interactive-development.rst
@@ -110,7 +110,7 @@ This is supported by simply connecting VSCode to the cluster with the cluster na

For more details, please refer to the `VSCode documentation <https://code.visualstudio.com/docs/remote/ssh-tutorial>`__.

-.. image:: https://imgur.com/8mKfsET.gif
+.. image:: https://i.imgur.com/8mKfsET.gif
:align: center
:alt: Connect to the cluster with VSCode

4 changes: 2 additions & 2 deletions llm/codellama/README.md
@@ -10,14 +10,14 @@ The followings are the demos of Code Llama 70B hosted by SkyPilot Serve (aka Sky
## Demos
<figure>
<center>
<img src="https://imgur.com/fguAmP0.gif" width="60%" title="Coding Assistant: Connect to hosted Code Llama with Tabby in VScode" />
<img src="https://i.imgur.com/fguAmP0.gif" width="60%" title="Coding Assistant: Connect to hosted Code Llama with Tabby in VScode" />

<figcaption>Coding Assistant: Connect to hosted Code Llama with Tabby in VScode</figcaption>
</figure>

<figure>
<center>
<img src="https://imgur.com/Dor1MoE.gif" width="60%" title="Chat: Connect to hosted Code Llama with FastChat" />
<img src="https://i.imgur.com/Dor1MoE.gif" width="60%" title="Chat: Connect to hosted Code Llama with FastChat" />

<figcaption>Chat: Connect to hosted Code Llama with FastChat</figcaption>
</figure>
2 changes: 1 addition & 1 deletion llm/falcon/README.md
@@ -50,7 +50,7 @@ sky launch -c falcon -s falcon.yaml --no-use-spot

For reference, below is a loss graph you may expect to see, and the amount of time and the approximate cost of fine-tuning each of the models over 500 epochs (assuming a spot instance A100 GPU rate at $1.1 / hour and a A100-80GB rate of $1.61 / hour):

<img width="524" alt="image" src="https://imgur.com/BDlHink.png">
<img width="524" alt="image" src="https://i.imgur.com/BDlHink.png">

1. `ybelkada/falcon-7b-sharded-bf16`: 2.5 to 3 hours using 1 A100 spot GPU; total cost ≈ $3.3.

8 changes: 4 additions & 4 deletions llm/gpt-2/README.md
@@ -28,22 +28,22 @@ Run the following command to start GPT-2 (124M) training on a GPU VM with 8 A100
sky launch -c gpt2 gpt2.yaml
```

-![GPT-2 training with 8 A100 GPUs](https://imgur.com/v8SGpsF.png)
+![GPT-2 training with 8 A100 GPUs](https://i.imgur.com/v8SGpsF.png)

Or, you can train the model with a single A100, by adding `--gpus A100`:
```bash
sky launch -c gpt2 gpt2.yaml --gpus A100
```

-![GPT-2 training with a single A100](https://imgur.com/hN65g4r.png)
+![GPT-2 training with a single A100](https://i.imgur.com/hN65g4r.png)


It is also possible to speed up the training of the model on 8 H100 (2.3x more tok/s than 8x A100s):
```bash
sky launch -c gpt2 gpt2.yaml --gpus H100:8
```

-![GPT-2 training with 8 H100](https://imgur.com/STbi80b.png)
+![GPT-2 training with 8 H100](https://i.imgur.com/STbi80b.png)

### Download logs and visualizations

@@ -54,7 +54,7 @@ scp -r gpt2:~/llm.c/log124M .
We can visualize the training progress with the notebook provided in [llm.c](https://github.com/karpathy/llm.c/blob/master/dev/vislog.ipynb). (Note: we cut off the training after 10K steps, which already achieve similar validation loss as OpenAI GPT-2 checkpoint.)

<div align="center">
<img src="https://imgur.com/lskPEAQ.png" width="60%">
<img src="https://i.imgur.com/lskPEAQ.png" width="60%">
</div>

> Yes! We are able to reproduce the training of GPT-2 (124M) on any cloud with SkyPilot.
2 changes: 1 addition & 1 deletion llm/llama-2/README.md
@@ -94,6 +94,6 @@ You can also host the official FAIR model without using huggingface and gradio.
```

3. Open http://localhost:7681 in your browser and start chatting!
<img src="https://imgur.com/Ay8sDhG.png" alt="LLaMA chatbot running on the cloud via SkyPilot"/>
<img src="https://i.imgur.com/Ay8sDhG.png" alt="LLaMA chatbot running on the cloud via SkyPilot"/>


4 changes: 2 additions & 2 deletions llm/llama-3/README.md
@@ -5,7 +5,7 @@


<p align="center">
<img src="https://imgur.com/1NEZs9f.png" alt="Llama-3 x SkyPilot" style="width: 50%;">
<img src="https://i.imgur.com/1NEZs9f.png" alt="Llama-3 x SkyPilot" style="width: 50%;">
</p>

[Llama-3](https://github.com/meta-llama/llama3) is the latest top open-source LLM from Meta. It has been released with a license that authorizes commercial use. You can deploy a private Llama-3 chatbot with SkyPilot in your own cloud with just one simple command.
@@ -248,7 +248,7 @@ To use the Gradio UI, open the URL shown in the logs:


<p align="center">
<img src="https://imgur.com/zPpY2Bg.gif" alt="Gradio UI serving Llama-3" style="width: 80%;">
<img src="https://i.imgur.com/zPpY2Bg.gif" alt="Gradio UI serving Llama-3" style="width: 80%;">
</p>

To stop the instance:
6 changes: 3 additions & 3 deletions llm/llama-3_1-finetuning/readme.md
@@ -135,7 +135,7 @@ sky launch -c llama31 lora.yaml \

<figure>
<center>
<img src="https://imgur.com/B7Ib4Ii.png" width="60%" />
<img src="https://i.imgur.com/B7Ib4Ii.png" width="60%" />

<figcaption>Training Loss of LoRA finetuning Llama 3.1</figcaption>
@@ -218,10 +218,10 @@ run: |
## Appendix: Preparation
1. Request the access to [Llama 3.1 weights on huggingface](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) (Click on the blue box and follow the steps):
-![](https://imgur.com/snIQhr9.png)
+![](https://i.imgur.com/snIQhr9.png)
2. Get your [huggingface access token](https://huggingface.co/settings/tokens):
-![](https://imgur.com/3idBgHn.png)
+![](https://i.imgur.com/3idBgHn.png)
3. Add huggingface token to your environment variable:
2 changes: 1 addition & 1 deletion llm/lorax/README.md
@@ -4,7 +4,7 @@
<!-- $UNCOMMENT# LoRAX: Multi-LoRA Inference Server -->

<p align="center">
<img src="https://imgur.com/OUapRYC.png" alt="LoRAX" style="width:200px;" />
<img src="https://i.imgur.com/OUapRYC.png" alt="LoRAX" style="width:200px;" />
</p>

[LoRAX](https://github.com/predibase/lorax) (LoRA eXchange) is a framework that allows users to serve thousands of fine-tuned LLMs on a single GPU, dramatically reducing the cost of serving without compromising on throughput or latency. It works by dynamically loading multiple fine-tuned "adapters" (LoRAs, etc.) on top of a single base model at runtime. Concurrent requests for different adapters can be processed together in a single batch, allowing LoRAX to maintain near linear throughput scaling as the number of adapters increases.
6 changes: 3 additions & 3 deletions llm/vicuna-llama-2/README.md
@@ -1,6 +1,6 @@
# Train Your Own Vicuna on Llama-2

-![Vicuna-Llama-2](https://imgur.com/McZWg6z.gif "Result model in action, trained using this guide. From the SkyPilot and Vicuna teams.")
+![Vicuna-Llama-2](https://i.imgur.com/McZWg6z.gif "Result model in action, trained using this guide. From the SkyPilot and Vicuna teams.")

Meta released [Llama 2](https://ai.meta.com/llama/) two weeks ago and has made a big wave in the AI community. In our opinion, its biggest impact is that the model is now released under a [permissive license](https://github.com/facebookresearch/llama/blob/main/LICENSE) that **allows the model weights to be used commercially**[^1]. This differs from Llama 1 which cannot be used commercially.

@@ -106,7 +106,7 @@ sky launch --no-use-spot ...


<p align="center">
<img src="https://imgur.com/yVIXfQo.gif" width="100%" alt="Optimizer"/>
<img src="https://i.imgur.com/yVIXfQo.gif" width="100%" alt="Optimizer"/>
</p>

**Optional**: Try out the training for the 13B model:
@@ -139,7 +139,7 @@ sky launch -c serve serve.yaml --env MODEL_CKPT=<your-model-checkpoint>/chatbot/
```
In [serve.yaml](https://github.com/skypilot-org/skypilot/tree/master/llm/vicuna-llama-2/serve.yaml), we specified launching a Gradio server that serves the model checkpoint at `<your-model-checkpoint>/chatbot/7b`.

-![Vicuna-Llama-2](https://imgur.com/McZWg6z.gif "Serving the resulting model with Gradio.")
+![Vicuna-Llama-2](https://i.imgur.com/McZWg6z.gif "Serving the resulting model with Gradio.")


> **Tip**: You can also switch to a cheaper accelerator, such as L4, to save costs, by adding `--gpus L4` to the above command.
2 changes: 1 addition & 1 deletion llm/vllm/README.md
@@ -4,7 +4,7 @@
<!-- $UNCOMMENT# vLLM: Easy, Fast, and Cheap LLM Inference -->

<p align="center">
<img src="https://imgur.com/yxtzPEu.png" alt="vLLM"/>
<img src="https://i.imgur.com/yxtzPEu.png" alt="vLLM"/>
</p>

This README contains instructions to run a demo for vLLM, an open-source library for fast LLM inference and serving, which improves the throughput compared to HuggingFace by **up to 24x**.

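A quick spot check that the rewritten host behaves as intended (hypothetical; any image URL from the diff works) is to confirm that `i.imgur.com` returns an image content type rather than HTML:

```bash
# HEAD request: i.imgur.com should report an image/* content type.
curl -sI https://i.imgur.com/8mKfsET.gif | grep -i '^content-type'
```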