From 90de1b2564d16463a49f198199da4dc3e9540695 Mon Sep 17 00:00:00 2001
From: Zhanghao Wu
Date: Mon, 19 Aug 2024 16:50:53 -0700
Subject: [PATCH] [Docs] Fix imgur links (#3846)

* Fix imgur links

* Remove unnecessary file

* revert
---
 docs/source/examples/interactive-development.rst |  2 +-
 llm/codellama/README.md                          |  4 ++--
 llm/falcon/README.md                             |  2 +-
 llm/gpt-2/README.md                              |  8 ++++----
 llm/llama-2/README.md                            |  2 +-
 llm/llama-3/README.md                            |  4 ++--
 llm/llama-3_1-finetuning/readme.md               |  6 +++---
 llm/lorax/README.md                              |  2 +-
 llm/vicuna-llama-2/README.md                     |  6 +++---
 llm/vllm/README.md                               |  2 +-
 10 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/docs/source/examples/interactive-development.rst b/docs/source/examples/interactive-development.rst
index cc50f8e6ea8..40920934597 100644
--- a/docs/source/examples/interactive-development.rst
+++ b/docs/source/examples/interactive-development.rst
@@ -110,7 +110,7 @@ This is supported by simply connecting VSCode to the cluster with the cluster na

For more details, please refer to the `VSCode documentation <…>`__.

-.. image:: https://imgur.com/8mKfsET.gif
+.. image:: https://i.imgur.com/8mKfsET.gif
   :align: center
   :alt: Connect to the cluster with VSCode

diff --git a/llm/codellama/README.md b/llm/codellama/README.md
index 8e5025d22b5..f145fd062ff 100644
--- a/llm/codellama/README.md
+++ b/llm/codellama/README.md
@@ -10,14 +10,14 @@ The followings are the demos of Code Llama 70B hosted by SkyPilot Serve (aka Sky

## Demos
-<img src="https://imgur.com/…">
+<img src="https://i.imgur.com/…">
Coding Assistant: Connect to hosted Code Llama with Tabby in VScode
-<img src="https://imgur.com/…">
+<img src="https://i.imgur.com/…">
Chat: Connect to hosted Code Llama with FastChat
diff --git a/llm/falcon/README.md b/llm/falcon/README.md
index 837e93f5558..6eb480d9ea8 100644
--- a/llm/falcon/README.md
+++ b/llm/falcon/README.md
@@ -50,7 +50,7 @@ sky launch -c falcon -s falcon.yaml --no-use-spot

For reference, below is a loss graph you may expect to see, and the amount of time and the approximate cost of fine-tuning each of the models over 500 epochs (assuming a spot instance A100 GPU rate at $1.1 / hour and a A100-80GB rate of $1.61 / hour):

-<img alt="image" src="https://imgur.com/…">
+<img alt="image" src="https://i.imgur.com/…">

1. `ybelkada/falcon-7b-sharded-bf16`: 2.5 to 3 hours using 1 A100 spot GPU; total cost ≈ $3.3.
diff --git a/llm/gpt-2/README.md b/llm/gpt-2/README.md
index bc9893fec5b..10fa2cf6998 100644
--- a/llm/gpt-2/README.md
+++ b/llm/gpt-2/README.md
@@ -28,14 +28,14 @@ Run the following command to start GPT-2 (124M) training on a GPU VM with 8 A100
sky launch -c gpt2 gpt2.yaml
```

-![GPT-2 training with 8 A100 GPUs](https://imgur.com/v8SGpsF.png)
+![GPT-2 training with 8 A100 GPUs](https://i.imgur.com/v8SGpsF.png)

Or, you can train the model with a single A100, by adding `--gpus A100`:
```bash
sky launch -c gpt2 gpt2.yaml --gpus A100
```

-![GPT-2 training with a single A100](https://imgur.com/hN65g4r.png)
+![GPT-2 training with a single A100](https://i.imgur.com/hN65g4r.png)

It is also possible to speed up the training of the model on 8 H100 (2.3x more tok/s than 8x A100s):

@@ -43,7 +43,7 @@ It is also possible to speed up the training of the model on 8 H100 (2.3x more t
sky launch -c gpt2 gpt2.yaml --gpus H100:8
```

-![GPT-2 training with 8 H100](https://imgur.com/STbi80b.png)
+![GPT-2 training with 8 H100](https://i.imgur.com/STbi80b.png)

### Download logs and visualizations

@@ -54,7 +54,7 @@ scp -r gpt2:~/llm.c/log124M .

We can visualize the training progress with the notebook provided in [llm.c](https://github.com/karpathy/llm.c/blob/master/dev/vislog.ipynb). (Note: we cut off the training after 10K steps, which already achieve similar validation loss as OpenAI GPT-2 checkpoint.)
-<img src="https://imgur.com/…">
+<img src="https://i.imgur.com/…">
> Yes! We are able to reproduce the training of GPT-2 (124M) on any cloud with SkyPilot.
diff --git a/llm/llama-2/README.md b/llm/llama-2/README.md
index d8f8151572e..4f1a8f60cae 100644
--- a/llm/llama-2/README.md
+++ b/llm/llama-2/README.md
@@ -94,6 +94,6 @@ You can also host the official FAIR model without using huggingface and gradio.
   ```
3. Open http://localhost:7681 in your browser and start chatting!

-<img src="https://imgur.com/…" alt="LLaMA chatbot running on the cloud via SkyPilot">
+<img src="https://i.imgur.com/…" alt="LLaMA chatbot running on the cloud via SkyPilot">
diff --git a/llm/llama-3/README.md b/llm/llama-3/README.md
index d0c28dc93c6..ef19d94b5c0 100644
--- a/llm/llama-3/README.md
+++ b/llm/llama-3/README.md
@@ -5,7 +5,7 @@

-<img src="https://imgur.com/…" alt="Llama-3 x SkyPilot">
+<img src="https://i.imgur.com/…" alt="Llama-3 x SkyPilot">

[Llama-3](https://github.com/meta-llama/llama3) is the latest top open-source LLM from Meta. It has been released with a license that authorizes commercial use. You can deploy a private Llama-3 chatbot with SkyPilot in your own cloud with just one simple command.

@@ -248,7 +248,7 @@ To use the Gradio UI, open the URL shown in the logs:

-<img src="https://imgur.com/…" alt="Gradio UI serving Llama-3">
+<img src="https://i.imgur.com/…" alt="Gradio UI serving Llama-3">

To stop the instance:
diff --git a/llm/llama-3_1-finetuning/readme.md b/llm/llama-3_1-finetuning/readme.md
index 836f3bf1b3b..935dccde84e 100644
--- a/llm/llama-3_1-finetuning/readme.md
+++ b/llm/llama-3_1-finetuning/readme.md
@@ -135,7 +135,7 @@ sky launch -c llama31 lora.yaml \
-<img src="https://imgur.com/…">
+<img src="https://i.imgur.com/…">
Training Loss of LoRA finetuning Llama 3.1
@@ -218,10 +218,10 @@ run: |

## Appendix: Preparation
1. Request the access to [Llama 3.1 weights on huggingface](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) (Click on the blue box and follow the steps):
-![](https://imgur.com/snIQhr9.png)
+![](https://i.imgur.com/snIQhr9.png)

2. Get your [huggingface access token](https://huggingface.co/settings/tokens):
-![](https://imgur.com/3idBgHn.png)
+![](https://i.imgur.com/3idBgHn.png)

3. Add huggingface token to your environment variable:
diff --git a/llm/lorax/README.md b/llm/lorax/README.md
index 2fe548c92a8..6cc44cf1134 100644
--- a/llm/lorax/README.md
+++ b/llm/lorax/README.md
@@ -4,7 +4,7 @@

-<img src="https://imgur.com/…" alt="LoRAX">
+<img src="https://i.imgur.com/…" alt="LoRAX">

[LoRAX](https://github.com/predibase/lorax) (LoRA eXchange) is a framework that allows users to serve thousands of fine-tuned LLMs on a single GPU, dramatically reducing the cost of serving without compromising on throughput or latency. It works by dynamically loading multiple fine-tuned "adapters" (LoRAs, etc.) on top of a single base model at runtime. Concurrent requests for different adapters can be processed together in a single batch, allowing LoRAX to maintain near linear throughput scaling as the number of adapters increases.
diff --git a/llm/vicuna-llama-2/README.md b/llm/vicuna-llama-2/README.md
index 899792c299d..24caa525a56 100644
--- a/llm/vicuna-llama-2/README.md
+++ b/llm/vicuna-llama-2/README.md
@@ -1,6 +1,6 @@
# Train Your Own Vicuna on Llama-2

-![Vicuna-Llama-2](https://imgur.com/McZWg6z.gif "Result model in action, trained using this guide. From the SkyPilot and Vicuna teams.")
+![Vicuna-Llama-2](https://i.imgur.com/McZWg6z.gif "Result model in action, trained using this guide. From the SkyPilot and Vicuna teams.")

Meta released [Llama 2](https://ai.meta.com/llama/) two weeks ago and has made a big wave in the AI community. In our opinion, its biggest impact is that the model is now released under a [permissive license](https://github.com/facebookresearch/llama/blob/main/LICENSE) that **allows the model weights to be used commercially**[^1]. This differs from Llama 1 which cannot be used commercially.

@@ -106,7 +106,7 @@ sky launch --no-use-spot ...

-<img src="https://imgur.com/…" alt="Optimizer">
+<img src="https://i.imgur.com/…" alt="Optimizer">

**Optional**: Try out the training for the 13B model:

@@ -139,7 +139,7 @@ sky launch -c serve serve.yaml --env MODEL_CKPT=/chatbot/
```

In [serve.yaml](https://github.com/skypilot-org/skypilot/tree/master/llm/vicuna-llama-2/serve.yaml), we specified launching a Gradio server that serves the model checkpoint at `/chatbot/7b`.

-![Vicuna-Llama-2](https://imgur.com/McZWg6z.gif "Serving the resulting model with Gradio.")
+![Vicuna-Llama-2](https://i.imgur.com/McZWg6z.gif "Serving the resulting model with Gradio.")

> **Tip**: You can also switch to a cheaper accelerator, such as L4, to save costs, by adding `--gpus L4` to the above command.
diff --git a/llm/vllm/README.md b/llm/vllm/README.md
index e3a2befbecc..9fb3c0c1364 100644
--- a/llm/vllm/README.md
+++ b/llm/vllm/README.md
@@ -4,7 +4,7 @@

-<img src="https://imgur.com/…" alt="vLLM">
+<img src="https://i.imgur.com/…" alt="vLLM">

This README contains instructions to run a demo for vLLM, an open-source library for fast LLM inference and serving, which improves the throughput compared to HuggingFace by **up to 24x**.
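
All 19 replaced lines follow the same mechanical pattern: a bare `https://imgur.com/...` page link becomes a direct `https://i.imgur.com/...` CDN link. A rewrite like this can be scripted; below is a minimal sketch, assuming GNU sed and that only the `docs/` and `llm/` trees contain such links (the actual PR may have been edited by hand):

```bash
# Sketch only: rewrite imgur page links to the i.imgur.com CDN host
# in every tracked doc that contains one. Assumes GNU sed (in-place
# -i without a backup suffix) and a POSIX shell.
grep -rl 'https://imgur\.com/' docs/ llm/ \
  | xargs sed -i 's|https://imgur\.com/|https://i.imgur.com/|g'
```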