To use Visual ChatGPT you need an OpenAI API key and a billing account. This repository is based on the original TaskMatrix, with edited source code and a more detailed guide to installing and running it. It will be updated. P.S. This is a simplified guide intended for beginners.
Instructions for installing from scratch. Follow them step by step to get everything working:
- Install Python.
- Install Nvidia CUDA version 11.8.
- Install Anaconda (optional).
- Edit the file `Setup.bat` and change `set OPENAI_API_KEY={Your_Private_Openai_Key}` so that it contains your ChatGPT API key.
- Run `Setup.bat`.
- After setup finishes, run `Run.bat`.
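A forgotten or unedited key is an easy mistake to make in the steps above. The following is a minimal sketch of a pre-flight check, not part of the repository: the helper name `api_key_problem` is hypothetical, and it assumes the key reaches the program through the `OPENAI_API_KEY` environment variable set in `Setup.bat`.

```python
import os

# Placeholder text shown in Setup.bat before the user edits it.
PLACEHOLDER = "{Your_Private_Openai_Key}"


def api_key_problem(env):
    """Return a short description of what is wrong with the key, or None if it looks usable.

    `env` is any mapping of environment variables, for example os.environ.
    This only checks that a value is present; it does not contact OpenAI.
    """
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        return "OPENAI_API_KEY is not set"
    if key == PLACEHOLDER:
        return "OPENAI_API_KEY still holds the placeholder from Setup.bat"
    return None


# Report the state of the current environment.
print(api_key_problem(os.environ) or "OPENAI_API_KEY looks set")
```

The check cannot tell whether the key is valid or whether billing is enabled; it only catches the case where the variable was never filled in.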
Then, if Visual ChatGPT started successfully, you will see a message like `Running on local URL: http://0.0.0.0:7861`. This is the address you can navigate to in order to work with the AI. You may have to use `127.0.0.1` instead of `0.0.0.0`.
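The address substitution can be sketched in a few lines of Python. This is only an illustration using the standard library; the function name `browsable_url` is hypothetical and not something the project provides.

```python
from urllib.parse import urlsplit, urlunsplit


def browsable_url(printed_url):
    """Replace the 0.0.0.0 host in a printed URL with 127.0.0.1, keeping the port.

    0.0.0.0 means "listen on all interfaces" on the server side, and some
    browsers refuse to open it, so the loopback address is used instead.
    URLs with any other host are returned unchanged.
    """
    parts = urlsplit(printed_url)
    if parts.hostname != "0.0.0.0":
        return printed_url
    netloc = "127.0.0.1" if parts.port is None else f"127.0.0.1:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(browsable_url("http://0.0.0.0:7861"))  # http://127.0.0.1:7861
```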
You can also change the local IP address to one you want, or make the URL public (this requires editing `visual_chatgpt.py`).
If you run into errors, open a new issue and I will try to help you.
P.S. This is a quick guide, so it does not explain or describe the purpose of the individual AI models.
You can also use different startup parameters for different needs (just edit `Run.bat`), for example:
- For CPU users: `--load ImageCaptioning_cpu,Text2Image_cpu`
- For GPU users: `--load "ImageCaptioning_cuda:0,Text2Image_cuda:0"`
- Run on GPU with more models: `--load "Text2Box_cuda:0,Segmenting_cuda:0,Inpainting_cuda:0,ImageCaptioning_cuda:0"`

And so on.
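Each entry in a `--load` value pairs a model name with a device, joined by an underscore. The sketch below shows one way such a string could be split into a model-to-device mapping; it illustrates the format only and is not claimed to be the code in `visual_chatgpt.py`. The function name `parse_load` is hypothetical.

```python
def parse_load(spec):
    """Split 'Model_device,Model_device' into a {model: device} dictionary.

    Assumes model names contain no underscore, as in the examples above,
    so the first underscore separates the model from its device.
    """
    mapping = {}
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing comma
        model, sep, device = entry.partition("_")
        if not sep or not model or not device:
            raise ValueError(f"expected Model_device, got {entry!r}")
        mapping[model] = device
    return mapping


print(parse_load("ImageCaptioning_cuda:0,Text2Image_cuda:0"))
```

Writing the value down as a mapping like this makes it easy to see that different models can be placed on different devices, for example one on `cpu` and another on `cuda:0`.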
The table below lists the GPU memory usage of each visual foundation model, so you can choose which ones to load:
Foundation Model | GPU Memory (MB) |
---|---|
ImageEditing | 3981 |
InstructPix2Pix | 2827 |
Text2Image | 3385 |
ImageCaptioning | 1209 |
Image2Canny | 0 |
CannyText2Image | 3531 |
Image2Line | 0 |
LineText2Image | 3529 |
Image2Hed | 0 |
HedText2Image | 3529 |
Image2Scribble | 0 |
ScribbleText2Image | 3531 |
Image2Pose | 0 |
PoseText2Image | 3529 |
Image2Seg | 919 |
SegText2Image | 3529 |
Image2Depth | 0 |
DepthText2Image | 3531 |
Image2Normal | 0 |
NormalText2Image | 3529 |
VisualQuestionAnswering | 1495 |
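
The table can be used to estimate whether a chosen set of models fits on your GPU. The sketch below is a rough aid under stated assumptions: it treats the figures as additive and ignores any extra memory the framework itself needs, so the real usage may be higher. The helper names `estimate_mb` and `build_load` are hypothetical, and only a few table rows are copied in for brevity.

```python
# GPU memory in MB, copied from the table above (subset of rows).
GPU_MEMORY_MB = {
    "ImageEditing": 3981,
    "InstructPix2Pix": 2827,
    "Text2Image": 3385,
    "ImageCaptioning": 1209,
    "Image2Canny": 0,
    "CannyText2Image": 3531,
    "VisualQuestionAnswering": 1495,
}


def estimate_mb(models):
    """Sum the table figures for the given model names (assumes usage is additive)."""
    return sum(GPU_MEMORY_MB[name] for name in models)


def build_load(models, device):
    """Build the value for --load, placing every model on the same device."""
    return ",".join(f"{name}_{device}" for name in models)


chosen = ["ImageCaptioning", "Text2Image"]
print(estimate_mb(chosen))           # 4594
print(build_load(chosen, "cuda:0"))  # ImageCaptioning_cuda:0,Text2Image_cuda:0
```

If the estimate is close to the memory your card has, consider dropping a model or moving one to `cpu`.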
(c) Microsoft - https://github.com/microsoft/TaskMatrix