Prototype framework for LLM-based captioning and prompt engineering experiments for artists and designers
Select an agent and a model, then provide input to get a response from any model served by Ollama. You can provide a folder path of images for multimodal input. Create prompts and combine the text with visuals for captioning or image generation.
Use either a folder of images (1) or a single image (2) to experiment with text or multimodal LLMs. Fine-tune the text prompt output for one image, or caption whole folders, using agents in dedicated roles and user input (3). Comment (4) on the output to change the result, for example its color scheme, materials, or setting. Try limiters to adjust the prompts even further.
- Install Ollama https://ollama.com/
- Download/clone ArtAgents
- (optional) Run setupvenv.bat (you can then run ArtAgents in a venv with govenv.bat)
- (optional) Run setup.bat to set up Ollama models
- Run Ollama. In a terminal, use
ollama list
to see the full names of the models you have locally (you need to enter the full name exactly as displayed by Ollama!)
- Enter your models into models.json. Set "vision": true for multimodal LLMs
- Run ArtAgents with go.bat or govenv.bat
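As a rough illustration of the models.json step, the sketch below reads a model list and picks out the entries flagged as multimodal. The file layout (a JSON array of objects with "name" and "vision" keys), the second model name, and the helper name `vision_models` are assumptions made for this example. Only the "vision": true flag comes from the instructions above, so check the models.json shipped with ArtAgents for the actual structure.

```python
import json

# Hypothetical models.json content: the list-of-objects layout and the
# "name" key are assumptions; only the "vision" flag is from the README.
EXAMPLE_MODELS_JSON = """
[
  {"name": "llava:latest", "vision": true},
  {"name": "llama3:latest", "vision": false}
]
"""


def vision_models(models):
    """Return the names of entries flagged with "vision": true."""
    return [m["name"] for m in models if m.get("vision", False)]


models = json.loads(EXAMPLE_MODELS_JSON)
print(vision_models(models))
```

The names must match the output of `ollama list` exactly, including the tag after the colon.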
For example, after
ollama pull llava
the command
ollama list
shows the name llava:latest. The model has VISION, so put it into models.json (this model is actually already there).
- The default setting should work with http://localhost:11434/api/generate (the setting is in agent.py); ArtAgents talks to Ollama via its API
- Select an Agent depending on what scene you want to focus on
- Have fun
- If no folder path or image is provided, the prompt is created by the User and Agents from User input alone, without any image. This is meant for quick prompt sketches and for work with text-only LLMs
- The goal was to create a tool for unusual captioning style experiments and prompt engineering tests for generative models
- Working prototype
- Editable Default Agents
- Custom Agents Properties
- Custom Model Limiters and Agents (for a specific LLM)
- Use Ollama API Options (full functionality)
- API parameter fine-tuning, profiles
- Comment Input
- Chat History
- Multimodal Chat History
- Agent Training
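Since ArtAgents works through the Ollama API, it may help to see what a request to http://localhost:11434/api/generate can look like. This is a minimal sketch, not the code in agent.py: the helper names `build_payload` and `generate` are hypothetical, and the request fields (`model`, `prompt`, `stream`, `images`, `options`) follow Ollama's documented generate endpoint. Leaving out the images gives a text-only request.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # endpoint named in this README


def build_payload(model, prompt, image_paths=None, options=None):
    """Build the JSON body for a non-streaming /api/generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if image_paths:
        # Ollama expects images as base64-encoded strings.
        images = []
        for path in image_paths:
            with open(path, "rb") as f:
                images.append(base64.b64encode(f.read()).decode("ascii"))
        payload["images"] = images
    if options:
        # Model parameters, e.g. {"temperature": 0.7, "num_predict": 128}
        payload["options"] = options
    return payload


def generate(payload, url=OLLAMA_URL, timeout=120):
    """POST the payload to a running Ollama server and return the text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["response"]
```

With Ollama running and the model pulled, `generate(build_payload("llava:latest", "Describe the scene.", image_paths=["photo.jpg"], options={"temperature": 0.7}))` would return the caption text; `photo.jpg` is a placeholder path.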
sandner.art | AI/ML Articles
ArtAgents by Daniel Sandner ©2024. You may use the software or adapt it for your creative work any way you choose. No guarantees.