This repository contains a collection of AI applications built on OpenAI's GPT-4o model, which can read images, identify text and objects within them, and respond to prompts about what it finds.
- Image Recognition: The GPT-4o model can analyze images and identify various objects, text, and other elements present within them.
- Text Extraction: The model can extract and recognize text from images, making it useful for tasks like OCR (Optical Character Recognition).
- Prompt-based Actions: Users provide a prompt or instruction along with an image, and the model responds based on its analysis of the image and any text in it.
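As a rough illustration of how an image and a prompt can be sent together, the sketch below builds a request body in the shape the OpenAI Chat Completions API accepts for image input. The helper name `buildVisionRequest` and the default MIME type are assumptions made for this example; the applications in this repository may structure their requests differently.

```javascript
// Hypothetical helper, not part of this repository: pairs a text prompt
// with an image in a single Chat Completions request body.
function buildVisionRequest(imageBuffer, prompt, mimeType = "image/jpeg") {
  // The image is sent inline as a base64 data URL.
  const dataUrl = `data:${mimeType};base64,${imageBuffer.toString("base64")}`;
  return {
    model: "gpt-4o",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: dataUrl } },
        ],
      },
    ],
  };
}

// Example: a few placeholder bytes stand in for the contents of a real image file.
const exampleRequest = buildVisionRequest(
  Buffer.from([0xff, 0xd8, 0xff]),
  "Describe this image."
);
console.log(JSON.stringify(exampleRequest, null, 2));
```

The resulting object can be passed to `client.chat.completions.create(...)` from the official `openai` package, or sent as the JSON body of a POST request to the Chat Completions endpoint with the API key in the `Authorization` header.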
- Clone the repository:
git clone https://github.com/your-username/ai-apps-gpt-4o.git
- Install the required dependencies:
npm install
- Update the .env file with your OpenAI API key:
OPENAI_API_KEY=<YOUR API KEY>
- Run the desired application:
node <file_name>
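A missing or unedited key is a common reason for a first run to fail, so a small check at startup can help. The sketch below assumes the key has already been loaded into `process.env`, for example by the `dotenv` package or by running `node --env-file=.env <file_name>` on Node.js 20.6 or later. The function name `requireApiKey` is hypothetical, and the applications here may handle this differently.

```javascript
// Hypothetical startup check, not part of this repository.
function requireApiKey(env = process.env) {
  const key = (env.OPENAI_API_KEY || "").trim();
  // Treat an empty value or the unedited "<YOUR API KEY>" placeholder as not configured.
  if (!key || key.startsWith("<")) {
    throw new Error("OPENAI_API_KEY is not set. Add it to the .env file.");
  }
  return key;
}

// Example with an explicit object instead of the real environment.
const exampleKey = requireApiKey({ OPENAI_API_KEY: "sk-example" });
console.log(`Key loaded (${exampleKey.length} characters)`);
```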
This repository includes the following AI applications:
- Image Captioning: Generates descriptive captions for input images.
- Text Recognition: Extracts and recognizes text from images.
- Object Detection: Identifies and locates objects within images.
- Document Processing: Processes and extracts information from documents (e.g., invoices, receipts).
- Visual Question Answering: Answers questions based on the content of an image.
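Because all five applications send an image to the same model, the main difference between them can be the prompt. The sketch below shows one way to express that. The task names and prompt wording are assumptions made for illustration and are not taken from this repository's code.

```javascript
// Hypothetical prompt templates, one per application listed above.
const TASK_PROMPTS = {
  caption: "Write a one-sentence caption describing this image.",
  ocr: "Transcribe all text visible in this image exactly as written.",
  objects: "List the objects in this image and roughly where each one appears.",
  document: "Extract the vendor, date, line items, and total from this document as JSON.",
};

function promptFor(task, question) {
  // Visual question answering passes the user's question through unchanged.
  if (task === "vqa") {
    if (!question) throw new Error("The vqa task needs a question.");
    return question;
  }
  // Only accept task names defined directly on TASK_PROMPTS.
  if (!Object.prototype.hasOwnProperty.call(TASK_PROMPTS, task)) {
    throw new Error(`Unknown task: ${task}`);
  }
  return TASK_PROMPTS[task];
}

console.log(promptFor("caption"));
console.log(promptFor("vqa", "How many people are in the photo?"));
```

Keeping the prompts in one table makes it easy to compare wording across tasks and to add a new application without touching the request code.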
Contributions are welcome! If you have any ideas, improvements, or bug fixes, please open an issue or submit a pull request.
This project is free for everyone to use.