This repository contains a collection of computer vision (CV) and multimodal AI projects. The projects cover the following areas:
Vision Language Models and Multimodal AI
Stable Diffusion (Text-to-Image Generation)
Video Summarization
Object Tracking (Face Tracking & Lip Tracking)
Object Detection
Semantic Segmentation and Segment Anything
Real-Time Pose Estimation
The tools I became proficient with while working on these projects are:
PyTorch
TensorFlow
OpenCV
OpenVINO
Transformers
OpenAI
Diffusers