Presento: AI Presentation and Podcast Maker

Overview

This application uses the Vertex AI Agents API and the Imagen text-to-image model to generate presentations and podcasts from a user-provided topic.

Sample PDF	Sample Podcast	Try Locally
Try in Notebook

Features

Vertex Agents for Content Creation:
- Story Generation: Creates a comprehensive story related to the user's topic.
- Slide Generation: Structures the story into a basic slide deck format.
- Slide Refinement: Iteratively improves the slide deck's clarity, conciseness, and engagement.
- Image Description Generation: Creates detailed prompts for image generation tailored to each slide's content.
- JSON Conversion: Transforms the refined slide deck text into a structured JSON format for easier processing.
- Podcast Generation: Generates a podcast conversation between a host and a guest expert based on the slide content.
Imagen for Visuals: Generates relevant and engaging images for each slide based on the AI-generated descriptions.
PDF Generation: Compiles the slides, including titles, descriptions, key takeaways, and images, into a downloadable PDF presentation.
Podcast Synthesis: Synthesizes the generated podcast conversation using Google Cloud Text-to-Speech, with different voices for the host and guest.
Gradio Interface: Provides a user-friendly interface for topic input, refinement level selection, and presentation and podcast download.
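
The podcast synthesis feature depends on giving the host and the guest distinct voices. Below is a minimal sketch of how a transcript could be split into per-speaker turns before synthesis. The `Host:`/`Guest:` line prefixes, the `split_turns` helper, and the voice names are illustrative assumptions and are not taken from `main.py`.

```python
import re

# Assumed mapping from speaker label to a Text-to-Speech voice name.
# The names are placeholders; use any two voices available in your project.
VOICES = {"Host": "en-US-Neural2-D", "Guest": "en-US-Neural2-F"}

# Matches lines such as "Host: Welcome to the show."
TURN_RE = re.compile(r"^(Host|Guest):\s*(.+)$")


def split_turns(transcript: str) -> list[tuple[str, str, str]]:
    """Split a transcript into (speaker, voice, text) turns.

    Lines without a recognised speaker prefix are appended to the previous turn.
    """
    turns: list[tuple[str, str, str]] = []
    for raw in transcript.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = TURN_RE.match(line)
        if match:
            speaker, text = match.groups()
            turns.append((speaker, VOICES[speaker], text))
        elif turns:
            # Continuation line: extend the text of the most recent turn.
            speaker, voice, text = turns[-1]
            turns[-1] = (speaker, voice, f"{text} {line}")
    return turns
```

Each turn could then be sent to Google Cloud Text-to-Speech with its own voice, and the resulting audio segments concatenated in order.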

Architecture

graph LR
    A[User Input] --> B(Story Generation)
    B --> C(Slide Generation & Refinement)
    C --> D(JSON Conversion)
    D --> E(Image Description & Generation)
    E --> F(Podcast Generation)
    F --> G(PDF and Podcast Generation)

User Input: The user provides a presentation topic through the Gradio interface.
Story Generation: A Gemini agent generates a detailed story relevant to the topic.
Slide Generation & Refinement: Another agent converts the story into a slide deck, which is then refined iteratively by a refinement agent.
JSON Conversion: The refined slide deck is converted to JSON format.
Image Description & Generation: An agent creates image descriptions for each slide. Imagen then uses these descriptions to generate relevant images.
Podcast Generation: An agent generates a podcast conversation based on the slide deck content.
PDF and Podcast Generation: The application creates a PDF presentation from the slides and images and synthesizes the podcast audio using Google Cloud Text-to-Speech.
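
The text stages described above can be sketched as a chain of agent calls. This is a rough illustration under stated assumptions, not the code in `main.py`: each agent is modelled as a plain callable from prompt string to response string, and the `Slide` fields, the agent keys, and the `build_slides` function are hypothetical names.

```python
import json
from dataclasses import dataclass, field
from typing import Callable

# An "agent" here is any callable that maps a prompt string to a response string.
Agent = Callable[[str], str]


@dataclass
class Slide:
    title: str
    description: str
    key_takeaways: list[str] = field(default_factory=list)
    image_description: str = ""


def build_slides(topic: str, agents: dict[str, Agent], refinement_level: int = 1) -> list[Slide]:
    """Run the text stages of the pipeline and return structured slides."""
    story = agents["story"](topic)
    deck = agents["slides"](story)
    # Iterative refinement: feed the deck back to the refinement agent N times.
    for _ in range(refinement_level):
        deck = agents["refine"](deck)
    # The JSON agent is assumed to return a JSON array of slide objects.
    raw_slides = json.loads(agents["to_json"](deck))
    slides = [
        Slide(
            title=item["title"],
            description=item.get("description", ""),
            key_takeaways=item.get("key_takeaways", []),
        )
        for item in raw_slides
    ]
    # One image prompt per slide; the prompts would then be passed to Imagen.
    for slide in slides:
        slide.image_description = agents["image_prompt"](f"{slide.title}: {slide.description}")
    return slides
```

Passing the agents in as a dictionary keeps the sketch testable with simple stand-in functions; in the real application each entry would wrap a call to a Vertex AI agent.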

Getting Started

Set up
- Set the PROJECT_ID and LOCATION variables in the code.
- Enable the Vertex AI API and the Text-to-Speech API in your Google Cloud project.
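
As an alternative to editing the values directly in the code, the two settings could be read from environment variables. The sketch below assumes the variable names `PROJECT_ID` and `LOCATION` and a default region of `us-central1`; the `load_settings` helper is hypothetical and not part of the repository.

```python
import os
from typing import Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return project settings, failing early if PROJECT_ID is missing."""
    project_id = env.get("PROJECT_ID", "").strip()
    if not project_id:
        raise ValueError("PROJECT_ID is not set")
    # Assumed default region; override with the LOCATION variable.
    return {"project": project_id, "location": env.get("LOCATION", "us-central1")}


# With the Vertex AI SDK installed, the result can be passed straight through:
#   import vertexai
#   vertexai.init(**load_settings())
```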
Install Dependencies:
```
pip install -r requirements.txt
```
Run the Application:
```
gradio main.py 
```
Deploy to Cloud Run

gcloud run deploy presento --project <PROJECT_ID> --port 8080 --region us-central1 --min-instances 1
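
Cloud Run expects the container to listen on all interfaces at the port given by the `PORT` environment variable, which matches the `--port 8080` flag above. The following sketch shows one way the Gradio app could be configured for that; the `launch_kwargs` helper and the `demo` object are hypothetical, and the repository's `Dockerfile` may handle this differently.

```python
import os
from typing import Mapping


def launch_kwargs(env: Mapping[str, str] = os.environ) -> dict[str, object]:
    """Build keyword arguments for Gradio's launch() suitable for a container."""
    return {
        "server_name": "0.0.0.0",  # listen on all interfaces, not just localhost
        "server_port": int(env.get("PORT", "8080")),  # Cloud Run injects PORT
    }


# In the app entry point, where `demo` is the Gradio Blocks or Interface object:
#   demo.launch(**launch_kwargs())
```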

Note: the services used here are at different release stages (private preview, public preview, or GA). Refer to each service's documentation for details.

krishnaji/presento

Folders and files

Latest commit

History

Repository files navigation

Presento: AI Presentation and Podcast Maker

Overview

Features

Architecture

Getting Started

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages