☁️ Generative AI with Google Vertex AI comes with a specialized in-console studio experience, a dedicated API for Gemini, and an easy-to-use Python SDK designed for deploying and managing instances of Google's powerful language models.
⚡ Redis Enterprise offers fast, scalable vector search, with an API for index creation and management, blazing-fast search, and hybrid filtering. Coupled with its versatile data structures, Redis Enterprise shines as the optimal data layer for building high-quality Large Language Model (LLM) apps.
This repo serves as a foundational architecture for building LLM applications with Redis and GCP services.
- Primary Data Sources
- Data Extraction and Loading
- Large Language Models
  - `text-embedding-gecko@003` for embeddings
  - `gemini-1.5-flash-001` for LLM generation and chat
- High-Performance Data Layer (Redis)
- Semantic caching to improve LLM performance and reduce associated costs
- Vector search for context retrieval from knowledge base
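To make the semantic-caching idea above concrete, here is a minimal, self-contained sketch. It is not the repo's implementation: `embed` is a deterministic stand-in for a real call to `text-embedding-gecko@003`, and the in-memory entry list stands in for a Redis vector index. The idea is the same, though: reuse a cached LLM answer when a new prompt's embedding is close enough to a previously seen one.

```python
import numpy as np

def embed(text, dim=64):
    # Deterministic pseudo-embedding so the example runs offline.
    # In the real app this would be a Vertex AI embedding call.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)  # unit-normalize for cosine similarity

class SemanticCache:
    def __init__(self, threshold=0.9):
        self.threshold = threshold  # min cosine similarity for a cache hit
        self.entries = []           # list of (embedding, answer) pairs

    def get(self, prompt):
        qv = embed(prompt)
        for vec, answer in self.entries:
            # Dot product of unit vectors == cosine similarity.
            if float(np.dot(qv, vec)) >= self.threshold:
                return answer
        return None  # cache miss: caller falls through to the LLM

    def put(self, prompt, answer):
        self.entries.append((embed(prompt), answer))

cache = SemanticCache()
cache.put("What is Redis?", "Redis is an in-memory data store.")
hit = cache.get("What is Redis?")  # identical prompt → cache hit
```

In the notebook this pattern is backed by Redis vector search, so the linear scan above becomes a KNN query and the cache survives process restarts.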
Open the code tutorial using the Colab notebook to get your hands dirty with Redis and Vertex AI on GCP. It's a step-by-step walkthrough of setting up the required data, generating embeddings, and building RAG from scratch to create fast LLM apps, highlighting Redis vector search and semantic caching.
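The retrieval step of that RAG walkthrough can be sketched as follows. This is an offline toy, not the notebook's code: `embed` is a deterministic stand-in for `text-embedding-gecko@003`, the `kb` list stands in for a Redis vector index (where retrieval would be a KNN vector search), and the final prompt would be sent to `gemini-1.5-flash-001` for generation.

```python
import numpy as np

def embed(text, dim=64):
    # Deterministic pseudo-embedding so the example runs without GCP credentials.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

# Tiny in-memory "knowledge base"; the tutorial stores these in Redis.
kb = [
    "Redis supports vector similarity search with HNSW and FLAT indexes.",
    "Vertex AI exposes Gemini models through a dedicated API and Python SDK.",
    "Semantic caching stores LLM answers keyed by prompt embeddings.",
]
kb_vecs = [embed(doc) for doc in kb]

def retrieve(question, k=2):
    # Rank documents by cosine similarity to the question embedding.
    qv = embed(question)
    scored = sorted(zip(kb, kb_vecs), key=lambda p: -float(np.dot(qv, p[1])))
    return [doc for doc, _ in scored[:k]]

def build_prompt(question):
    # Stuff the top-k documents into the generation prompt (the "A" in RAG).
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("How does Redis index vectors?")
```

In the notebook, `build_prompt`'s output is what gets passed to the Gemini chat model, optionally after a semantic-cache lookup to skip generation entirely on a near-duplicate question.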