[Docs Agent] Release of Docs Agent v 0.3.2.
What's changed:

- Support the new Gemini 1.5 models in preview.
- Add new experimental CLI features to interact with Gemini models
  directly from a Linux terminal: `agent tellme` and `agent helpme`.
- Improve handling of text chunk uploads using the Semantic Retrieval API.
- Add a new chat UI feature to provide a page for viewing logs.
- Update the Google Generative AI SDK version to `0.5.0`.
- Refactor the preprocessing module (in progress).
- Remove unused code and fix type mismatch errors.
- Bug fixes.
kyolee415 committed Apr 12, 2024
1 parent 1294973 commit c8529d0
Showing 30 changed files with 3,243 additions and 1,458 deletions.
64 changes: 40 additions & 24 deletions examples/gemini/python/docs-agent/README.md
@@ -70,6 +70,22 @@ The following list summarizes the tasks and features supported by Docs Agent:
- **Run the Docs Agent CLI from anywhere in a terminal**: You can set up the
Docs Agent CLI to ask questions to the Gemini model from anywhere in a terminal.
For more information, see the [Set up Docs Agent CLI][cli-readme] page.
- **Support the Gemini 1.5 models**: You can use the new `gemini-1.5-pro-latest`
  model, together with the `text-embedding-004` embedding model, with Docs Agent today.
For the moment, the following `config.yaml` setup is recommended:

```
models:
- language_model: "models/aqa"
embedding_model: "models/text-embedding-004"
api_endpoint: "generativelanguage.googleapis.com"
...
app_mode: "1.5"
db_type: "chroma"
```

The setup above uses three Gemini models according to their strengths: AQA (`aqa`),
Gemini 1.0 Pro (`gemini-pro`), and Gemini 1.5 Pro (`gemini-1.5-pro-latest`).
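That division of labor can be sketched as a small dispatcher. The task names and the `pick_model` helper below are illustrative assumptions, not part of Docs Agent's API; the actual routing lives in `docs_agent/agents/docs_agent.py`:

```python
# Sketch of how app_mode "1.5" might route tasks across the three models.
# Task names and pick_model are illustrative assumptions, not Docs Agent API.

MODELS = {
    "retrieval": "models/aqa",                    # grounded question answering
    "chat": "models/gemini-pro",                  # general content generation
    "summarize": "models/gemini-1.5-pro-latest",  # long-context summarization
}

def pick_model(task: str, app_mode: str = "1.5") -> str:
    """Return the model name for a task, falling back to gemini-pro."""
    if app_mode != "1.5" and task == "summarize":
        # Without app_mode "1.5", summarization falls back to gemini-pro.
        return MODELS["chat"]
    return MODELS.get(task, MODELS["chat"])
```
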

For more information on Docs Agent's architecture and features,
see the [Docs Agent concepts][docs-agent-concepts] page.
@@ -113,27 +129,24 @@ Update your host machine's environment to prepare for the Docs Agent setup:
2. Install the following dependencies:

```posix-terminal
sudo apt install git pip python3-venv
sudo apt install git pipx python3-venv
```

3. Install `poetry`:

```posix-terminal
curl -sSL https://install.python-poetry.org | python3 -
pipx install poetry
```

**Important**: Make sure that `$HOME/.local/bin` is in your `PATH` variable
(for example, `export PATH=$PATH:~/.local/bin`).

4. Set the following environment variable:
4. To add `$HOME/.local/bin` to your `PATH` variable, run the following
command:

```posix-terminal
export PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring
pipx ensurepath
```

This is a [known issue][poetry-known-issue] in `poetry`.

5. Set the Google API key as an environment variable:
5. To set the Google API key as an environment variable, add the following
line to your `$HOME/.bashrc` file:

```
export GOOGLE_API_KEY=<YOUR_API_KEY_HERE>
@@ -142,8 +155,11 @@ Update your host machine's environment to prepare for the Docs Agent setup:
Replace `<YOUR_API_KEY_HERE>` with the API key to the
[Gemini API][genai-doc-site].

**Tip**: To avoid repeating these `export` lines, add them to your
`$HOME/.bashrc` file.
6. Update your environment:

```posix-terminal
source ~/.bashrc
```

### 3. (Optional) Authorize credentials for Docs Agent

@@ -256,7 +272,7 @@ Update settings in the Docs Agent project to use your custom dataset:

```
inputs:
- path: "/usr/local/home/user01/website/src"
- path: "/usr/local/home/user01/website/src/content"
url_prefix: "https://docs.flutter.dev"
```

@@ -265,30 +281,31 @@ Update settings in the Docs Agent project to use your custom dataset:

```
inputs:
- path: "/usr/local/home/user01/website/src/ui"
- path: "/usr/local/home/user01/website/src/content/ui"
url_prefix: "https://docs.flutter.dev/ui"
- path: "/usr/local/home/user01/website/src/tools"
- path: "/usr/local/home/user01/website/src/content/tools"
url_prefix: "https://docs.flutter.dev/tools"
```
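As a rough sketch of what these `path`/`url_prefix` pairs accomplish, the `to_url` helper below maps a local source file to its published URL. The helper and its exact behavior are illustrative assumptions, not Docs Agent's actual preprocessing code:

```python
# Sketch: resolve a local doc file to its published URL using input entries
# like the ones above. to_url is a hypothetical helper for illustration only.

INPUTS = [
    {"path": "/usr/local/home/user01/website/src/content/ui",
     "url_prefix": "https://docs.flutter.dev/ui"},
    {"path": "/usr/local/home/user01/website/src/content/tools",
     "url_prefix": "https://docs.flutter.dev/tools"},
]

def to_url(file_path: str) -> str:
    """Map a source file path to its site URL, dropping the .md suffix."""
    for entry in INPUTS:
        if file_path.startswith(entry["path"] + "/"):
            rel = file_path[len(entry["path"]):]
            return entry["url_prefix"] + rel.removesuffix(".md")
    raise ValueError(f"No input covers {file_path}")
```
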

6. (**Optional**) If you want to use the Gemini AQA model and populate a corpus online
via the [Semantic Retrieval API][semantic-api], use the following settings:
6. If you want to use the `gemini-pro` model with a local vector database setup
(`chroma`), use the following settings:

```
models:
- language_model: "models/aqa"
- language_model: "models/gemini-pro"
...
db_type: "google_semantic_retriever"
db_type: "chroma"
```

Or if you want to use the `gemini-pro` model with a local vector database setup
(`chroma`), use the following settings:
(**Optional**) Or if you want to use the Gemini AQA model and populate
a corpus online via the [Semantic Retrieval API][semantic-api], use the
following settings:

```
models:
- language_model: "models/gemini-pro"
- language_model: "models/aqa"
...
db_type: "chroma"
db_type: "google_semantic_retriever"
```

7. Save the `config.yaml` file and exit the text editor.
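The `db_type` choice in step 6 pairs each storage backend with a recommended language model. A minimal sketch of that pairing, assuming only the two backends shown above (the `recommended_model` helper is hypothetical):

```python
# Sketch: pair each supported db_type with the language model the README
# recommends for it. Illustrative only; the real wiring is in docs_agent.py.

RECOMMENDED = {
    "chroma": "models/gemini-pro",              # local vector database
    "google_semantic_retriever": "models/aqa",  # online corpus via Semantic Retrieval API
}

def recommended_model(db_type: str) -> str:
    """Return the recommended language model for a storage backend."""
    try:
        return RECOMMENDED[db_type]
    except KeyError:
        raise ValueError(f"Unsupported db_type: {db_type}") from None
```
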
@@ -403,7 +420,6 @@ Meggin Kearney (`@Meggin`), and Kyo Lee (`@kyolee415`).
[chroma-docs]: https://docs.trychroma.com/
[flutter-docs-src]: https://github.com/flutter/website/tree/main/src
[flutter-docs-site]: https://docs.flutter.dev/
[poetry-known-issue]: https://github.com/python-poetry/poetry/issues/1917
[apps-script-readme]: ./apps_script/README.md
[scripts-readme]: ./docs_agent/preprocess/README.md
[config-yaml]: config.yaml
13 changes: 10 additions & 3 deletions examples/gemini/python/docs-agent/apps_script/drive_to_markdown.gs
@@ -44,11 +44,19 @@ function convertDriveFolderToMDForDocsAgent(folderName) {

while (myfiles.hasNext()) {
var myfile = myfiles.next();
var ftype = myfile.getMimeType();
// If this is a shortcut, retrieve the target file
if (ftype == "application/vnd.google-apps.shortcut") {
var fid = myfile.getTargetId();
var myfile = DriveApp.getFileById(fid);
var ftype = myfile.getMimeType();
}
else{
var fid = myfile.getId();
}
var fname = sanitizeFileName(myfile.getName());
var fdate = myfile.getLastUpdated();
var furl = myfile.getUrl();
var fid = myfile.getId();
var ftype = myfile.getMimeType();
var fcreate = myfile.getDateCreated();

// Function returns an array; assign each array value to separate variables
@@ -58,7 +66,6 @@ function convertDriveFolderToMDForDocsAgent(folderName) {
var md5_backup = backup_results[1];
var mdoutput_backup_id = backup_results[2];
}

if (ftype == "application/vnd.google-apps.document") {
Logger.log("File: " + fname + " is a Google doc.");
let gdoc = DocumentApp.openById(fid);
6 changes: 4 additions & 2 deletions examples/gemini/python/docs-agent/apps_script/exportmd.gs
@@ -866,6 +866,8 @@ function processParagraph(index, element, inSrc, imageCounter, listCounters, ima
} else if (t === DocumentApp.ElementType.FOOTNOTE) {
textElements.push(' ('+element.getChild(i).getFootnoteContents().getText()+')');
// Fixes for new elements
} else if (t === DocumentApp.ElementType.EQUATION) {
textElements.push(element.getChild(i).getText());
} else if (t === DocumentApp.ElementType.DATE) {
textElements.push(' ('+element.getChild(i)+')');
} else if (t === DocumentApp.ElementType.RICH_LINK) {
@@ -875,8 +877,8 @@
} else if (t === DocumentApp.ElementType.UNSUPPORTED) {
textElements.push(' <UNSUPPORTED> ');
} else {
throw "Paragraph "+index+" of type "+element.getType()+" has an unsupported child: "
+t+" "+(element.getChild(i)["getText"] ? element.getChild(i).getText():'')+" index="+index;
Logger.log("Paragraph "+index+" of type "+element.getType()+" has an unsupported child: "
+t+" "+(element.getChild(i)["getText"] ? element.getChild(i).getText():'')+" index="+index);
}
}

77 changes: 56 additions & 21 deletions examples/gemini/python/docs-agent/docs_agent/agents/docs_agent.py
@@ -16,26 +16,21 @@

"""Docs Agent"""

import os
import sys
import typing

from absl import logging
import google.api_core
import google.ai.generativelanguage as glm
from chromadb.utils import embedding_functions

from docs_agent.storage.chroma import Chroma, Format, ChromaEnhanced
from docs_agent.models.palm import PaLM
from docs_agent.storage.chroma import ChromaEnhanced

from docs_agent.models.google_genai import Gemini

from docs_agent.utilities.config import ProductConfig, ReadConfig, Input, Models
from docs_agent.models import tokenCount
from docs_agent.utilities.config import ProductConfig, Models
from docs_agent.preprocess.splitters import markdown_splitter

from docs_agent.preprocess.splitters.markdown_splitter import Section as Section
from docs_agent.utilities.helpers import get_project_path
from docs_agent.postprocess.docs_retriever import FullPage as FullPage
from docs_agent.postprocess.docs_retriever import SectionDistance as SectionDistance
from docs_agent.postprocess.docs_retriever import (
SectionProbability as SectionProbability,
@@ -49,6 +44,8 @@ class DocsAgent:
def __init__(self, config: ProductConfig, init_chroma: bool = True):
# Models settings
self.config = config
self.embedding_model = str(self.config.models.embedding_model)
self.api_endpoint = str(self.config.models.api_endpoint)
# Use the new chroma db for all queries
# Should make a function for this or clean this behavior
if init_chroma:
@@ -62,9 +59,9 @@ def __init__(self, config: ProductConfig, init_chroma: bool = True):
)
self.collection = self.chroma.get_collection(
self.collection_name,
embedding_model=self.config.models.embedding_model,
embedding_model=self.embedding_model,
embedding_function=embedding_function_gemini_retrieval(
self.config.models.api_key
self.config.models.api_key, self.embedding_model
),
)
# AQA model settings
@@ -77,9 +74,12 @@ def __init__(self, config: ProductConfig, init_chroma: bool = True):
self.context_model = "models/gemini-pro"
gemini_model_config = Models(
language_model=self.context_model,
embedding_model="models/embedding-001",
embedding_model=self.embedding_model,
api_endpoint=self.api_endpoint,
)
self.gemini = Gemini(
models_config=gemini_model_config, conditions=config.conditions
)
self.gemini = Gemini(models_config=gemini_model_config)
# Semantic retriever
if self.config.db_type == "google_semantic_retriever":
for item in self.config.db_configs:
@@ -93,9 +93,34 @@
)
self.aqa_response_buffer = ""

if self.config.models.language_model == "models/gemini-pro":
self.gemini = Gemini(models_config=config.models)
self.context_model = "models/gemini-pro"
if self.config.models.language_model.startswith("models/gemini"):
self.gemini = Gemini(
models_config=config.models, conditions=config.conditions
)
self.context_model = self.config.models.language_model

# Always initialize the gemini-pro model for other tasks.
gemini_pro_model_config = Models(
language_model="models/gemini-pro",
embedding_model=self.embedding_model,
api_endpoint=self.api_endpoint,
)
self.gemini_pro = Gemini(
models_config=gemini_pro_model_config, conditions=config.conditions
)

if self.config.app_mode == "1.5":
# Initialize the gemini-1.5.pro model for summarization.
gemini_15_model_config = Models(
language_model="models/gemini-1.5-pro-latest",
embedding_model=self.embedding_model,
api_endpoint=self.api_endpoint,
)
self.gemini_15 = Gemini(
models_config=gemini_15_model_config, conditions=config.conditions
)
else:
self.gemini_15 = self.gemini_pro

# Use this method for talking to a Gemini content model
def ask_content_model_with_context(self, context, question):
@@ -261,7 +286,7 @@ def ask_aqa_model_using_corpora(self, question, answer_style: str = "VERBOSE"):

def ask_aqa_model(self, question):
response = ""
if self.db_type == "ONLINE_STORAGE":
if self.config.db_type == "google_semantic_retriever":
response = self.ask_aqa_model_using_corpora(question)
else:
response = self.ask_aqa_model_using_local_vector_store(question)
@@ -436,7 +461,11 @@ def query_vector_store_to_build(
# If prompt is "fact_checker" it will use the fact_check_question from
# config.yaml for the prompt
def ask_content_model_with_context_prompt(
self, context: str, question: str, prompt: str = None
self,
context: str,
question: str,
prompt: typing.Optional[str] = None,
model: typing.Optional[str] = None,
):
if prompt == None:
prompt = self.config.conditions.condition_text
@@ -447,7 +476,13 @@ def ask_content_model_with_context_prompt(
if self.config.log_level == "VERBOSE":
self.print_the_prompt(new_prompt)
try:
response = self.gemini.generate_content(contents=new_prompt)
response = ""
if model == "gemini-pro":
response = self.gemini_pro.generate_content(contents=new_prompt)
elif model == "gemini-1.5-pro":
response = self.gemini_15.generate_content(contents=new_prompt)
else:
response = self.gemini.generate_content(contents=new_prompt)
except:
return self.config.conditions.model_error_message, new_prompt
for chunk in response:
@@ -475,7 +510,7 @@ def ask_content_model_to_use_file(self, prompt: str, file: str):
# Use this method for asking a Gemini content model for fact-checking.
# This uses ask_content_model_with_context_prompt w
def ask_content_model_to_fact_check_prompt(self, context: str, prev_response: str):
question = self.fact_check_question + "\n\nText: "
question = self.config.conditions.fact_check_question + "\n\nText: "
question += prev_response
return self.ask_content_model_with_context_prompt(
context=context, question=question, prompt=""
@@ -487,7 +522,7 @@ def generate_embedding(self, text, task_type: str = "SEMANTIC_SIMILARITY"):


# Function to give an embedding function for gemini using an API key
def embedding_function_gemini_retrieval(api_key):
def embedding_function_gemini_retrieval(api_key, embedding_model: str):
return embedding_functions.GoogleGenerativeAiEmbeddingFunction(
api_key=api_key, model_name="models/embedding-001", task_type="RETRIEVAL_QUERY"
api_key=api_key, model_name=embedding_model, task_type="RETRIEVAL_QUERY"
)