[Docs Agent] Release of Docs Agent v 0.3.2.
What's changed:

- Support the new Gemini 1.5 models in preview.
- Add new experimental CLI features to interact with Gemini models
  directly from a Linux terminal: `agent tellme` and `agent helpme`.
- Improve handling of text chunk uploads using the Semantic Retrieval API.
- Add a new chat UI feature to provide a page for viewing logs.
- Update the Google Generative AI SDK version to `0.5.0`.
- Refactor the preprocessing module (in progress).
- Remove unused code and fix type mismatch errors.
- Bug fixes.
kyolee415 committed Apr 12, 2024
1 parent 1294973 commit c8529d0
Showing 30 changed files with 3,243 additions and 1,458 deletions.
64 changes: 40 additions & 24 deletions examples/gemini/python/docs-agent/README.md
@@ -70,6 +70,22 @@ The following list summarizes the tasks and features supported by Docs Agent:
- **Run the Docs Agent CLI from anywhere in a terminal**: You can set up the
Docs Agent CLI to ask questions to the Gemini model from anywhere in a terminal.
For more information, see the [Set up Docs Agent CLI][cli-readme] page.
- **Support the Gemini 1.5 models**: You can use the new `gemini-1.5-pro-latest`
  model, together with the `text-embedding-004` embedding model, with Docs Agent today.
For the moment, the following `config.yaml` setup is recommended:

```
models:
- language_model: "models/aqa"
embedding_model: "models/text-embedding-004"
api_endpoint: "generativelanguage.googleapis.com"
...
app_mode: "1.5"
db_type: "chroma"
```

The setup above uses three Gemini models according to their strengths: AQA (`aqa`),
Gemini 1.0 Pro (`gemini-pro`), and Gemini 1.5 Pro (`gemini-1.5-pro-latest`).
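That division of labor can be sketched as a small dispatcher. The task names and the `pick_model` helper below are illustrative assumptions, not part of Docs Agent's API; the actual routing lives in `docs_agent/agents/docs_agent.py`:

```python
# Sketch of how app_mode "1.5" might route tasks across the three models.
# Task names and pick_model are illustrative assumptions, not Docs Agent API.

MODELS = {
    "retrieval": "models/aqa",                    # grounded question answering
    "chat": "models/gemini-pro",                  # general content generation
    "summarize": "models/gemini-1.5-pro-latest",  # long-context summarization
}

def pick_model(task: str, app_mode: str = "1.5") -> str:
    """Return the model name for a task, falling back to gemini-pro."""
    if app_mode != "1.5" and task == "summarize":
        # Without app_mode "1.5", summarization falls back to gemini-pro.
        return MODELS["chat"]
    return MODELS.get(task, MODELS["chat"])
```
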

For more information on Docs Agent's architecture and features,
see the [Docs Agent concepts][docs-agent-concepts] page.
@@ -113,27 +129,24 @@ Update your host machine's environment to prepare for the Docs Agent setup:
2. Install the following dependencies:

```posix-terminal
sudo apt install git pip python3-venv
sudo apt install git pipx python3-venv
```

3. Install `poetry`:

```posix-terminal
curl -sSL https://install.python-poetry.org | python3 -
pipx install poetry
```

**Important**: Make sure that `$HOME/.local/bin` is in your `PATH` variable
(for example, `export PATH=$PATH:~/.local/bin`).

4. Set the following environment variable:
4. To add `$HOME/.local/bin` to your `PATH` variable, run the following
command:

```posix-terminal
export PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring
pipx ensurepath
```

This is a [known issue][poetry-known-issue] in `poetry`.

5. Set the Google API key as an environment variable:
5. To set the Google API key as an environment variable, add the following
line to your `$HOME/.bashrc` file:

```
export GOOGLE_API_KEY=<YOUR_API_KEY_HERE>
@@ -142,8 +155,11 @@ Update your host machine's environment to prepare for the Docs Agent setup:
Replace `<YOUR_API_KEY_HERE>` with the API key to the
[Gemini API][genai-doc-site].

**Tip**: To avoid repeating these `export` lines, add them to your
`$HOME/.bashrc` file.
6. Update your environment:

```posix-terminal
source ~/.bashrc
```

### 3. (Optional) Authorize credentials for Docs Agent

@@ -256,7 +272,7 @@ Update settings in the Docs Agent project to use your custom dataset:

```
inputs:
- path: "/usr/local/home/user01/website/src"
- path: "/usr/local/home/user01/website/src/content"
url_prefix: "https://docs.flutter.dev"
```

@@ -265,30 +281,31 @@ Update settings in the Docs Agent project to use your custom dataset:

```
inputs:
- path: "/usr/local/home/user01/website/src/ui"
- path: "/usr/local/home/user01/website/src/content/ui"
url_prefix: "https://docs.flutter.dev/ui"
- path: "/usr/local/home/user01/website/src/tools"
- path: "/usr/local/home/user01/website/src/content/tools"
url_prefix: "https://docs.flutter.dev/tools"
```
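As a rough sketch of what these `path`/`url_prefix` pairs accomplish, the `to_url` helper below maps a local source file to its published URL. The helper and its exact behavior are illustrative assumptions, not Docs Agent's actual preprocessing code:

```python
# Sketch: resolve a local doc file to its published URL using input entries
# like the ones above. to_url is a hypothetical helper for illustration only.

INPUTS = [
    {"path": "/usr/local/home/user01/website/src/content/ui",
     "url_prefix": "https://docs.flutter.dev/ui"},
    {"path": "/usr/local/home/user01/website/src/content/tools",
     "url_prefix": "https://docs.flutter.dev/tools"},
]

def to_url(file_path: str) -> str:
    """Map a source file path to its site URL, dropping the .md suffix."""
    for entry in INPUTS:
        if file_path.startswith(entry["path"] + "/"):
            rel = file_path[len(entry["path"]):]
            return entry["url_prefix"] + rel.removesuffix(".md")
    raise ValueError(f"No input covers {file_path}")
```
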

6. (**Optional**) If you want to use the Gemini AQA model and populate a corpus online
via the [Semantic Retrieval API][semantic-api], use the following settings:
6. If you want to use the `gemini-pro` model with a local vector database setup
(`chroma`), use the following settings:

```
models:
- language_model: "models/aqa"
- language_model: "models/gemini-pro"
...
db_type: "google_semantic_retriever"
db_type: "chroma"
```

Or if you want to use the `gemini-pro` model with a local vector database setup
(`chroma`), use the following settings:
(**Optional**) Or if you want to use the Gemini AQA model and populate
a corpus online via the [Semantic Retrieval API][semantic-api], use the
following settings:

```
models:
- language_model: "models/gemini-pro"
- language_model: "models/aqa"
...
db_type: "chroma"
db_type: "google_semantic_retriever"
```

7. Save the `config.yaml` file and exit the text editor.
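The `db_type` choice in step 6 pairs each storage backend with a recommended language model. A minimal sketch of that pairing, assuming only the two backends shown above (the `recommended_model` helper is hypothetical):

```python
# Sketch: pair each supported db_type with the language model the README
# recommends for it. Illustrative only; the real wiring is in docs_agent.py.

RECOMMENDED = {
    "chroma": "models/gemini-pro",              # local vector database
    "google_semantic_retriever": "models/aqa",  # online corpus via Semantic Retrieval API
}

def recommended_model(db_type: str) -> str:
    """Return the recommended language model for a storage backend."""
    try:
        return RECOMMENDED[db_type]
    except KeyError:
        raise ValueError(f"Unsupported db_type: {db_type}") from None
```
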
@@ -403,7 +420,6 @@ Meggin Kearney (`@Meggin`), and Kyo Lee (`@kyolee415`).
[chroma-docs]: https://docs.trychroma.com/
[flutter-docs-src]: https://github.com/flutter/website/tree/main/src
[flutter-docs-site]: https://docs.flutter.dev/
[poetry-known-issue]: https://github.com/python-poetry/poetry/issues/1917
[apps-script-readme]: ./apps_script/README.md
[scripts-readme]: ./docs_agent/preprocess/README.md
[config-yaml]: config.yaml
13 changes: 10 additions & 3 deletions examples/gemini/python/docs-agent/apps_script/drive_to_markdown.gs
@@ -44,11 +44,19 @@ function convertDriveFolderToMDForDocsAgent(folderName) {

while (myfiles.hasNext()) {
var myfile = myfiles.next();
var ftype = myfile.getMimeType();
// If this is a shortcut, retrieve the target file
if (ftype == "application/vnd.google-apps.shortcut") {
var fid = myfile.getTargetId();
var myfile = DriveApp.getFileById(fid);
var ftype = myfile.getMimeType();
}
else{
var fid = myfile.getId();
}
var fname = sanitizeFileName(myfile.getName());
var fdate = myfile.getLastUpdated();
var furl = myfile.getUrl();
var fid = myfile.getId();
var ftype = myfile.getMimeType();
var fcreate = myfile.getDateCreated();

// Function returns an array; assign each array value to separate variables
@@ -58,7 +66,6 @@ function convertDriveFolderToMDForDocsAgent(folderName) {
var md5_backup = backup_results[1];
var mdoutput_backup_id = backup_results[2];
}

if (ftype == "application/vnd.google-apps.document") {
Logger.log("File: " + fname + " is a Google doc.");
let gdoc = DocumentApp.openById(fid);
6 changes: 4 additions & 2 deletions examples/gemini/python/docs-agent/apps_script/exportmd.gs
@@ -866,6 +866,8 @@ function processParagraph(index, element, inSrc, imageCounter, listCounters, ima
} else if (t === DocumentApp.ElementType.FOOTNOTE) {
textElements.push(' ('+element.getChild(i).getFootnoteContents().getText()+')');
// Fixes for new elements
} else if (t === DocumentApp.ElementType.EQUATION) {
textElements.push(element.getChild(i).getText());
} else if (t === DocumentApp.ElementType.DATE) {
textElements.push(' ('+element.getChild(i)+')');
} else if (t === DocumentApp.ElementType.RICH_LINK) {
@@ -875,8 +877,8 @@
} else if (t === DocumentApp.ElementType.UNSUPPORTED) {
textElements.push(' <UNSUPPORTED> ');
} else {
throw "Paragraph "+index+" of type "+element.getType()+" has an unsupported child: "
+t+" "+(element.getChild(i)["getText"] ? element.getChild(i).getText():'')+" index="+index;
Logger.log("Paragraph "+index+" of type "+element.getType()+" has an unsupported child: "
+t+" "+(element.getChild(i)["getText"] ? element.getChild(i).getText():'')+" index="+index);
}
}

77 changes: 56 additions & 21 deletions examples/gemini/python/docs-agent/docs_agent/agents/docs_agent.py
@@ -16,26 +16,21 @@

"""Docs Agent"""

import os
import sys
import typing

from absl import logging
import google.api_core
import google.ai.generativelanguage as glm
from chromadb.utils import embedding_functions

from docs_agent.storage.chroma import Chroma, Format, ChromaEnhanced
from docs_agent.models.palm import PaLM
from docs_agent.storage.chroma import ChromaEnhanced

from docs_agent.models.google_genai import Gemini

from docs_agent.utilities.config import ProductConfig, ReadConfig, Input, Models
from docs_agent.models import tokenCount
from docs_agent.utilities.config import ProductConfig, Models
from docs_agent.preprocess.splitters import markdown_splitter

from docs_agent.preprocess.splitters.markdown_splitter import Section as Section
from docs_agent.utilities.helpers import get_project_path
from docs_agent.postprocess.docs_retriever import FullPage as FullPage
from docs_agent.postprocess.docs_retriever import SectionDistance as SectionDistance
from docs_agent.postprocess.docs_retriever import (
SectionProbability as SectionProbability,
@@ -49,6 +44,8 @@ class DocsAgent:
def __init__(self, config: ProductConfig, init_chroma: bool = True):
# Models settings
self.config = config
self.embedding_model = str(self.config.models.embedding_model)
self.api_endpoint = str(self.config.models.api_endpoint)
# Use the new chroma db for all queries
# Should make a function for this or clean this behavior
if init_chroma:
@@ -62,9 +59,9 @@ def __init__(self, config: ProductConfig, init_chroma: bool = True):
)
self.collection = self.chroma.get_collection(
self.collection_name,
embedding_model=self.config.models.embedding_model,
embedding_model=self.embedding_model,
embedding_function=embedding_function_gemini_retrieval(
self.config.models.api_key
self.config.models.api_key, self.embedding_model
),
)
# AQA model settings
@@ -77,9 +74,12 @@ def __init__(self, config: ProductConfig, init_chroma: bool = True):
self.context_model = "models/gemini-pro"
gemini_model_config = Models(
language_model=self.context_model,
embedding_model="models/embedding-001",
embedding_model=self.embedding_model,
api_endpoint=self.api_endpoint,
)
self.gemini = Gemini(
models_config=gemini_model_config, conditions=config.conditions
)
self.gemini = Gemini(models_config=gemini_model_config)
# Semantic retriever
if self.config.db_type == "google_semantic_retriever":
for item in self.config.db_configs:
@@ -93,9 +93,34 @@
)
self.aqa_response_buffer = ""

if self.config.models.language_model == "models/gemini-pro":
self.gemini = Gemini(models_config=config.models)
self.context_model = "models/gemini-pro"
if self.config.models.language_model.startswith("models/gemini"):
self.gemini = Gemini(
models_config=config.models, conditions=config.conditions
)
self.context_model = self.config.models.language_model

# Always initialize the gemini-pro model for other tasks.
gemini_pro_model_config = Models(
language_model="models/gemini-pro",
embedding_model=self.embedding_model,
api_endpoint=self.api_endpoint,
)
self.gemini_pro = Gemini(
models_config=gemini_pro_model_config, conditions=config.conditions
)

if self.config.app_mode == "1.5":
# Initialize the gemini-1.5.pro model for summarization.
gemini_15_model_config = Models(
language_model="models/gemini-1.5-pro-latest",
embedding_model=self.embedding_model,
api_endpoint=self.api_endpoint,
)
self.gemini_15 = Gemini(
models_config=gemini_15_model_config, conditions=config.conditions
)
else:
self.gemini_15 = self.gemini_pro

# Use this method for talking to a Gemini content model
def ask_content_model_with_context(self, context, question):
@@ -261,7 +286,7 @@ def ask_aqa_model_using_corpora(self, question, answer_style: str = "VERBOSE"):

def ask_aqa_model(self, question):
response = ""
if self.db_type == "ONLINE_STORAGE":
if self.config.db_type == "google_semantic_retriever":
response = self.ask_aqa_model_using_corpora(question)
else:
response = self.ask_aqa_model_using_local_vector_store(question)
@@ -436,7 +461,11 @@ def query_vector_store_to_build(
# If prompt is "fact_checker" it will use the fact_check_question from
# config.yaml for the prompt
def ask_content_model_with_context_prompt(
self, context: str, question: str, prompt: str = None
self,
context: str,
question: str,
prompt: typing.Optional[str] = None,
model: typing.Optional[str] = None,
):
if prompt == None:
prompt = self.config.conditions.condition_text
@@ -447,7 +476,13 @@ def ask_content_model_with_context_prompt(
if self.config.log_level == "VERBOSE":
self.print_the_prompt(new_prompt)
try:
response = self.gemini.generate_content(contents=new_prompt)
response = ""
if model == "gemini-pro":
response = self.gemini_pro.generate_content(contents=new_prompt)
elif model == "gemini-1.5-pro":
response = self.gemini_15.generate_content(contents=new_prompt)
else:
response = self.gemini.generate_content(contents=new_prompt)
except:
return self.config.conditions.model_error_message, new_prompt
for chunk in response:
@@ -475,7 +510,7 @@ def ask_content_model_to_use_file(self, prompt: str, file: str):
# Use this method for asking a Gemini content model for fact-checking.
# This uses ask_content_model_with_context_prompt w
def ask_content_model_to_fact_check_prompt(self, context: str, prev_response: str):
question = self.fact_check_question + "\n\nText: "
question = self.config.conditions.fact_check_question + "\n\nText: "
question += prev_response
return self.ask_content_model_with_context_prompt(
context=context, question=question, prompt=""
@@ -487,7 +522,7 @@ def generate_embedding(self, text, task_type: str = "SEMANTIC_SIMILARITY"):


# Function to give an embedding function for gemini using an API key
def embedding_function_gemini_retrieval(api_key):
def embedding_function_gemini_retrieval(api_key, embedding_model: str):
return embedding_functions.GoogleGenerativeAiEmbeddingFunction(
api_key=api_key, model_name="models/embedding-001", task_type="RETRIEVAL_QUERY"
api_key=api_key, model_name=embedding_model, task_type="RETRIEVAL_QUERY"
)