Bump follow-redirects from 1.15.2 to 1.15.4 in /demos/palm/web/mood-food #22

Open

wants to merge 9 commits into base: main
66 changes: 33 additions & 33 deletions demos/palm/python/docs-agent/README.md
@@ -1,7 +1,7 @@
# Docs Agent

The Docs Agent project enables [PaLM API][genai-doc-site] users to launch a chat application
on a Linux host machine using their documents as a dataset.
The Docs Agent project enables [Gemini API][genai-doc-site] (previously PaLM API) users to
launch a chat application on a Linux host machine using their documents as a dataset.

**Note**: If you want to set up and launch the Docs Agent sample app on your host machine,
check out the [Set up Docs Agent][set-up-docs-agent] section below.
@@ -15,16 +15,16 @@ can be from various sources such as Markdown, HTML, Google Docs, Gmail, PDF, etc

The main goal of the Docs Agent project is:

- You can supply your own set of documents to enable a PaLM 2 model to generate useful,
- You can supply your own set of documents to enable Google AI models to generate useful,
relevant, and accurate responses that are grounded on the documented information.

The Docs Agent sample app is designed to be easily set up and configured in a Linux environment
and is required that you have access to Google’s [PaLM API][genai-doc-site].
and requires that you have access to Google’s [Gemini API][genai-doc-site].

Keep in mind that this approach does not involve “fine-tuning” an LLM (large language model).
Instead, the Docs Agent sample app uses a mixture of prompt engineering and embedding techniques,
also known as Retrieval Augmented Generation (RAG), on top of a publicly available LLM model
like PaLM 2.
like Gemini Pro.

![Docs Agent architecture](docs/images/docs-agent-architecture-01.png)

@@ -46,7 +46,7 @@ easy to process Markdown files into embeddings. However, there is no hard requir
source documents must exist in Markdown format. What’s important is that the processed content
is available as embeddings in the vector database.
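
As a rough illustration, turning one chunk of processed text into an embedding goes through the `DocsAgent` wrapper shown later in this diff; a minimal sketch (assuming `DocsAgent` is importable from `docs_agent.py`) might look like this:

```python
from docs_agent import DocsAgent  # assumed import path for this sketch

# Sketch only: embed a single chunk of processed text so it can be stored
# in the vector database alongside its source URL metadata.
docs_agent = DocsAgent()
embedding = docs_agent.generate_embedding("A chunk of plain text from a Markdown page.")
```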

### Structure of a prompt to a PaLM 2 model
### Structure of a prompt to a language model

To enable an LLM to answer questions that are not part of the public knowledge (which the LLM
is likely trained on), the Docs Agent project applies a mixture of prompt engineering and
@@ -59,7 +59,7 @@ Once the most relevant content is returned, the Docs Agent server uses the promp
shown in Figure 3 to augment the user question with a preset **condition** and a list of
**context**. (When the Docs Agent server starts, the condition value is read from the
[`config.yaml`][config-yaml] file.) Then the Docs Agent server sends this prompt to a
PaLM 2 model using the PaLM API and receives a response generated by the model.
language model using the Gemini API and receives a response generated by the model.
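
For illustration, the assembled prompt is essentially the condition text, the retrieved context, and the user question concatenated together (the real assembly happens in `docs_agent.py` through `add_instruction_to_context` and the `ask_*_model_with_context` methods); a simplified sketch:

```python
# Simplified sketch of the prompt layout used by Docs Agent: a preset
# condition, then the retrieved context, then the user question.
# The helper name below is illustrative, not part of the codebase.
def build_prompt(condition: str, contexts: list[str], question: str) -> str:
    context_block = "\n\n".join(contexts)
    return f"{condition}\n\n{context_block}\n\nQuestion: {question}"
```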

![Docs Agent prompt structure](docs/images/docs-agent-prompt-structure-01.png)

@@ -108,15 +108,15 @@ The following list summarizes the tasks and features of the Docs Agent sample ap
relevant content given user questions (which are also processed into embeddings using
the same `embedding-gecko-001` model).
- **Add context to a user question in a prompt**: Add the list of content returned from
the semantic search as context to the user question and send the prompt to a PaLM 2
model using the PaLM API.
the semantic search as context to the user question and send the prompt to a language
model using the Gemini API.
- **(Experimental) “Fact-check” responses**: This experimental feature composes a
follow-up prompt and asks the PaLM 2 model to “fact-check” its own previous response.
(See the [Using a PaLM 2 model to fact-check its own response][fact-check-section] section.)
follow-up prompt and asks the language model to “fact-check” its own previous response.
(See the [Using a language model to fact-check its own response][fact-check-section] section.)
- **Generate 5 related questions**: In addition to displaying a response to the user
question, the web UI displays five questions generated by the PaLM 2 model based on
question, the web UI displays five questions generated by the language model based on
the context of the user question. (See the
[Using a PaLM 2 model to suggest related questions][related-questions-section] section.)
[Using a language model to suggest related questions][related-questions-section] section.)
- **Display URLs of knowledge sources**: The vector database stores URLs as metadata for
embeddings. Whenever the vector database is used to retrieve context (for instance, to
provide context to user questions), the database can also return the URLs of the sources
@@ -150,16 +150,16 @@ The following events take place in the Docs Agent sample app:
text chunks that are most relevant to the user question.
6. The Docs Agent server adds this list of text chunks as context (plus a condition
for responses) to the user question and constructs them into a prompt.
7. The system sends the prompt to a PaLM 2 model via the PaLM API.
8. The PaLM 2 model generates a response and the Docs Agent server renders it on
7. The system sends the prompt to a language model via the Gemini API.
8. The language model generates a response and the Docs Agent server renders it on
the chat UI.

Additional events for [“fact-checking” a generated response][fact-check-section]:

9. The Docs Agent server prepares another prompt that compares the generated response
(in step 8) to the context (in step 6) and asks the PaLM model to look for
(in step 8) to the context (in step 6) and asks the language model to look for
a discrepancy in the response.
10. The PaLM model generates a response that points out one major discrepancy
10. The language model generates a response that points out one major discrepancy
(if it exists) between its previous response and the context.
11. The Docs Agent server renders this response on the chat UI as a call-out note.
12. The Docs Agent server passes this second response to the vector database to
@@ -172,9 +172,9 @@ Additional events for [“fact-checking” a generated response][fact-check-sect
Additional events for
[suggesting 5 questions related to the user question][related-questions-section]:

15. The Docs Agent server prepares another prompt that asks the PaLM model to
15. The Docs Agent server prepares another prompt that asks the language model to
generate 5 questions based on the context (in step 6).
16. The PaLM model generates a response that contains a list of questions related
16. The language model generates a response that contains a list of questions related
to the context.
17. The Docs Agent server renders the questions on the chat UI.
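
As a rough code-level sketch, steps 4 through 8 above correspond to the following `DocsAgent` calls, mirroring `chatbot/chatui.py` in this change (Gemini path only; imports, the PaLM fallback, and error handling are omitted):

```python
# Sketch of steps 4-8: retrieve relevant chunks, build the prompt, ask the model.
query_result = docs_agent.query_vector_store(question)                     # steps 4-5
context = query_result.fetch_formatted(Format.CONTEXT)
context_with_instruction = docs_agent.add_instruction_to_context(context)  # step 6
response = docs_agent.ask_content_model_with_context(                      # steps 7-8
    context_with_instruction, question
)
```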

@@ -188,11 +188,11 @@ enhancing the usability of the Q&A experience powered by generative AI.
**Figure 6**. A screenshot of the Docs Agent chat UI showing the sections generated by
three distinct prompts.

### Using a PaLM 2 model to fact-check its own response
### Using a language model to fact-check its own response

In addition to using the prompt structure above (shown in Figure 3), we’re currently
experimenting with the following prompt setup for “fact-checking” responses generated
by the PaLM model:
by the language model:

- Condition:

@@ -247,18 +247,18 @@ database. Once the vector database returns a list of the most relevant content t
the UI only displays the top URL to the user.

Keep in mind that this "fact-checking" prompt setup is currently considered **experimental**
because we‘ve seen cases where a PaLM model would end up adding incorrect information into its
because we’ve seen cases where a language model would end up adding incorrect information into its
second response as well. However, we saw that adding this second response (which brings attention
to the PaLM model’s possible hallucinations) seems to improve the usability of the system since it
serves as a reminder to the users that the PaLM model‘s response is far from being perfect, which
to the language model’s possible hallucinations) seems to improve the usability of the system since it
serves as a reminder to the users that the language model’s response is far from being perfect, which
helps encourage the users to take more steps to validate generated responses for themselves.
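
In code terms, this maps onto the fact-checking methods in `docs_agent.py`; for a Gemini-based setup, a minimal sketch mirroring `chatbot/chatui.py` is:

```python
# Sketch: ask the model to check its previous response against the same context.
fact_checked_response = docs_agent.ask_content_model_to_fact_check(
    context_with_instruction, response
)
```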

### Using a PaLM 2 model to suggest related questions
### Using a language model to suggest related questions

The project’s latest web UI includes the “Related questions” section, which displays five
questions that are related to the user question (see Figure 6). These five questions are also
generated by a PaLM model (via the PaLM API). Using the list of contents returned from the vector
database as context, the system prepares another prompt asking the PaLM model to generate five
generated by a language model (via the Gemini API). Using the list of contents returned from the vector
database as context, the system prepares another prompt asking the language model to generate five
questions from the included context.
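
As a rough sketch (the condition string below is hypothetical; the actual prompt text and flow live under “PROMPT 3” in `chatbot/chatui.py`), this step might look like:

```python
# Hypothetical sketch: add a custom condition to the retrieved context and
# ask the model for five related questions.
related_condition = "Generate 5 questions related to the context below."
related_context = docs_agent.add_custom_instruction_to_context(related_condition, context)
related_questions = docs_agent.ask_content_model_with_context(related_context, question)
```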

The following is the exact structure of this prompt:
@@ -364,7 +364,7 @@ This section provides instructions on how to set up the Docs Agent project on a

This is a [known issue][poetry-known-issue] in `poetry`.

5. Set the PaLM API key as a environment variable:
5. Set the Gemini API key as an environment variable:

```
export PALM_API_KEY=<YOUR_API_KEY_HERE>
@@ -603,8 +603,8 @@ To launch the Docs Agent chat app, do the following:
already running on port 5000 on your host machine, you can use the `-p` flag to specify
a different port (for example, `poetry run ./chatbot/launch.sh -p 5050`).

**Note**: If this `poetry run ./chatbot/launch.sh` command fails to run, check the `HOSTNAME` environment
variable on your host machine (for example, `echo $HOSTNAME`). If this variable is unset, try setting it to
`localhost` by running `export HOSTNAME=localhost`.

Once the app starts running, this command prints output similar to the following:
@@ -659,14 +659,14 @@ Meggin Kearney (`@Meggin`), and Kyo Lee (`@kyolee415`).
[markdown-to-plain-text]: ./scripts/markdown_to_plain_text.py
[populate-vector-database]: ./scripts/populate_vector_database.py
[context-source-01]: http://eventhorizontelescope.org
[fact-check-section]: #using-a-palm-2-model-to-fact-check-its-own-response
[related-questions-section]: #using-a-palm-2-model-to-suggest-related-questions
[fact-check-section]: #using-a-language-model-to-fact-check-its-own-response
[related-questions-section]: #using-a-language-model-to-suggest-related-questions
[submit-a-rewrite]: #enabling-users-to-submit-a-rewrite-of-a-generated-response
[like-generate-responses]: #enabling-users-to-like-generated-responses
[populate-db-steps]: #populate-a-new-vector-database-from-markdown-files
[start-the-app-steps]: #start-the-docs-agent-chat-app
[launch-script]: ./chatbot/launch.sh
[genai-doc-site]: https://developers.generativeai.google/products/palm
[genai-doc-site]: https://ai.google.dev/docs
[chroma-docs]: https://docs.trychroma.com/
[flutter-docs-src]: https://github.com/flutter/website/tree/main/src
[flutter-docs-site]: https://docs.flutter.dev/
22 changes: 16 additions & 6 deletions demos/palm/python/docs-agent/chatbot/chatui.py
@@ -146,14 +146,24 @@ def ask_model(question):
query_result = docs_agent.query_vector_store(question)
context = query_result.fetch_formatted(Format.CONTEXT)
context_with_instruction = docs_agent.add_instruction_to_context(context)
response = docs_agent.ask_text_model_with_context(
context_with_instruction, question
)
if "gemini" in docs_agent.get_language_model_name():
response = docs_agent.ask_content_model_with_context(
context_with_instruction, question
)
else:
response = docs_agent.ask_text_model_with_context(
context_with_instruction, question
)

### PROMPT 2: FACT-CHECK THE PREVIOUS RESPONSE.
fact_checked_response = docs_agent.ask_text_model_to_fact_check(
context_with_instruction, response
)
if "gemini" in docs_agent.get_language_model_name():
fact_checked_response = docs_agent.ask_content_model_to_fact_check(
context_with_instruction, response
)
else:
fact_checked_response = docs_agent.ask_text_model_to_fact_check(
context_with_instruction, response
)

### PROMPT 3: GET 5 RELATED QUESTIONS.
# 1. Use the response from Prompt 1 as context and add a custom condition.
@@ -3,7 +3,7 @@ <h2>Question</h2>
<p>{{ question | replace("+", " ") | replace("%3F", "?")}}</p>
</div>
<div class="response-text" id="response-box">
<h2>PaLM's answer</h2>
<h2>Generated answer</h2>
<span id="palm-response">
{{ response_in_html | safe }}
</span>
1 change: 0 additions & 1 deletion demos/palm/python/docs-agent/chroma.py
@@ -76,7 +76,6 @@ def get_collection(self, name, embedding_function=None, embedding_model=None):
)
)
else:
print("Embedding model: " + str(embedding_model))
try:
palm = PaLM(embed_model=embedding_model, find_models=False)
# We cannot redefine embedding_function with def and
3 changes: 2 additions & 1 deletion demos/palm/python/docs-agent/config.yaml
@@ -23,7 +23,8 @@
# embedding_model: The PaLM embedding model used to generate embeddings.
#
api_endpoint: "generativelanguage.googleapis.com"
embedding_model: "models/embedding-gecko-001"
language_model: "models/gemini-pro"
embedding_model: "models/embedding-001"


### Docs Agent environment
62 changes: 58 additions & 4 deletions demos/palm/python/docs-agent/docs_agent.py
@@ -34,6 +34,7 @@

# Select your PaLM API endpoint.
PALM_API_ENDPOINT = "generativelanguage.googleapis.com"
LANGUAGE_MODEL = None
EMBEDDING_MODEL = None

# Set up the path to the chroma vector database.
@@ -54,13 +55,34 @@
MODEL_ERROR_MESSAGE = config_values.returnConfigValue("model_error_message")
LOG_LEVEL = config_values.returnConfigValue("log_level")
PALM_API_ENDPOINT = config_values.returnConfigValue("api_endpoint")
LANGUAGE_MODEL = config_values.returnConfigValue("language_model")
EMBEDDING_MODEL = config_values.returnConfigValue("embedding_model")

# Select the number of contents to be used for providing context.
NUM_RETURNS = 5

# Initialize the PaLM instance.
palm = PaLM(api_key=API_KEY, api_endpoint=PALM_API_ENDPOINT)
if LANGUAGE_MODEL != None and EMBEDDING_MODEL != None:
if "gemini" in LANGUAGE_MODEL:
palm = PaLM(
api_key=API_KEY,
api_endpoint=PALM_API_ENDPOINT,
content_model=LANGUAGE_MODEL,
embed_model=EMBEDDING_MODEL,
)
else:
palm = PaLM(
api_key=API_KEY,
api_endpoint=PALM_API_ENDPOINT,
text_model=LANGUAGE_MODEL,
embed_model=EMBEDDING_MODEL,
)
elif EMBEDDING_MODEL != None:
palm = PaLM(
api_key=API_KEY, api_endpoint=PALM_API_ENDPOINT, embed_model=EMBEDDING_MODEL
)
else:
palm = PaLM(api_key=API_KEY, api_endpoint=PALM_API_ENDPOINT)


class DocsAgent:
@@ -79,8 +101,11 @@ def __init__(self):
self.prompt_condition = CONDITION_TEXT
self.fact_check_question = FACT_CHECK_QUESTION
self.model_error_message = MODEL_ERROR_MESSAGE
# Models settings
self.language_model = LANGUAGE_MODEL
self.embedding_model = EMBEDDING_MODEL

# Use this method for talking to PaLM (Text)
# Use this method for talking to a PaLM text model
def ask_text_model_with_context(self, context, question):
new_prompt = f"{context}\n\nQuestion: {question}"
# Print the prompt for debugging if the log level is VERBOSE.
@@ -101,7 +126,22 @@ def ask_text_model_with_context(self, context, question):
return self.model_error_message
return response.result

# Use this method for talking to PaLM (Chat)
# Use this method for talking to a Gemini content model
def ask_content_model_with_context(self, context, question):
new_prompt = context + "\n\nQuestion: " + question
# Print the prompt for debugging if the log level is VERBOSE.
if LOG_LEVEL == "VERBOSE":
self.print_the_prompt(new_prompt)
try:
response = palm.generate_content(new_prompt)
except google.api_core.exceptions.InvalidArgument:
return self.model_error_message
for chunk in response:
if str(chunk.candidates[0].content) == "":
return self.model_error_message
return response.text

# Use this method for talking to a PaLM chat model
def ask_chat_model_with_context(self, context, question):
try:
response = palm.chat(
@@ -116,12 +156,18 @@ def ask_chat_model_with_context(self, context, question):
return self.model_error_message
return response.last

# Use this method for asking PaLM (Text) for fact-checking
# Use this method for asking a PaLM text model for fact-checking
def ask_text_model_to_fact_check(self, context, prev_response):
question = self.fact_check_question + "\n\nText: "
question += prev_response
return self.ask_text_model_with_context(context, question)

# Use this method for asking a Gemini content model for fact-checking
def ask_content_model_to_fact_check(self, context, prev_response):
question = self.fact_check_question + "\n\nText: "
question += prev_response
return self.ask_content_model_with_context(context, question)

# Query the local Chroma vector database using the user question
def query_vector_store(self, question):
return self.collection.query(question, NUM_RETURNS)
@@ -142,6 +188,14 @@ def add_custom_instruction_to_context(self, condition, context):
def generate_embedding(self, text):
return palm.embed(text)

# Get the name of the language model used in this Docs Agent setup
def get_language_model_name(self):
return self.language_model

# Get the name of the embedding model used in this Docs Agent setup
def get_embedding_model_name(self):
return self.embedding_model

# Print the prompt on the terminal for debugging
def print_the_prompt(self, prompt):
print("#########################################")