Bump jinja2 from 3.1.2 to 3.1.3 in /demos/palm/python/docs-agent #25

Open: wants to merge 9 commits into base: main
66 changes: 33 additions & 33 deletions demos/palm/python/docs-agent/README.md
@@ -1,7 +1,7 @@
 # Docs Agent

-The Docs Agent project enables [PaLM API][genai-doc-site] users to launch a chat application
-on a Linux host machine using their documents as a dataset.
+The Docs Agent project enables [Gemini API][genai-doc-site] (previously PaLM API) users to
+launch a chat application on a Linux host machine using their documents as a dataset.

 **Note**: If you want to set up and launch the Docs Agent sample app on your host machine,
 check out the [Set up Docs Agent][set-up-docs-agent] section below.
@@ -15,16 +15,16 @@ can be from various sources such as Markdown, HTML, Google Docs, Gmail, PDF, etc

 The main goal of the Docs Agent project is:

-- You can supply your own set of documents to enable a PaLM 2 model to generate useful,
+- You can supply your own set of documents to enable Google AI models to generate useful,
 relevant, and accurate responses that are grounded on the documented information.

 The Docs Agent sample app is designed to be easily set up and configured in a Linux environment
-and is required that you have access to Google’s [PaLM API][genai-doc-site].
+and is required that you have access to Google’s [Gemini API][genai-doc-site].

 Keep in mind that this approach does not involve “fine-tuning” an LLM (large language model).
 Instead, the Docs Agent sample app uses a mixture of prompt engineering and embedding techniques,
 also known as Retrieval Augmented Generation (RAG), on top of a publicly available LLM model
-like PaLM 2.
+like Gemini Pro.

 ![Docs Agent architecture](docs/images/docs-agent-architecture-01.png)

@@ -46,7 +46,7 @@ easy to process Markdown files into embeddings. However, there is no hard requir
 source documents must exist in Markdown format. What’s important is that the processed content
 is available as embeddings in the vector database.

-### Structure of a prompt to a PaLM 2 model
+### Structure of a prompt to a language model

 To enable an LLM to answer questions that are not part of the public knowledge (which the LLM
 is likely trained on), the Docs Agent project applies a mixture of prompt engineering and
@@ -59,7 +59,7 @@ Once the most relevant content is returned, the Docs Agent server uses the promp
 shown in Figure 3 to augment the user question with a preset **condition** and a list of
 **context**. (When the Docs Agent server starts, the condition value is read from the
 [`config.yaml`][config-yaml] file.) Then the Docs Agent server sends this prompt to a
-PaLM 2 model using the PaLM API and receives a response generated by the model.
+language model using the Gemini API and receives a response generated by the model.

 ![Docs Agent prompt strcture](docs/images/docs-agent-prompt-structure-01.png)
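
As a rough illustration of the prompt structure this hunk describes, the sketch below joins a condition, a list of context passages, and the user question into a single string. The `build_prompt` helper and its argument names are hypothetical. Only the `"\n\nQuestion: "` separator is taken from `docs_agent.py` in this PR, and the ordering of condition and context is an assumption, since `add_instruction_to_context()` is not shown in the diff.

```python
from typing import List


def build_prompt(condition: str, contexts: List[str], question: str) -> str:
    """Join a condition, context passages, and a question into one prompt."""
    # Assumption: the condition comes first, followed by the context passages.
    # The real add_instruction_to_context() is not visible in this diff.
    context_block = "\n\n".join(contexts)
    context_with_instruction = f"{condition}\n\n{context_block}"
    # This separator mirrors the new_prompt format used in docs_agent.py.
    return f"{context_with_instruction}\n\nQuestion: {question}"
```

Keeping the assembly in one small function would make it easy to print the exact prompt when debugging, much like the `VERBOSE` log level does in the PR.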

@@ -108,15 +108,15 @@ The following list summarizes the tasks and features of the Docs Agent sample ap
 relevant content given user questions (which are also processed into embeddings using
 the same `embedding-gecko-001` model).
 - **Add context to a user question in a prompt**: Add the list of content returned from
-the semantic search as context to the user question and send the prompt to a PaLM 2
-model using the PaLM API.
+the semantic search as context to the user question and send the prompt to a language
+model using the Gemini API.
 - **(Experimental) “Fact-check” responses**: This experimental feature composes a
-follow-up prompt and asks the PaLM 2 model to “fact-check” its own previous response.
-(See the [Using a PaLM 2 model to fact-check its own response][fact-check-section] section.)
+follow-up prompt and asks the language model to “fact-check” its own previous response.
+(See the [Using a language model to fact-check its own response][fact-check-section] section.)
 - **Generate 5 related questions**: In addition to displaying a response to the user
-question, the web UI displays five questions generated by the PaLM 2 model based on
+question, the web UI displays five questions generated by the language model based on
 the context of the user question. (See the
-[Using a PaLM 2 model to suggest related questions][related-questions-section] section.)
+[Using a language model to suggest related questions][related-questions-section] section.)
 - **Display URLs of knowledge sources**: The vector database stores URLs as metadata for
 embeddings. Whenever the vector database is used to retrieve context (for instance, to
 provide context to user questions), the database can also return the URLs of the sources
@@ -150,16 +150,16 @@ The following events take place in the Docs Agent sample app:
 text chunks that are most relevant to the user question.
 6. The Docs Agent server adds this list of text chunks as context (plus a condition
 for responses) to the user question and constructs them into a prompt.
-7. The system sends the prompt to a PaLM 2 model via the PaLM API.
-8. The PaLM 2 model generates a response and the Docs Agent server renders it on
+7. The system sends the prompt to a language model via the Gemini API.
+8. The language model generates a response and the Docs Agent server renders it on
 the chat UI.

 Additional events for [“fact-checking” a generated response][fact-check-section]:

 9. The Docs Agent server prepares another prompt that compares the generated response
-(in step 8) to the context (in step 6) and asks the PaLM model to look for
+(in step 8) to the context (in step 6) and asks the language model to look for
 a discrepancy in the response.
-10. The PaLM model generates a response that points out one major discrepancy
+10. The language model generates a response that points out one major discrepancy
 (if it exists) between its previous response and the context.
 11. The Docs Agent server renders this response on the chat UI as a call-out note.
 12. The Docs Agent server passes this second response to the vector database to
@@ -172,9 +172,9 @@ Additional events for [“fact-checking” a generated response][fact-check-sect
 Additional events for
 [suggesting 5 questions related to the user question][related-questions-section]:

-15. The Docs Agent server prepares another prompt that asks the PaLM model to
+15. The Docs Agent server prepares another prompt that asks the language model to
 generate 5 questions based on the context (in step 6).
-16. The PaLM model generates a response that contains a list of questions related
+16. The language model generates a response that contains a list of questions related
 to the context.
 17. The Docs Agent server renders the questions on the chat UI.
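
The numbered events above amount to three prompts sent to the same model. The following is a minimal sketch of that flow, assuming a caller-supplied `ask` function that takes a prompt string and returns the model's text. `run_three_prompts`, `related_condition`, and the returned dictionary keys are hypothetical names for illustration and do not appear in the PR.

```python
from typing import Callable, Dict, List


def run_three_prompts(
    ask: Callable[[str], str],
    condition: str,
    fact_check_question: str,
    related_condition: str,
    contexts: List[str],
    question: str,
) -> Dict[str, str]:
    """Send the answer, fact-check, and related-questions prompts in order."""
    joined = "\n\n".join(contexts)
    context = f"{condition}\n\n{joined}"

    # Prompt 1 (steps 6-8): answer the user question using the context.
    response = ask(f"{context}\n\nQuestion: {question}")

    # Prompt 2 (steps 9-10): compare the first response against the context.
    fact_check = ask(
        f"{context}\n\nQuestion: {fact_check_question}\n\nText: {response}"
    )

    # Prompt 3 (steps 15-16): ask for related questions about the same context.
    # Assumption: a separate condition replaces the one used for answering.
    related = ask(f"{related_condition}\n\n{joined}")

    return {
        "response": response,
        "fact_check": fact_check,
        "related_questions": related,
    }
```

Passing `ask` in as a parameter keeps the sketch independent of whether a PaLM text model or a Gemini content model sits behind it.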

@@ -188,11 +188,11 @@ enhancing the usability of the Q&A experience powered by generative AI.
 **Figure 6**. A screenshot of the Docs Agent chat UI showing the sections generated by
 three distinct prompts.

-### Using a PaLM 2 model to fact-check its own response
+### Using a language model to fact-check its own response

 In addition to using the prompt structure above (shown in Figure 3), we‘re currently
 experimenting with the following prompt setup for “fact-checking” responses generated
-by the PaLM model:
+by the language model:

 - Condition:

@@ -247,18 +247,18 @@ database. Once the vector database returns a list of the most relevant content t
 the UI only displays the top URL to the user.

 Keep in mind that this "fact-checking" prompt setup is currently considered **experimental**
-because we‘ve seen cases where a PaLM model would end up adding incorrect information into its
+because we‘ve seen cases where a language model would end up adding incorrect information into its
 second response as well. However, we saw that adding this second response (which brings attention
-to the PaLM model’s possible hallucinations) seems to improve the usability of the system since it
-serves as a reminder to the users that the PaLM model‘s response is far from being perfect, which
+to the language model’s possible hallucinations) seems to improve the usability of the system since it
+serves as a reminder to the users that the language model‘s response is far from being perfect, which
 helps encourage the users to take more steps to validate generated responses for themselves.

-### Using a PaLM 2 model to suggest related questions
+### Using a language model to suggest related questions

 The project‘s latest web UI includes the “Related questions” section, which displays five
 questions that are related to the user question (see Figure 6). These five questions are also
-generated by a PaLM model (via the PaLM API). Using the list of contents returned from the vector
-database as context, the system prepares another prompt asking the PaLM model to generate five
+generated by a language model (via the Gemini API). Using the list of contents returned from the vector
+database as context, the system prepares another prompt asking the language model to generate five
 questions from the included context.

 The following is the exact structure of this prompt:
@@ -364,7 +364,7 @@ This section provides instructions on how to set up the Docs Agent project on a

 This is a [known issue][poetry-known-issue] in `poetry`.

-5. Set the PaLM API key as a environment variable:
+5. Set the Gemini API key as a environment variable:

 ```
 export PALM_API_KEY=<YOUR_API_KEY_HERE>
@@ -603,8 +603,8 @@ To launch the Docs Agent chat app, do the following:
 already running on port 5000 on your host machine, you can use the `-p` flag to specify
 a different port (for example, `poetry run ./chatbot/launch.sh -p 5050`).

-**Note**: If this `poetry run ./chatbot/launch.sh` command fails to run, check the `HOSTNAME` environment
-variable on your host machine (for example, `echo $HOSTNAME`). If this variable is unset, try setting it to
+**Note**: If this `poetry run ./chatbot/launch.sh` command fails to run, check the `HOSTNAME` environment
+variable on your host machine (for example, `echo $HOSTNAME`). If this variable is unset, try setting it to
 `localhost` by running `export HOSTNAME=localhost`.

 Once the app starts running, this command prints output similar to the following:
@@ -659,14 +659,14 @@ Meggin Kearney (`@Meggin`), and Kyo Lee (`@kyolee415`).
 [markdown-to-plain-text]: ./scripts/markdown_to_plain_text.py
 [populate-vector-database]: ./scripts/populate_vector_database.py
 [context-source-01]: http://eventhorizontelescope.org
-[fact-check-section]: #using-a-palm-2-model-to-fact-check-its-own-response
-[related-questions-section]: #using-a-palm-2-model-to-suggest-related-questions
+[fact-check-section]: #using-a-language-model-to-fact-check-its-own-response
+[related-questions-section]: #using-a-language-model-to-suggest-related-questions
 [submit-a-rewrite]: #enabling-users-to-submit-a-rewrite-of-a-generated-response
 [like-generate-responses]: #enabling-users-to-like-generated-responses
 [populate-db-steps]: #populate-a-new-vector-database-from-markdown-files
 [start-the-app-steps]: #start-the-docs-agent-chat-app
 [launch-script]: ./chatbot/launch.sh
-[genai-doc-site]: https://developers.generativeai.google/products/palm
+[genai-doc-site]: https://ai.google.dev/docs
 [chroma-docs]: https://docs.trychroma.com/
 [flutter-docs-src]: https://github.com/flutter/website/tree/main/src
 [flutter-docs-site]: https://docs.flutter.dev/
22 changes: 16 additions & 6 deletions demos/palm/python/docs-agent/chatbot/chatui.py
@@ -146,14 +146,24 @@ def ask_model(question):
     query_result = docs_agent.query_vector_store(question)
     context = query_result.fetch_formatted(Format.CONTEXT)
     context_with_instruction = docs_agent.add_instruction_to_context(context)
-    response = docs_agent.ask_text_model_with_context(
-        context_with_instruction, question
-    )
+    if "gemini" in docs_agent.get_language_model_name():
+        response = docs_agent.ask_content_model_with_context(
+            context_with_instruction, question
+        )
+    else:
+        response = docs_agent.ask_text_model_with_context(
+            context_with_instruction, question
+        )

     ### PROMPT 2: FACT-CHECK THE PREVIOUS RESPONSE.
-    fact_checked_response = docs_agent.ask_text_model_to_fact_check(
-        context_with_instruction, response
-    )
+    if "gemini" in docs_agent.get_language_model_name():
+        fact_checked_response = docs_agent.ask_content_model_to_fact_check(
+            context_with_instruction, response
+        )
+    else:
+        fact_checked_response = docs_agent.ask_text_model_to_fact_check(
+            context_with_instruction, response
+        )

     ### PROMPT 3: GET 5 RELATED QUESTIONS.
     # 1. Use the response from Prompt 1 as context and add a custom condition.
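
The hunk above repeats the same `"gemini" in ...` test before each prompt. One possible alternative, sketched below, is to pick both methods once and reuse them. `pick_model_methods` and `StubAgent` are hypothetical names: the stub only mirrors the `DocsAgent` method names that appear in this PR and returns canned strings instead of calling any API. The `or ""` guard is also an addition, because `"gemini" in None` raises a `TypeError` when no language model is configured.

```python
class StubAgent:
    """Hypothetical stand-in for DocsAgent, just enough to run the sketch."""

    def __init__(self, language_model):
        self.language_model = language_model

    def get_language_model_name(self):
        return self.language_model

    def ask_content_model_with_context(self, context, question):
        return "content:" + question

    def ask_text_model_with_context(self, context, question):
        return "text:" + question

    def ask_content_model_to_fact_check(self, context, prev_response):
        return "content-check:" + prev_response

    def ask_text_model_to_fact_check(self, context, prev_response):
        return "text-check:" + prev_response


def pick_model_methods(agent):
    """Return the (ask, fact_check) bound methods for the configured model."""
    # Same substring test as chatui.py, guarded against an unset model name.
    name = agent.get_language_model_name() or ""
    if "gemini" in name:
        return (
            agent.ask_content_model_with_context,
            agent.ask_content_model_to_fact_check,
        )
    return agent.ask_text_model_with_context, agent.ask_text_model_to_fact_check


ask, fact_check = pick_model_methods(StubAgent("models/gemini-pro"))
response = ask("some context", "What is Docs Agent?")
fact_checked_response = fact_check("some context", response)
```

With this shape, supporting another model family would mean changing one function rather than every call site.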
@@ -3,7 +3,7 @@ <h2>Question</h2>
 <p>{{ question | replace("+", " ") | replace("%3F", "?")}}</p>
 </div>
 <div class="response-text" id="response-box">
-<h2>PaLM's answer</h2>
+<h2>Generated answer</h2>
 <span id="palm-response">
 {{ response_in_html | safe }}
 </span>
1 change: 0 additions & 1 deletion demos/palm/python/docs-agent/chroma.py
@@ -76,7 +76,6 @@ def get_collection(self, name, embedding_function=None, embedding_model=None):
                 )
             )
         else:
-            print("Embedding model: " + str(embedding_model))
             try:
                 palm = PaLM(embed_model=embedding_model, find_models=False)
                 # We cannot redefine embedding_function with def and
3 changes: 2 additions & 1 deletion demos/palm/python/docs-agent/config.yaml
@@ -23,7 +23,8 @@
 # embedding_model: The PaLM embedding model used to generate embeddings.
 #
 api_endpoint: "generativelanguage.googleapis.com"
-embedding_model: "models/embedding-gecko-001"
+language_model: "models/gemini-pro"
+embedding_model: "models/embedding-001"


 ### Docs Agent environment
62 changes: 58 additions & 4 deletions demos/palm/python/docs-agent/docs_agent.py
@@ -34,6 +34,7 @@

 # Select your PaLM API endpoint.
 PALM_API_ENDPOINT = "generativelanguage.googleapis.com"
+LANGUAGE_MODEL = None
 EMBEDDING_MODEL = None

 # Set up the path to the chroma vector database.
@@ -54,13 +55,34 @@
 MODEL_ERROR_MESSAGE = config_values.returnConfigValue("model_error_message")
 LOG_LEVEL = config_values.returnConfigValue("log_level")
 PALM_API_ENDPOINT = config_values.returnConfigValue("api_endpoint")
+LANGUAGE_MODEL = config_values.returnConfigValue("language_model")
 EMBEDDING_MODEL = config_values.returnConfigValue("embedding_model")

 # Select the number of contents to be used for providing context.
 NUM_RETURNS = 5

 # Initialize the PaLM instance.
-palm = PaLM(api_key=API_KEY, api_endpoint=PALM_API_ENDPOINT)
+if LANGUAGE_MODEL != None and EMBEDDING_MODEL != None:
+    if "gemini" in LANGUAGE_MODEL:
+        palm = PaLM(
+            api_key=API_KEY,
+            api_endpoint=PALM_API_ENDPOINT,
+            content_model=LANGUAGE_MODEL,
+            embed_model=EMBEDDING_MODEL,
+        )
+    else:
+        palm = PaLM(
+            api_key=API_KEY,
+            api_endpoint=PALM_API_ENDPOINT,
+            text_model=LANGUAGE_MODEL,
+            embed_model=EMBEDDING_MODEL,
+        )
+elif EMBEDDING_MODEL != None:
+    palm = PaLM(
+        api_key=API_KEY, api_endpoint=PALM_API_ENDPOINT, embed_model=EMBEDDING_MODEL
+    )
+else:
+    palm = PaLM(api_key=API_KEY, api_endpoint=PALM_API_ENDPOINT)


 class DocsAgent:
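
The four `PaLM(...)` calls in the hunk above differ only in which optional keyword arguments they pass. As a sketch of one way to express the same branching, the hypothetical helper below builds the keyword arguments as a dictionary that could then be passed as `PaLM(**kwargs)`. The keyword names (`content_model`, `text_model`, `embed_model`) are copied from the PR, while `build_palm_kwargs` itself is an assumption for illustration. It also uses `is not None`, the comparison PEP 8 recommends over `!= None`.

```python
def build_palm_kwargs(api_key, api_endpoint, language_model=None, embedding_model=None):
    """Build keyword arguments that mirror the branching in docs_agent.py."""
    kwargs = {"api_key": api_key, "api_endpoint": api_endpoint}
    if embedding_model is not None:
        kwargs["embed_model"] = embedding_model
        # As in the PR, the language model is only passed along when an
        # embedding model is configured too.
        if language_model is not None:
            key = "content_model" if "gemini" in language_model else "text_model"
            kwargs[key] = language_model
    return kwargs


# Values taken from the config.yaml hunk in this PR; the API key is a placeholder.
kwargs = build_palm_kwargs(
    api_key="<YOUR_API_KEY_HERE>",
    api_endpoint="generativelanguage.googleapis.com",
    language_model="models/gemini-pro",
    embedding_model="models/embedding-001",
)
```

A helper like this can be checked without network access, which is harder to do with module-level initialization.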
@@ -79,8 +101,11 @@ def __init__(self):
         self.prompt_condition = CONDITION_TEXT
         self.fact_check_question = FACT_CHECK_QUESTION
         self.model_error_message = MODEL_ERROR_MESSAGE
+        # Models settings
+        self.language_model = LANGUAGE_MODEL
+        self.embedding_model = EMBEDDING_MODEL

-    # Use this method for talking to PaLM (Text)
+    # Use this method for talking to a PaLM text model
     def ask_text_model_with_context(self, context, question):
         new_prompt = f"{context}\n\nQuestion: {question}"
         # Print the prompt for debugging if the log level is VERBOSE.
@@ -101,7 +126,22 @@ def ask_text_model_with_context(self, context, question):
             return self.model_error_message
         return response.result

-    # Use this method for talking to PaLM (Chat)
+    # Use this method for talking to a Gemini content model
+    def ask_content_model_with_context(self, context, question):
+        new_prompt = context + "\n\nQuestion: " + question
+        # Print the prompt for debugging if the log level is VERBOSE.
+        if LOG_LEVEL == "VERBOSE":
+            self.print_the_prompt(new_prompt)
+        try:
+            response = palm.generate_content(new_prompt)
+        except google.api_core.exceptions.InvalidArgument:
+            return self.model_error_message
+        for chunk in response:
+            if str(chunk.candidates[0].content) == "":
+                return self.model_error_message
+        return response.text
+
+    # Use this method for talking to a PaLM chat model
     def ask_chat_model_with_context(self, context, question):
         try:
             response = palm.chat(
@@ -116,12 +156,18 @@ def ask_chat_model_with_context(self, context, question):
             return self.model_error_message
         return response.last

-    # Use this method for asking PaLM (Text) for fact-checking
+    # Use this method for asking a PaLM text model for fact-checking
     def ask_text_model_to_fact_check(self, context, prev_response):
         question = self.fact_check_question + "\n\nText: "
         question += prev_response
         return self.ask_text_model_with_context(context, question)

+    # Use this method for asking a Gemini content model for fact-checking
+    def ask_content_model_to_fact_check(self, context, prev_response):
+        question = self.fact_check_question + "\n\nText: "
+        question += prev_response
+        return self.ask_content_model_with_context(context, question)
+
     # Query the local Chroma vector database using the user question
     def query_vector_store(self, question):
         return self.collection.query(question, NUM_RETURNS)
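
The two fact-check methods in the hunk above compose the same question and differ only in which method they delegate to. A minimal sketch of that shared step, assuming the delegate is passed in as a callable: `fact_check` and `ask_with_context` are hypothetical names, and the `"\n\nText: "` separator is copied from the PR.

```python
def fact_check(ask_with_context, fact_check_question, context, prev_response):
    """Compose the fact-check question and hand it to the given ask function."""
    # Same composition as both ask_*_model_to_fact_check methods in the PR.
    question = fact_check_question + "\n\nText: " + prev_response
    return ask_with_context(context, question)
```

Either `ask_text_model_with_context` or `ask_content_model_with_context` could be supplied as `ask_with_context`, since both take `(context, question)`.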
@@ -142,6 +188,14 @@ def add_custom_instruction_to_context(self, condition, context):
     def generate_embedding(self, text):
         return palm.embed(text)

+    # Get the name of the language model used in this Docs Agent setup
+    def get_language_model_name(self):
+        return self.language_model
+
+    # Get the name of the embedding model used in this Docs Agent setup
+    def get_embedding_model_name(self):
+        return self.embedding_model
+
     # Print the prompt on the terminal for debugging
     def print_the_prompt(self, prompt):
         print("#########################################")