Multimodal feature for confluence (image summaries only) #3208

nisi99 · 2024-11-22T11:01:11Z

Description

Adds multimodal functionality to the Confluence connector: it extracts every image that is attached to a Confluence page and actually used on that page, and uses a multimodal LLM to create a textual description/summary of each image.
To improve the summary, the XML of the Confluence page, the page title, and the image name are passed to the LLM as well. Each summary is stored as a new document, with the summary as the section text and the originating page as the link.
Additionally, a label (is_image_summary) marks the document content as a summary, so that users can recognise summaries as such directly.

With this approach, the first multimodal functions can be integrated into the Confluence connector without having to fundamentally adapt the retrieval or general structure of documents in Danswer.
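
A minimal sketch of the idea described above, not the code in this PR: the `Section` and `SummaryDocument` classes, the helper names, the id scheme, the prompt layout, and storing the label in a metadata dict are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    text: str  # the image summary produced by the LLM
    link: str  # URL of the Confluence page the image came from


@dataclass
class SummaryDocument:
    id: str
    title: str
    sections: list
    metadata: dict = field(default_factory=dict)


def build_user_prompt(page_title: str, image_name: str, page_xml: str) -> str:
    """Hypothetical prompt layout: gives the LLM the page context for one image."""
    return (
        f"Page title: {page_title}\n"
        f"Image name: {image_name}\n"
        f"Page XML:\n{page_xml}"
    )


def build_image_summary_document(
    page_id: str, page_title: str, page_url: str, image_name: str, summary: str
) -> SummaryDocument:
    """Wrap one image summary in its own document that links back to the page."""
    return SummaryDocument(
        id=f"{page_id}__{image_name}",  # assumed id scheme
        title=f"{page_title}: {image_name}",
        sections=[Section(text=summary, link=page_url)],
        metadata={"is_image_summary": "True"},  # label named in the description
    )
```

Because each summary is an ordinary document with one text section, retrieval can treat it like any other indexed text.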

Multimodality can be activated using the environment variable CONFLUENCE_IMAGE_SUMMARIZATION_MULTIMODAL_ANSWERING.
If it is unset or false, nothing changes: no summaries are generated or used for the answers.
If CONFLUENCE_IMAGE_SUMMARIZATION_MULTIMODAL_ANSWERING is set to false after indexing with summaries, all existing summaries are ignored in retrieval.
Optionally, users can set a custom system prompt (CONFLUENCE_IMAGE_SUMMARIZATION_SYSTEM_PROMPT) and user prompt (CONFLUENCE_IMAGE_SUMMARIZATION_USER_PROMPT) as environment variables for the summarization. If they are not specified, default prompts are used.
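
As a rough illustration of how these three variables could be read, here is a sketch. The variable names come from the description above. The `ImageSummarizationConfig` class, the `load_image_summarization_config` helper, the case-insensitive "true" parsing, and the placeholder default prompts are assumptions, not the defaults shipped in this PR.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Placeholder defaults for illustration only; the PR ships its own default prompts.
PLACEHOLDER_SYSTEM_PROMPT = "You describe images found on Confluence pages."
PLACEHOLDER_USER_PROMPT = "Summarize the image using the page context."


@dataclass
class ImageSummarizationConfig:
    enabled: bool
    system_prompt: str
    user_prompt: str


def load_image_summarization_config(
    env: Optional[Mapping[str, str]] = None,
) -> ImageSummarizationConfig:
    """Read the feature flag and optional prompts; unset values fall back to defaults."""
    env = os.environ if env is None else env
    flag = env.get("CONFLUENCE_IMAGE_SUMMARIZATION_MULTIMODAL_ANSWERING", "")
    return ImageSummarizationConfig(
        # Unset or anything other than "true" leaves the feature off.
        enabled=flag.strip().lower() == "true",
        system_prompt=env.get("CONFLUENCE_IMAGE_SUMMARIZATION_SYSTEM_PROMPT")
        or PLACEHOLDER_SYSTEM_PROMPT,
        user_prompt=env.get("CONFLUENCE_IMAGE_SUMMARIZATION_USER_PROMPT")
        or PLACEHOLDER_USER_PROMPT,
    )
```

Passing a plain dict instead of `os.environ` keeps the helper easy to exercise in unit tests.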

How Has This Been Tested?

(Unit) tests: see the scripts in backend/tests/multimodal_confluence
Tested with our own Confluence space and Azure OpenAI (GPT-4o)

Accepted Risk (provide if relevant)

If the summarization of an image fails because a content filter is triggered (or because of another error), indexing of the documents is restarted every x minutes (default 30) unless it is stopped manually. To avoid this problem, we highly recommend setting CONTINUE_ON_CONNECTOR_FAILURE to true, so that the summaries for such images stay empty.
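
The recommended behaviour can be sketched as follows. This is an assumption-laden illustration, not the connector's code: `summarize_images`, the `summarize_fn` callable, and the `continue_on_failure` parameter (standing in for CONTINUE_ON_CONNECTOR_FAILURE) are hypothetical names.

```python
from typing import Callable, Dict, List


def summarize_images(
    image_names: List[str],
    summarize_fn: Callable[[str], str],
    continue_on_failure: bool,
) -> Dict[str, str]:
    """Summarize each image; on error either re-raise or keep an empty summary."""
    summaries: Dict[str, str] = {}
    for name in image_names:
        try:
            summaries[name] = summarize_fn(name)
        except Exception:
            if not continue_on_failure:
                # Propagating the error fails the indexing run, which is then retried.
                raise
            # With the flag set, the failing image simply gets an empty summary.
            summaries[name] = ""
    return summaries
```

With the flag enabled, one filtered image no longer blocks the remaining images on the page.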

Related Issue(s) (provide if relevant)

N/A

Mental Checklist:

All of the automated tests pass
All PR comments are addressed and marked resolved
If there are migrations, they have been rebased to latest main
If there are new dependencies, they are added to the requirements
If there are new environment variables, they are added to all of the deployment methods
If there are new APIs that don't require auth, they are added to PUBLIC_ENDPOINT_SPECS
Docker images build and basic functionalities work
Author has done a final read through of the PR right before merge

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

This PR should be backported (make sure to check that the backport attempt succeeds)

vercel · 2024-11-22T11:01:15Z

@nisi99 is attempting to deploy a commit to the Danswer Team on Vercel.

A member of the Team first needs to authorize it.

nisi99 and others added 30 commits September 2, 2024 11:43

fix changes with newest danswer versions

f9027d1

WIP: fill image field

7732578

Merge branch 'danswer-ai:main' into main

4e0befb

add type ignore to some imports - pydantic

25b07e6

out comment access_control_list to fix doc retrieval

e63439b

more type ignoring

d8aebe0

add image field to yql-base

c5ffd7c

add .env variables for Azure

cf1b21e

remove unnecessary logger

e7d29c6

fix add and fill image field

56b031c

remove credentials from comment

3ca26ae

add image field to all other connector to keep app useable

52ac03e

add image field to all other connectors to keep app useable

31fef95

fix image extraction: skip icons

ebbd74c

changed save path for images

b73326e

add exceptions for request errors when image path corrupted/ not available

deb39c7

add logging for retrieved chunks

ad95544

fix path to store images

8e82ed6

skip save images to save storage

b8899d9

remove faulty logging

c124303

added resizing of images if file is bigger than 20mb

867f0b0

fix mypy errors

01b041d

remove more tbd images

555c6ed

restore accidentally deleted trailing commas

53896c3

Merge pull request #1 from hoesler/oge-christoph

1558fc9

fix mypy issues and remove tbd images

add images to metadata and remove image field

03ffe58

add image to answer pipeline

2b638de

small fixes: added OPENAI_API_VERSION and removed some logging

c87a325

only use image if it exists in metadata

e1c3840

add image field to metadata_keys_to_ignore

55917f0

nisi99 and others added 28 commits October 15, 2024 10:46

Merge pull request #6 from hoesler/oge-christoph

fc17c93

add gliffy images, ignore videos, use async openai

add optional custom system prompt

621e159

WIP: fix error with gliffy attachments on restricted pages

40c382b

add optional custom user prompt for image summarization

961f0a6

WIP: fix error while answering

d01f580

add exception for openai BadRequestError (triggered Contentfilter)

e3dfe68

Merge branch 'rebase' into merge_test

8c55fe7

WIP: fix multimodal image feature after rebase

f4d6222

remove raw image handling

8bf4f9c

WIP: cleanup

1933c3d

Merge remote-tracking branch 'upstream/main' into multimodal_feature_image_summaries_only

f0ab87b

revert unnecessary changes

213e498

WIP: cleanup

c731907

Revert some more unnecessary changes

8525abc

revert changes in prompt_utils.py

cc52ee7

revert changes in prompt_utils.py

b18d951

use Danswer Default-LLM to summarize images

65f957c

revert: ignore image tag since it does not exist with summaries only

88b5923

add: raise exception if llm for summarization not provided or not multimodal

e0d4dcb

stop connector if in multimodal case llm not configured correctly

7cc37d3

rename env vars for confluence image summarization + add XML to User Prompt template

c9dc28c

rename env vars for confluence image summarization

f016c04

add function to check if model supports vision

16c7225

workaround: to prevent an infinite indexing loop when an image summary fails

d0eedf0

adjust default prompts for confluence summarization

ec274e0

add unit tests for multimodal confluence functions

b212e2d

Merge remote-tracking branch 'upstream/main' into multimodal_feature_image_summaries_only

fc3b3f3

small fixes after merge

58ecd43

nisi99 changed the title ~~Multimodal feature image summaries only~~ Multimodal feature for confluence (image summaries only) Nov 22, 2024
