Commit

Merge branch 'danswer-ai:main' into main
onimsha authored Apr 19, 2024
2 parents 8f6d6ad + e361e92 commit cd43e94
Showing 147 changed files with 6,252 additions and 1,739 deletions.
@@ -38,5 +38,7 @@ jobs:
      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master
        with:
          # To run locally: trivy image --severity HIGH,CRITICAL danswer/danswer-backend
          image-ref: docker.io/danswer/danswer-backend:${{ github.ref_name }}
          severity: 'CRITICAL,HIGH'
          trivyignores: ./backend/.trivyignore
2 changes: 2 additions & 0 deletions .github/workflows/pr-python-checks.yml
@@ -20,10 +20,12 @@ jobs:
          cache-dependency-path: |
            backend/requirements/default.txt
            backend/requirements/dev.txt
            backend/requirements/model_server.txt
      - run: |
          python -m pip install --upgrade pip
          pip install -r backend/requirements/default.txt
          pip install -r backend/requirements/dev.txt
          pip install -r backend/requirements/model_server.txt
      - name: Run MyPy
        run: |
38 changes: 14 additions & 24 deletions CONTRIBUTING.md
@@ -85,6 +85,7 @@ Install the required python dependencies:
```bash
pip install -r danswer/backend/requirements/default.txt
pip install -r danswer/backend/requirements/dev.txt
pip install -r danswer/backend/requirements/model_server.txt
```

Install [Node.js and npm](https://docs.npmjs.com/downloading-and-installing-node-js-and-npm) for the frontend.
@@ -112,26 +113,24 @@ docker compose -f docker-compose.dev.yml -p danswer-stack up -d index relational
(index refers to Vespa and relational_db refers to Postgres)

#### Running Danswer

Setup a folder to store config. Navigate to `danswer/backend` and run:
```bash
mkdir dynamic_config_storage
```

To start the frontend, navigate to `danswer/web` and run:
```bash
npm run dev
```

Package the Vespa schema. This will only need to be done when the Vespa schema is updated locally.

Navigate to `danswer/backend/danswer/document_index/vespa/app_config` and run:
Next, start the model server which runs the local NLP models.
Navigate to `danswer/backend` and run:
```bash
zip -r ../vespa-app.zip .
uvicorn model_server.main:app --reload --port 9000
```
_For Windows (for compatibility with both PowerShell and Command Prompt):_
```bash
powershell -Command "
uvicorn model_server.main:app --reload --port 9000
"
```
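
Before starting the API server, it can help to confirm that the model server is actually listening. The following is a minimal sketch, not part of the Danswer codebase: the helper name `is_port_open` and the host `127.0.0.1` are assumptions, while port 9000 is taken from the `uvicorn` command above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Port 9000 matches the uvicorn command above; the host is an assumption.
    print("model server reachable:", is_port_open("127.0.0.1", 9000))
```

This only checks that something accepts TCP connections on the port; it does not verify that the process is the model server or that its models have finished loading.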
- Note: If you don't have the `zip` utility, you will need to install it prior to running the above

The first time running Danswer, you will also need to run the DB migrations for Postgres.
The first time running Danswer, you will need to run the DB migrations for Postgres.
After the first time, this is no longer required unless the DB models change.

Navigate to `danswer/backend` and with the venv active, run:
@@ -149,17 +148,12 @@ python ./scripts/dev_run_background_jobs.py

To run the backend API server, navigate back to `danswer/backend` and run:
```bash
AUTH_TYPE=disabled \
DYNAMIC_CONFIG_DIR_PATH=./dynamic_config_storage \
VESPA_DEPLOYMENT_ZIP=./danswer/document_index/vespa/vespa-app.zip \
uvicorn danswer.main:app --reload --port 8080
AUTH_TYPE=disabled uvicorn danswer.main:app --reload --port 8080
```
_For Windows (for compatibility with both PowerShell and Command Prompt):_
```bash
powershell -Command "
$env:AUTH_TYPE='disabled'
$env:DYNAMIC_CONFIG_DIR_PATH='./dynamic_config_storage'
$env:VESPA_DEPLOYMENT_ZIP='./danswer/document_index/vespa/vespa-app.zip'
uvicorn danswer.main:app --reload --port 8080
"
```
@@ -178,20 +172,16 @@ pre-commit install

Additionally, we use `mypy` for static type checking.
Danswer is fully type-annotated, and we would like to keep it that way!
Right now, there is no automated type checking at the moment (coming soon), but we ask you to manually run it before
creating a pull requests with `python -m mypy .` from the `danswer/backend` directory.
To run the mypy checks manually, run `python -m mypy .` from the `danswer/backend` directory.


#### Web
We use `prettier` for formatting. The desired version (2.8.8) will be installed via `npm i` from the `danswer/web` directory.
To run the formatter, use `npx prettier --write .` from the `danswer/web` directory.
Like `mypy`, we have no automated formatting yet (coming soon), but we request that, for now,
you run this manually before creating a pull request.
Please double check that prettier passes before creating a pull request.


### Release Process
Danswer follows the semver versioning standard.
A set of Docker containers will be pushed automatically to DockerHub with every tag.
You can see the containers [here](https://hub.docker.com/search?q=danswer%2F).

As pre-1.0 software, even patch releases may contain breaking or non-backwards-compatible changes.
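
To illustrate the pre-1.0 caveat above, here is a small sketch of how an upgrade script might decide whether a version change deserves a careful read of the release notes. The function names and the optional leading `v` in tags are assumptions for illustration; nothing here is part of Danswer's release tooling.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


# Matches the MAJOR.MINOR.PATCH core; a leading "v" is an assumed tag convention.
_SEMVER_CORE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)")


def parse_version(tag: str) -> Version:
    """Parse the numeric core of a semver-style tag, ignoring any suffix."""
    match = _SEMVER_CORE.match(tag)
    if match is None:
        raise ValueError(f"not a semver-style tag: {tag!r}")
    return Version(*(int(part) for part in match.groups()))


def may_break(old: str, new: str) -> bool:
    """Conservatively flag version changes that may be backwards-incompatible.

    While either side is pre-1.0, any change is flagged (even a patch bump);
    from 1.0 on, only a change of major version is flagged.
    """
    a, b = parse_version(old), parse_version(new)
    if a == b:
        return False
    if a.major == 0 or b.major == 0:
        return True
    return a.major != b.major
```

For example, `may_break("v0.3.1", "v0.3.2")` is flagged because both versions are pre-1.0, whereas `may_break("1.2.3", "1.2.4")` is not.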
11 changes: 6 additions & 5 deletions README.md
@@ -22,11 +22,12 @@
</a>
</p>

<strong>[Danswer](https://www.danswer.ai/)</strong> is the ChatGPT for teams. Danswer provides a Chat interface and plugs into any LLM of
your choice. Danswer can be deployed anywhere and for any scale - on a laptop, on-premise, or to cloud. Since you own
the deployment, your user data and chats are fully in your own control. Danswer is MIT licensed and designed to be
modular and easily extensible. The system also comes fully ready for production usage with user authentication, role
management (admin/basic users), chat persistence, and a UI for configuring Personas (AI Assistants) and their Prompts.
<strong>[Danswer](https://www.danswer.ai/)</strong> is the AI Assistant connected to your company's docs, apps, and people.
Danswer provides a Chat interface and plugs into any LLM of your choice. Danswer can be deployed anywhere and for any
scale - on a laptop, on-premise, or to cloud. Since you own the deployment, your user data and chats are fully in your
own control. Danswer is MIT licensed and designed to be modular and easily extensible. The system also comes fully ready
for production usage with user authentication, role management (admin/basic users), chat persistence, and a UI for
configuring Personas (AI Assistants) and their Prompts.

Danswer also serves as a Unified Search across all common workplace tools such as Slack, Google Drive, Confluence, etc.
By combining LLMs and team specific knowledge, Danswer becomes a subject matter expert for the team. Imagine ChatGPT if
46 changes: 46 additions & 0 deletions backend/.trivyignore
@@ -0,0 +1,46 @@
# https://github.com/madler/zlib/issues/868
# Pulled in with base Debian image, it's part of the contrib folder but unused
# zlib1g is fine
# Will be gone with Debian image upgrade
# No impact in our settings
CVE-2023-45853

# krb5 related, worst case is denial of service by resource exhaustion
# Accept the risk
CVE-2024-26458
CVE-2024-26461
CVE-2024-26462
CVE-2024-26458
CVE-2024-26461
CVE-2024-26462
CVE-2024-26458
CVE-2024-26461
CVE-2024-26462
CVE-2024-26458
CVE-2024-26461
CVE-2024-26462

# Specific to Firefox which we do not use
# No impact in our settings
CVE-2024-0743

# bind9 related, worst case is denial of service by CPU resource exhaustion
# Accept the risk
CVE-2023-50387
CVE-2023-50868
CVE-2023-50387
CVE-2023-50868

# libexpat1, XML parsing resource exhaustion
# We don't parse any user provided XMLs
# No impact in our settings
CVE-2023-52425
CVE-2024-28757

# sqlite, only used by NLTK library to grab word lemmatizer and stopwords
# No impact in our settings
CVE-2023-7104

# libharfbuzz0b, O(n^2) growth, worst case is denial of service
# Accept the risk
CVE-2023-25193
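
Several IDs in the ignore file above appear more than once. A minimal sketch for listing the distinct entries follows; it assumes only the layout visible above (one ID per line, `#` comment lines, blank separators). The helper name `unique_ignored_ids` is hypothetical, and whether repeated IDs matter to Trivy is not something this diff establishes.

```python
from typing import Iterable, List


def unique_ignored_ids(lines: Iterable[str]) -> List[str]:
    """Return ignore entries in first-seen order, skipping comments, blanks, and repeats."""
    seen = set()
    result = []
    for raw in lines:
        entry = raw.strip()
        if not entry or entry.startswith("#"):
            continue  # comment or blank separator line
        if entry not in seen:
            seen.add(entry)
            result.append(entry)
    return result


# Excerpt modelled on the krb5 section of the file above.
sample = """\
# krb5 related, worst case is denial of service by resource exhaustion
# Accept the risk
CVE-2024-26458
CVE-2024-26461
CVE-2024-26462
CVE-2024-26458
CVE-2024-26461
CVE-2024-26462
"""

print(unique_ignored_ids(sample.splitlines()))
```

To run it against the real file, pass `open("backend/.trivyignore")` instead of the sample lines.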
14 changes: 11 additions & 3 deletions backend/Dockerfile
@@ -1,5 +1,10 @@
FROM python:3.11.7-slim-bookworm

LABEL com.danswer.maintainer="founders@danswer.ai"
LABEL com.danswer.description="This image is for the backend of Danswer. It is MIT Licensed and \
free for all to use. You can find it at https://hub.docker.com/r/danswer/danswer-backend. For \
more details, visit https://github.com/danswer-ai/danswer."

# Default DANSWER_VERSION, typically overridden during builds by GitHub Actions.
ARG DANSWER_VERSION=0.3-dev
ENV DANSWER_VERSION=${DANSWER_VERSION}
@@ -12,7 +17,9 @@ RUN echo "DANSWER_VERSION: ${DANSWER_VERSION}"
# zip for Vespa step further down
# ca-certificates for HTTPS
RUN apt-get update && \
apt-get install -y cmake curl zip ca-certificates libgnutls30=3.7.9-2+deb12u2 && \
apt-get install -y cmake curl zip ca-certificates libgnutls30=3.7.9-2+deb12u2 \
libblkid1=2.38.1-5+deb12u1 libmount1=2.38.1-5+deb12u1 libsmartcols1=2.38.1-5+deb12u1 \
libuuid1=2.38.1-5+deb12u1 && \
rm -rf /var/lib/apt/lists/* && \
apt-get clean

@@ -29,15 +36,16 @@ RUN pip install --no-cache-dir --upgrade -r /tmp/requirements.txt && \
# xserver-common and xvfb included by playwright installation but not needed after
# perl-base is part of the base Python Debian image but not needed for Danswer functionality
# perl-base could only be removed with --allow-remove-essential
RUN apt-get remove -y --allow-remove-essential perl-base xserver-common xvfb cmake libldap-2.5-0 libldap-2.5-0 && \
RUN apt-get remove -y --allow-remove-essential perl-base xserver-common xvfb cmake \
libldap-2.5-0 libldap-2.5-0 && \
apt-get autoremove -y && \
rm -rf /var/lib/apt/lists/* && \
rm /usr/local/lib/python3.11/site-packages/tornado/test/test.key

# Set up application files
WORKDIR /app
COPY ./danswer /app/danswer
COPY ./shared_models /app/shared_models
COPY ./shared_configs /app/shared_configs
COPY ./alembic /app/alembic
COPY ./alembic.ini /app/alembic.ini
COPY supervisord.conf /usr/etc/supervisord.conf
19 changes: 8 additions & 11 deletions backend/Dockerfile.model_server
@@ -1,5 +1,11 @@
FROM python:3.11.7-slim-bookworm

LABEL com.danswer.maintainer="founders@danswer.ai"
LABEL com.danswer.description="This image is for the Danswer model server which runs all of the \
AI models for Danswer. This container and all the code is MIT Licensed and free for all to use. \
You can find it at https://hub.docker.com/r/danswer/danswer-model-server. For more details, \
visit https://github.com/danswer-ai/danswer."

# Default DANSWER_VERSION, typically overridden during builds by GitHub Actions.
ARG DANSWER_VERSION=0.3-dev
ENV DANSWER_VERSION=${DANSWER_VERSION}
@@ -13,23 +19,14 @@ RUN apt-get remove -y --allow-remove-essential perl-base && \

WORKDIR /app

# Needed for model configs and defaults
COPY ./danswer/configs /app/danswer/configs
COPY ./danswer/dynamic_configs /app/danswer/dynamic_configs

# Utils used by model server
COPY ./danswer/utils/logger.py /app/danswer/utils/logger.py
COPY ./danswer/utils/timing.py /app/danswer/utils/timing.py
COPY ./danswer/utils/telemetry.py /app/danswer/utils/telemetry.py

# Place to fetch version information
COPY ./danswer/__init__.py /app/danswer/__init__.py

# Shared implementations for running NLP models locally
COPY ./danswer/search/search_nlp_models.py /app/danswer/search/search_nlp_models.py

# Request/Response models
COPY ./shared_models /app/shared_models
# Shared between Danswer Backend and Model Server
COPY ./shared_configs /app/shared_configs

# Model Server main code
COPY ./model_server /app/model_server
41 changes: 41 additions & 0 deletions backend/alembic/versions/38eda64af7fe_add_chat_session_sharing.py
@@ -0,0 +1,41 @@
"""Add chat session sharing
Revision ID: 38eda64af7fe
Revises: 776b3bbe9092
Create Date: 2024-03-27 19:41:29.073594
"""
from alembic import op
import sqlalchemy as sa

# revision identifiers, used by Alembic.
revision = "38eda64af7fe"
down_revision = "776b3bbe9092"
branch_labels = None
depends_on = None


def upgrade() -> None:
    op.add_column(
        "chat_session",
        sa.Column(
            "shared_status",
            sa.Enum(
                "PUBLIC",
                "PRIVATE",
                name="chatsessionsharedstatus",
                native_enum=False,
            ),
            nullable=True,
        ),
    )
    op.execute("UPDATE chat_session SET shared_status='PRIVATE'")
    op.alter_column(
        "chat_session",
        "shared_status",
        nullable=False,
    )


def downgrade() -> None:
    op.drop_column("chat_session", "shared_status")
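
The migration above adds `shared_status` as nullable, backfills existing rows with `'PRIVATE'`, and only then marks the column non-nullable. The first two steps can be sketched with the standard-library `sqlite3` module. This is an illustration under assumptions, not the project's code: the real migration targets Postgres through Alembic, the pared-down table and the function name `backfill_demo` are hypothetical, and the final `NOT NULL` step is only checked for safety here because SQLite cannot alter a column's nullability in place.

```python
import sqlite3
from typing import Tuple


def backfill_demo() -> Tuple[int, int]:
    """Return (NULL count after adding the column, NULL count after the backfill)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical, pared-down stand-in for the chat_session table with two existing rows.
    conn.execute("CREATE TABLE chat_session (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO chat_session (id) VALUES (?)", [(1,), (2,)])

    count_nulls = "SELECT COUNT(*) FROM chat_session WHERE shared_status IS NULL"

    # Step 1: add the column as nullable, so existing rows remain valid.
    conn.execute("ALTER TABLE chat_session ADD COLUMN shared_status TEXT")
    nulls_after_add = conn.execute(count_nulls).fetchone()[0]

    # Step 2: backfill every existing row with the default value.
    conn.execute("UPDATE chat_session SET shared_status = 'PRIVATE'")
    nulls_after_backfill = conn.execute(count_nulls).fetchone()[0]

    conn.close()
    # Step 3 (not run here): with no NULLs left, the column can safely become NOT NULL.
    return nulls_after_add, nulls_after_backfill


print(backfill_demo())
```

Adding the column as `NOT NULL` in a single step would typically fail on a table that already contains rows unless a default is supplied, which is presumably why the migration is split into three operations.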
23 changes: 23 additions & 0 deletions backend/alembic/versions/475fcefe8826_add_name_to_api_key.py
@@ -0,0 +1,23 @@
"""Add name to api_key
Revision ID: 475fcefe8826
Revises: ecab2b3f1a3b
Create Date: 2024-04-11 11:05:18.414438
"""
from alembic import op
import sqlalchemy as sa

# revision identifiers, used by Alembic.
revision = "475fcefe8826"
down_revision = "ecab2b3f1a3b"
branch_labels = None
depends_on = None


def upgrade() -> None:
    op.add_column("api_key", sa.Column("name", sa.String(), nullable=True))


def downgrade() -> None:
    op.drop_column("api_key", "name")
@@ -0,0 +1,81 @@
"""Permission Auto Sync Framework
Revision ID: 72bdc9929a46
Revises: 475fcefe8826
Create Date: 2024-04-14 21:15:28.659634
"""
from alembic import op
import sqlalchemy as sa

# revision identifiers, used by Alembic.
revision = "72bdc9929a46"
down_revision = "475fcefe8826"
branch_labels = None
depends_on = None


def upgrade() -> None:
    op.create_table(
        "email_to_external_user_cache",
        sa.Column("id", sa.Integer(), nullable=False),
        sa.Column("external_user_id", sa.String(), nullable=False),
        sa.Column("user_id", sa.UUID(), nullable=True),
        sa.Column("user_email", sa.String(), nullable=False),
        sa.ForeignKeyConstraint(
            ["user_id"],
            ["user.id"],
        ),
        sa.PrimaryKeyConstraint("id"),
    )
    op.create_table(
        "external_permission",
        sa.Column("id", sa.Integer(), nullable=False),
        sa.Column("user_id", sa.UUID(), nullable=True),
        sa.Column("user_email", sa.String(), nullable=False),
        sa.Column(
            "source_type",
            sa.String(),
            nullable=False,
        ),
        sa.Column("external_permission_group", sa.String(), nullable=False),
        sa.ForeignKeyConstraint(
            ["user_id"],
            ["user.id"],
        ),
        sa.PrimaryKeyConstraint("id"),
    )
    op.create_table(
        "permission_sync_run",
        sa.Column("id", sa.Integer(), nullable=False),
        sa.Column(
            "source_type",
            sa.String(),
            nullable=False,
        ),
        sa.Column("update_type", sa.String(), nullable=False),
        sa.Column("cc_pair_id", sa.Integer(), nullable=True),
        sa.Column(
            "status",
            sa.String(),
            nullable=False,
        ),
        sa.Column("error_msg", sa.Text(), nullable=True),
        sa.Column(
            "updated_at",
            sa.DateTime(timezone=True),
            server_default=sa.text("now()"),
            nullable=False,
        ),
        sa.ForeignKeyConstraint(
            ["cc_pair_id"],
            ["connector_credential_pair.id"],
        ),
        sa.PrimaryKeyConstraint("id"),
    )


def downgrade() -> None:
    op.drop_table("permission_sync_run")
    op.drop_table("external_permission")
    op.drop_table("email_to_external_user_cache")
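
Each migration names its predecessor through `down_revision`, so the last two revisions shown above form a chain: `475fcefe8826` revises `ecab2b3f1a3b`, and `72bdc9929a46` revises `475fcefe8826`. The sketch below orders such a chain oldest-first. It is illustrative only: `order_revisions` is a hypothetical helper that assumes a single linear history, and it is not how Alembic itself resolves revisions.

```python
from typing import Dict, List


def order_revisions(down_revision_of: Dict[str, str]) -> List[str]:
    """Order revisions oldest-first, given a mapping of revision -> down_revision.

    Assumes one linear chain with no branches or merges.
    """
    # Invert the mapping so each parent points at the revision that follows it.
    child_of = {parent: rev for rev, parent in down_revision_of.items()}
    # The oldest known revision is the one whose parent is not in the mapping.
    roots = [rev for rev, parent in down_revision_of.items() if parent not in down_revision_of]
    if len(roots) != 1:
        raise ValueError("expected exactly one oldest revision in a linear chain")
    ordered = [roots[0]]
    while ordered[-1] in child_of:
        ordered.append(child_of[ordered[-1]])
    return ordered


# Revision IDs copied from the two migrations above.
chain = order_revisions(
    {
        "72bdc9929a46": "475fcefe8826",
        "475fcefe8826": "ecab2b3f1a3b",
    }
)
print(chain)
```

Upgrades apply revisions in this order, and downgrades walk it in reverse.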