Skip to content

Commit

Permalink
feat(vectordb): Milvus vector db Integration (#1996)
Browse files Browse the repository at this point in the history
* integrate Milvus into Private GPT

* adjust milvus settings

* update doc info and reformat

* adjust milvus initialization

* adjust import error

* mionr update

* adjust format

* adjust the db storing path

* update doc
  • Loading branch information
Jacksonxhx authored Jul 18, 2024
1 parent 4523a30 commit 43cc31f
Show file tree
Hide file tree
Showing 8 changed files with 173 additions and 6 deletions.
3 changes: 2 additions & 1 deletion fern/docs/pages/installation/concepts.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,7 @@ will load the configuration from `settings.yaml` and `settings-ollama.yaml`.

## About Fully Local Setups
In order to run PrivateGPT in a fully local setup, you will need to run the LLM, Embeddings and Vector Store locally.

### LLM
For local LLM there are two options:
* (Recommended) You can use the 'ollama' option in PrivateGPT, which will connect to your local Ollama instance. Ollama simplifies a lot the installation of local LLMs.
Expand All @@ -63,4 +64,4 @@ In order for HuggingFace LLM to work (the second option), you need to download t
poetry run python scripts/setup
```
### Vector stores
The vector stores supported (Qdrant, ChromaDB and Postgres) run locally by default.
The vector stores supported (Qdrant, Milvus, ChromaDB and Postgres) run locally by default.
1 change: 1 addition & 0 deletions fern/docs/pages/installation/installation.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -82,6 +82,7 @@ You need to choose one option per category (LLM, Embeddings, Vector Stores, UI).
| **Option** | **Description** | **Extra** |
|------------------|-----------------------------------------|-------------------------|
| **qdrant** | Adds support for Qdrant vector store | vector-stores-qdrant |
| milvus | Adds support for Milvus vector store | vector-stores-milvus |
| chroma | Adds support for Chroma DB vector store | vector-stores-chroma |
| postgres | Adds support for Postgres vector store | vector-stores-postgres |
| clickhouse | Adds support for Clickhouse vector store| vector-stores-clickhouse|
Expand Down
23 changes: 21 additions & 2 deletions fern/docs/pages/manual/vectordb.mdx
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
PrivateGPT supports [Qdrant](https://qdrant.tech/), [Chroma](https://www.trychroma.com/), [PGVector](https://github.com/pgvector/pgvector) and [ClickHouse](https://github.com/ClickHouse/ClickHouse) as vectorstore providers. Qdrant being the default.
## Vectorstores
PrivateGPT supports [Qdrant](https://qdrant.tech/), [Milvus](https://milvus.io/), [Chroma](https://www.trychroma.com/), [PGVector](https://github.com/pgvector/pgvector) and [ClickHouse](https://github.com/ClickHouse/ClickHouse) as vectorstore providers. Qdrant being the default.

In order to select one or the other, set the `vectorstore.database` property in the `settings.yaml` file to `qdrant`, `chroma`, `postgres` and `clickhouse`.
In order to select one or the other, set the `vectorstore.database` property in the `settings.yaml` file to `qdrant`, `milvus`, `chroma`, `postgres` and `clickhouse`.

```yaml
vectorstore:
Expand Down Expand Up @@ -38,6 +39,24 @@ qdrant:
path: local_data/private_gpt/qdrant
```

### Milvus configuration

To enable Milvus, set the `vectorstore.database` property in the `settings.yaml` file to `milvus` and install the `milvus` extra.

```bash
poetry install --extras vector-stores-milvus
```

The available configuration options are:
| Field | Description |
|--------------|-------------|
| uri | Default is set to "local_data/private_gpt/milvus/milvus_local.db" as a local file; you can also set up a more performant Milvus server on docker or k8s e.g.http://localhost:19530, as your uri; To use Zilliz Cloud, adjust the uri and token to Endpoint and Api key in Zilliz Cloud.|
| token | Pair with Milvus server on docker or k8s or zilliz cloud api key.|
| collection_name | The name of the collection, set to default "milvus_db".|
| overwrite | Overwrite the data in collection if it existed, set to default as True. |

To obtain a local setup (disk-based database) without running a Milvus server, configure the uri value in settings.yaml, to store in local_data/private_gpt/milvus/milvus_local.db.

### Chroma configuration

To enable Chroma, set the `vectorstore.database` property in the `settings.yaml` file to `chroma` and install the `chroma` extra.
Expand Down
82 changes: 80 additions & 2 deletions poetry.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

39 changes: 39 additions & 0 deletions private_gpt/components/vector_store/vector_store_component.py
Original file line number Diff line number Diff line change
Expand Up @@ -121,6 +121,45 @@ def __init__(self, settings: Settings) -> None:
collection_name="make_this_parameterizable_per_api_call",
), # TODO
)

case "milvus":
try:
from llama_index.vector_stores.milvus import ( # type: ignore
MilvusVectorStore,
)
except ImportError as e:
raise ImportError(
"Milvus dependencies not found, install with `poetry install --extras vector-stores-milvus`"
) from e

if settings.milvus is None:
logger.info(
"Milvus config not found. Using default settings.\n"
"Trying to connect to Milvus at local_data/private_gpt/milvus/milvus_local.db "
"with collection 'make_this_parameterizable_per_api_call'."
)

self.vector_store = typing.cast(
BasePydanticVectorStore,
MilvusVectorStore(
dim=settings.embedding.embed_dim,
collection_name="make_this_parameterizable_per_api_call",
overwrite=True,
),
)

else:
self.vector_store = typing.cast(
BasePydanticVectorStore,
MilvusVectorStore(
dim=settings.embedding.embed_dim,
uri=settings.milvus.uri,
token=settings.milvus.token,
collection_name=settings.milvus.collection_name,
overwrite=settings.milvus.overwrite,
),
)

case "clickhouse":
try:
from clickhouse_connect import ( # type: ignore
Expand Down
24 changes: 23 additions & 1 deletion private_gpt/settings/settings.py
Original file line number Diff line number Diff line change
Expand Up @@ -125,7 +125,7 @@ class LLMSettings(BaseModel):


class VectorstoreSettings(BaseModel):
database: Literal["chroma", "qdrant", "postgres", "clickhouse"]
database: Literal["chroma", "qdrant", "postgres", "clickhouse", "milvus"]


class NodeStoreSettings(BaseModel):
Expand Down Expand Up @@ -508,6 +508,27 @@ class QdrantSettings(BaseModel):
)


class MilvusSettings(BaseModel):
uri: str = Field(
"local_data/private_gpt/milvus/milvus_local.db",
description="The URI of the Milvus instance. For example: 'local_data/private_gpt/milvus/milvus_local.db' for Milvus Lite.",
)
token: str = Field(
"",
description=(
"A valid access token to access the specified Milvus instance. "
"This can be used as a recommended alternative to setting user and password separately. "
),
)
collection_name: str = Field(
"make_this_parameterizable_per_api_call",
description="The name of the collection in Milvus. Default is 'make_this_parameterizable_per_api_call'.",
)
overwrite: bool = Field(
True, description="Overwrite the previous collection schema if it exists."
)


class Settings(BaseModel):
server: ServerSettings
data: DataSettings
Expand All @@ -527,6 +548,7 @@ class Settings(BaseModel):
qdrant: QdrantSettings | None = None
postgres: PostgresSettings | None = None
clickhouse: ClickHouseSettings | None = None
milvus: MilvusSettings | None = None


"""
Expand Down
2 changes: 2 additions & 0 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,7 @@ llama-index-embeddings-openai = {version ="^0.1.10", optional = true}
llama-index-embeddings-azure-openai = {version ="^0.1.10", optional = true}
llama-index-embeddings-gemini = {version ="^0.1.8", optional = true}
llama-index-vector-stores-qdrant = {version ="^0.2.10", optional = true}
llama-index-vector-stores-milvus = {version ="^0.1.20", optional = true}
llama-index-vector-stores-chroma = {version ="^0.1.10", optional = true}
llama-index-vector-stores-postgres = {version ="^0.1.11", optional = true}
llama-index-vector-stores-clickhouse = {version ="^0.1.3", optional = true}
Expand Down Expand Up @@ -78,6 +79,7 @@ vector-stores-qdrant = ["llama-index-vector-stores-qdrant"]
vector-stores-clickhouse = ["llama-index-vector-stores-clickhouse", "clickhouse_connect"]
vector-stores-chroma = ["llama-index-vector-stores-chroma"]
vector-stores-postgres = ["llama-index-vector-stores-postgres"]
vector-stores-milvus = ["llama-index-vector-stores-milvus"]
storage-nodestore-postgres = ["llama-index-storage-docstore-postgres","llama-index-storage-index-store-postgres","psycopg2-binary","asyncpg"]
rerank-sentence-transformers = ["torch", "sentence-transformers"]

Expand Down
5 changes: 5 additions & 0 deletions settings.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -85,6 +85,11 @@ vectorstore:
nodestore:
database: simple

milvus:
uri: local_data/private_gpt/milvus/milvus_local.db
collection_name: milvus_db
overwrite: false

qdrant:
path: local_data/private_gpt/qdrant

Expand Down

0 comments on commit 43cc31f

Please sign in to comment.